* Re: Better shared data structure management and resizable shared data structures
@ 2026-02-23 14:14 Ashutosh Bapat <[email protected]>
From: Ashutosh Bapat @ 2026-02-23 14:14 UTC (permalink / raw)
To: Andres Freund <[email protected]>; Heikki Linnakangas <[email protected]>; +Cc: pgsql-hackers; [email protected]
On Wed, Feb 18, 2026 at 9:17 PM Ashutosh Bapat
<[email protected]> wrote:
> > > 4. the address and length passed to madvise needs to be page aligned,
> > > but that passed to fallocate() needn't be. `man fallocate` says
> > > "Specifying the FALLOC_FL_PUNCH_HOLE flag (available since Linux
> > > 2.6.38) in mode deallocates space (i.e., creates a hole) in the byte
> > > range starting at offset and continuing for len bytes. Within the
> > > specified range, partial filesystem blocks are zeroed, and whole
> > > filesystem blocks are removed from the file.". It seems to be
> > > automatically taking care of the page size. So using fallocate()
> > > simplifies logic. Further `man madvise` says "but since Linux 3.5, any
> > > filesystem which supports the fallocate(2) FALLOC_FL_PUNCH_HOLE mode
> > > also supports MADV_REMOVE." fallocate with FALLOC_FL_PUNCH_HOLE is
> > > guaranteed to be available on a system which supports MADV_REMOVE.
> >
> > I think it makes no sense to support resizing below page size
> > granularity. What's the point of doing that?
> >
>
> No point really. But we can not control the extensions which want to
> specify a maximum size smaller than a page size. They wouldn't know
> what page size the underlying machine will have, especially with huge
> pages which have a wide range of sizes. Even in the case of shared
> buffers, a value of max_shared_buffers may cause buffer blocks to span
> pages but other structures may fit a page.
>
> In the attached patches, if a resizable structure is such that its
> max_size is smaller than a page size, it is treated as a fixed
> structure with size = max_size. Any request to resize such structures
> will simply update the metadata without actual madvise operation. Only
> the structures whose max_size > page_size would be treated as truly
> resizable and will use madvise. You bring another interesting point.
> If a resizable structure has a maximum size higher than the page size,
> but it is allocated such that the initial part of it is on a partially
> allocated page and the last part of it is on another partially
> allocated page, those pages are never freed because of adjoining
> structures. Per the logic in the attached patches, all the fixed (or
> pseudo-resizable structures) are packed together. The resizable
> structures start on a page boundary and their max_sizes are adjusted
> to be page aligned. That way we can release pages when the structure
> shrinks more than a page.

It was a mistake on my part to assume that more memory will be freed
if we page align the start and end of a resizable structure. I didn't
account for the memory wasted in the alignment itself. That amount
comes out to be the same as the amount of memory wasted if we don't
page align the structure. But the code is simpler if we don't page
align the structure, as seen in the attached patches.
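
To illustrate why the structure itself need not be page aligned: only
the whole pages that lie entirely inside the unused part of a structure
can be given back, whatever the alignment of its start and end. A
minimal sketch of that range calculation follows. The helper name
releasable_range() is hypothetical and is not taken from the attached
patches; it assumes the page size is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: find the whole pages that lie entirely inside
 * the byte range [start, start + len).  Stores the address of the first
 * such page in *first and returns the number of bytes covered by those
 * pages (0 if the range contains no complete page).
 */
static size_t
releasable_range(uintptr_t start, size_t len, size_t page_size,
                 uintptr_t *first)
{
    uintptr_t mask;
    uintptr_t lo;
    uintptr_t hi;

    /* page_size must be a non-zero power of two */
    assert(page_size > 0 && (page_size & (page_size - 1)) == 0);

    mask = ~((uintptr_t) page_size - 1);
    lo = (start + page_size - 1) & mask;    /* round start up */
    hi = (start + len) & mask;              /* round end down */

    *first = lo;
    return (hi > lo) ? (size_t) (hi - lo) : 0;
}
```

When a structure shrinks from max_size to new_size, the range to pass
would be the unused tail, i.e. start + new_size with length
max_size - new_size. The partial pages at either end stay allocated
because they are shared with live data.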
> > >
> > > > Using fallocate() (or madvise()) to free memory, we don't need
> > > > multiple segments. So much less code churn compared to the multiple
> > > > mappings approach. However, there is one drawback. In the multiple
> > > > mapping approach access beyond the current size of the structure would
> > > > result in segfault or bus error. But in the fallocate/madvise approach
> > > > such an access does not cause a crash. A write beyond the pages that
> > > > fit the current size of the structure causes more memory to be
> > > > allocated silently. A read returns 0s. So, there's a possibility that
> > > > bugs in size calculations might go unnoticed. I think that's how it
> > > > works even today, access in the yet un-allocated part of the shared
> > > > memory will simply go unnoticed.
> > >
> > > If that's something you care about, you can mprotect(PROT_NONE) the relevant
> > > regions.
> >
> > I am fine, if we let go of this protection while getting rid of
> > multiple segments, if we all agree to do so.
> >
> > I could be wrong, but mprotect needs to be executed in every backend
> > where the memory is mapped and then a new backend needs to inherit it
> > from the postmaster. Makes resizing complex since it has to touch
> > every backend. So avoiding mprotect is better.

I discussed this point with Andres off-list. Here's a summary of that
discussion. Any serious users of resizable shared memory structures
would need to send proc signal barriers to synchronize the resizing
across the backends. This barrier can be used to perform mprotect() in
the backends, with a separate signal to the postmaster if mprotect() is
needed in the postmaster. But whether mprotect() is needed depends upon
the use case. It should be the responsibility of the user of the
resizable structure and not of ShmemResizeRegistered().
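
As a rough sketch of what a user of a resizable structure might run in
each process when the barrier is absorbed: the helper below marks the
whole pages of the reserved-but-unused tail as PROT_NONE. The names
page_ceil() and protect_unused_tail() are hypothetical and not part of
the attached patches, and the sketch assumes the mapping uses the
default page size reported by sysconf(), which would not hold with huge
pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round addr up to the next multiple of page_size (a power of two). */
static uintptr_t
page_ceil(uintptr_t addr, size_t page_size)
{
    return (addr + page_size - 1) & ~((uintptr_t) page_size - 1);
}

/*
 * Hypothetical helper: make the whole pages between the current end
 * and the maximum end of a structure inaccessible in this process.
 * Partial pages at either end are left alone because they also hold
 * live data.  Returns 0 on success, -1 on failure (see errno).
 */
static int
protect_unused_tail(void *base, size_t cur_size, size_t max_size)
{
    long        ps = sysconf(_SC_PAGESIZE);
    uintptr_t   lo;
    uintptr_t   hi;

    assert(cur_size <= max_size);
    if (ps <= 0)
        return -1;

    lo = page_ceil((uintptr_t) base + cur_size, (size_t) ps);
    hi = ((uintptr_t) base + max_size) & ~((uintptr_t) ps - 1);

    if (hi <= lo)
        return 0;               /* no whole page to protect */

    return mprotect((void *) lo, (size_t) (hi - lo), PROT_NONE);
}
```

Growing the structure again would need the reverse call with
PROT_READ | PROT_WRITE on the newly used pages, again in every process
that has the memory mapped.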

The following points need a bit of discussion.

1. Calculation of allocated_size

For fixed-size shared memory structures, allocated_size is the size
of the structure after cache aligning it. Assuming that the shared
memory is allocated in pages, this is also the actual memory
allocated to the structure when the whole structure is written to. For
a resizable structure, it's a bit more complicated. We allocate and
reserve the maximum space required by the structure. At a given point
in time, the memory page where the next structure begins and the page
which contains the end of the structure at that point in time are
allocated. The pages in between are not allocated. Thus the
allocated_size should be the length from the start of the structure to
the end of the page containing the current end of the structure, plus
the part of the page where the next structure starts, up to the start
of the next structure. That is what is implemented in the attached
patches.
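
The rule above can be written down as a small calculation. This is only
one reading of the description, not the code from the attached patches:
resizable_allocated_size() is a hypothetical name, offsets are relative
to a page-aligned base, the next structure is assumed to start at
start + max_size, and cache alignment is ignored.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: memory attributed to a resizable structure that
 * occupies [start, start + max_size) and currently uses cur_size bytes.
 */
static size_t
resizable_allocated_size(size_t start, size_t cur_size, size_t max_size,
                         size_t page_size)
{
    size_t  mask;
    size_t  cur_end_page;   /* end of the page holding the current end */
    size_t  next_page;      /* start of the page holding the next struct */

    assert(page_size > 0 && (page_size & (page_size - 1)) == 0);
    assert(cur_size <= max_size);

    mask = ~(page_size - 1);
    cur_end_page = (start + cur_size + page_size - 1) & mask;
    next_page = (start + max_size) & mask;

    /* No unallocated page in between: the whole reservation counts. */
    if (cur_end_page >= next_page)
        return max_size;

    /* Head up to a page end, plus the tail shared with the next struct. */
    return (cur_end_page - start) + (start + max_size - next_page);
}
```

For example, with 4096-byte pages, a structure at offset 100 with
cur_size 1000 and max_size 20000 is charged 3996 bytes for the head
(up to offset 4096) and 3716 bytes for the tail (from offset 16384 to
20100), 7712 bytes in total.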

2. GUCs shared_memory_size, shared_memory_size_in_huge_pages

These GUCs indicate the size of the shared memory in bytes and in huge
pages. Without resizable shared memory structures, calculating these is
straightforward: we sum the sizes of all the requested structures. With
resizable shared memory structures, these GUCs do not make much sense.
Since the memory allocated to the resizable structures can be anywhere
between 0 and the maximum, neither the sum of their initial sizes nor
the sum of their maximum sizes can be reported as shared_memory_size.
The same applies to shared_memory_size_in_huge_pages. We need two GUCs
to replace each of the existing GUCs: max_shared_memory_size and
initial_shared_memory_size, plus their huge page peers.
max_shared_memory_size is the sum of the maximum sizes of the resizable
structures plus the requested sizes of the fixed structures.
initial_shared_memory_size is the sum of the initial sizes requested
for all the structures.
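
The two proposed totals can be illustrated with a short sketch. The
ShmemRequest type and the function names are hypothetical; a fixed
structure is modelled as one whose initial and maximum sizes are equal,
and overflow checking (which add_size() provides in PostgreSQL) is left
out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one requested structure. */
typedef struct ShmemRequest
{
    size_t  initial_size;   /* size requested at startup */
    size_t  max_size;       /* equals initial_size for fixed structures */
} ShmemRequest;

/* Sum of maximum sizes: the proposed max_shared_memory_size. */
static size_t
max_shared_memory_size(const ShmemRequest *reqs, size_t n)
{
    size_t  total = 0;

    assert(reqs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += reqs[i].max_size;
    return total;
}

/* Sum of initial sizes: the proposed initial_shared_memory_size. */
static size_t
initial_shared_memory_size(const ShmemRequest *reqs, size_t n)
{
    size_t  total = 0;

    assert(reqs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += reqs[i].initial_size;
    return total;
}

/* Huge page peer: number of huge pages needed, rounded up. */
static size_t
size_in_huge_pages(size_t size, size_t huge_page_size)
{
    assert(huge_page_size > 0);
    return (size + huge_page_size - 1) / huge_page_size;
}
```

With one fixed structure of 100 bytes and two resizable ones (4096
bytes growing to 1 MB, and 0 bytes growing to 8192), the initial total
is 4196 bytes and the maximum total is 1056868 bytes; both fit in a
single 2 MB huge page.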

3. Testing the memory allocation

I couldn't find a way to reliably know the shared memory allocated at
a given address in the process. RSS Shmem gives the amount of shared
memory accessed by the process, which includes memory allocated to the
fixed structures accessed by the process. This value isn't stable
across runs of the test in the patch. The test adds the RSS Shmem
reported against the variations in the resizable shared memory
structure, which can be visually inspected to be within limits. But
those limits are hard to test in the test code. Looking for some
suggestions here.
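
For reference, the process-wide figure discussed above can be read on
Linux from the RssShmem field of /proc/<pid>/status. The sketch below
only shows how such a value might be parsed; read_rss_shmem_kb() is a
hypothetical helper and not how the test in the patch obtains the
number.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: return the RssShmem value (in kB) found in a
 * /proc/<pid>/status style stream, or -1 if the field is absent.
 */
static long
read_rss_shmem_kb(FILE *status)
{
    char    line[256];
    long    kb;

    assert(status != NULL);

    while (fgets(line, sizeof(line), status) != NULL)
    {
        /* Matches lines of the form "RssShmem:      1234 kB". */
        if (sscanf(line, "RssShmem: %ld", &kb) == 1)
            return kb;
    }
    return -1;
}
```

A caller would pass a stream opened with fopen("/proc/self/status", "r").
Because the value covers every shared page the process has touched, it
has the limitation described above: it cannot be attributed to a single
structure.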
Disabling resizable structures in the builds which do not support
resizable structures is still a TODO.
--
Best Wishes,
Ashutosh Bapat
Attachments:
[text/x-patch] 0001-wip-Introduce-a-new-way-of-registering-shar-20260223.patch (53.8K, 2-0001-wip-Introduce-a-new-way-of-registering-shar-20260223.patch)
From 49676c5ba088d13236f2c1c66800d7e7b1abbe5f Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Mon, 9 Feb 2026 22:28:23 +0200
Subject: [PATCH 1/3] wip: Introduce a new way of registering shared memory
structs
---
.../pg_stat_statements/pg_stat_statements.c | 112 ++++-----
src/backend/access/transam/varsup.c | 32 +--
src/backend/bootstrap/bootstrap.c | 2 +
src/backend/postmaster/launch_backend.c | 11 +-
src/backend/postmaster/postmaster.c | 2 +
src/backend/storage/ipc/dsm.c | 46 ++--
src/backend/storage/ipc/dsm_registry.c | 34 ++-
src/backend/storage/ipc/ipci.c | 51 ++--
src/backend/storage/ipc/pmsignal.c | 53 ++--
src/backend/storage/ipc/procarray.c | 127 +++++-----
src/backend/storage/ipc/procsignal.c | 63 +++--
src/backend/storage/ipc/shmem.c | 233 +++++++++++++++++-
src/backend/storage/ipc/sinvaladt.c | 39 +--
src/backend/storage/lmgr/proc.c | 156 ++++++------
src/backend/tcop/postgres.c | 2 +
src/include/access/transam.h | 12 +-
src/include/storage/dsm_registry.h | 3 +-
src/include/storage/ipc.h | 1 +
src/include/storage/pmsignal.h | 3 +-
src/include/storage/proc.h | 5 +-
src/include/storage/procarray.h | 3 +-
src/include/storage/procsignal.h | 3 +-
src/include/storage/shmem.h | 57 +++++
src/include/storage/sinvaladt.h | 3 +-
24 files changed, 665 insertions(+), 388 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 4a427533bd8..71debc8b47f 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -258,6 +258,25 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+static void pgss_shmem_init(void *arg);
+
+static ShmemStructDesc pgssSharedStateShmemDesc = {
+ .name = "pg_stat_statements",
+ .size = sizeof(pgssSharedState),
+ .init_fn = pgss_shmem_init,
+};
+
+static ShmemHashDesc pgssSharedHashDesc = {
+ .name = "pg_stat_statements hash",
+ .init_size = 0, /* set from 'pgss_max' */
+ .max_size = 0, /* set from 'pgss_max' */
+};
+
+/* Links to shared memory state */
+#define pgss ((pgssSharedState *) pgssSharedStateShmemDesc.ptr)
+#define pgss_hash (pgssSharedHashDesc.ptr)
+
+
/*---- Local variables ----*/
/* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
@@ -274,10 +293,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static ProcessUtility_hook_type prev_ProcessUtility = NULL;
-/* Links to shared memory state */
-static pgssSharedState *pgss = NULL;
-static HTAB *pgss_hash = NULL;
-
/*---- GUC variables ----*/
typedef enum
@@ -365,7 +380,6 @@ static void pgss_store(const char *query, int64 queryId,
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
bool showtext);
-static Size pgss_memsize(void);
static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
int encoding, bool sticky);
static void entry_dealloc(void);
@@ -500,11 +514,39 @@ _PG_init(void)
static void
pgss_shmem_request(void)
{
+ HASHCTL info;
+
if (prev_shmem_request_hook)
prev_shmem_request_hook();
- RequestAddinShmemSpace(pgss_memsize());
RequestNamedLWLockTranche("pg_stat_statements", 1);
+
+ /*
+ * Register our shared memory state, including hash table
+ */
+ ShmemRegisterStruct(&pgssSharedStateShmemDesc);
+
+ info.keysize = sizeof(pgssHashKey);
+ info.entrysize = sizeof(pgssEntry);
+ pgssSharedHashDesc.init_size = pgss_max;
+ pgssSharedHashDesc.max_size = pgss_max;
+ ShmemRegisterHash(&pgssSharedHashDesc,
+ &info,
+ HASH_ELEM | HASH_BLOBS);
+}
+
+static void
+pgss_shmem_init(void *arg)
+{
+ pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
+ pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
+ pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ SpinLockInit(&pgss->mutex);
+ pgss->extent = 0;
+ pgss->n_writers = 0;
+ pgss->gc_count = 0;
+ pgss->stats.dealloc = 0;
+ pgss->stats.stats_reset = GetCurrentTimestamp();
}
/*
@@ -516,8 +558,6 @@ pgss_shmem_request(void)
static void
pgss_shmem_startup(void)
{
- bool found;
- HASHCTL info;
FILE *file = NULL;
FILE *qfile = NULL;
uint32 header;
@@ -530,42 +570,6 @@ pgss_shmem_startup(void)
if (prev_shmem_startup_hook)
prev_shmem_startup_hook();
- /* reset in case this is a restart within the postmaster */
- pgss = NULL;
- pgss_hash = NULL;
-
- /*
- * Create or attach to the shared memory state, including hash table
- */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- pgss = ShmemInitStruct("pg_stat_statements",
- sizeof(pgssSharedState),
- &found);
-
- if (!found)
- {
- /* First time through ... */
- pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
- pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
- SpinLockInit(&pgss->mutex);
- pgss->extent = 0;
- pgss->n_writers = 0;
- pgss->gc_count = 0;
- pgss->stats.dealloc = 0;
- pgss->stats.stats_reset = GetCurrentTimestamp();
- }
-
- info.keysize = sizeof(pgssHashKey);
- info.entrysize = sizeof(pgssEntry);
- pgss_hash = ShmemInitHash("pg_stat_statements hash",
- pgss_max, pgss_max,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- LWLockRelease(AddinShmemInitLock);
-
/*
* If we're in the postmaster (or a standalone backend...), set up a shmem
* exit hook to dump the statistics to disk.
@@ -573,12 +577,6 @@ pgss_shmem_startup(void)
if (!IsUnderPostmaster)
on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
- /*
- * Done if some other process already completed our initialization.
- */
- if (found)
- return;
-
/*
* Note: we don't bother with locks here, because there should be no other
* processes running when this code is reached.
@@ -2082,20 +2080,6 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
-/*
- * Estimate shared memory space needed.
- */
-static Size
-pgss_memsize(void)
-{
- Size size;
-
- size = MAXALIGN(sizeof(pgssSharedState));
- size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
-
- return size;
-}
-
/*
* Allocate a new hashtable entry.
* caller must hold an exclusive lock on pgss->lock
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 3e95d4cfd16..11ad90e7372 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,35 +30,27 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
-/* pointer to variables struct in shared memory */
-TransamVariablesData *TransamVariables = NULL;
+static void VarsupShmemInit(void *arg);
+ShmemStructDesc TransamVariablesShmemDesc = {
+ .name = "TransamVariables",
+ .size = sizeof(TransamVariablesData),
+ .init_fn = VarsupShmemInit,
+};
/*
* Initialization of shared memory for TransamVariables.
*/
-Size
-VarsupShmemSize(void)
+void
+VarsupShmemRegister(void)
{
- return sizeof(TransamVariablesData);
+ ShmemRegisterStruct(&TransamVariablesShmemDesc);
}
-void
-VarsupShmemInit(void)
+static void
+VarsupShmemInit(void *arg)
{
- bool found;
-
- /* Initialize our shared state struct */
- TransamVariables = ShmemInitStruct("TransamVariables",
- sizeof(TransamVariablesData),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- memset(TransamVariables, 0, sizeof(TransamVariablesData));
- }
- else
- Assert(found);
+ memset(TransamVariables, 0, sizeof(TransamVariablesData));
}
/*
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 7d32cd0e159..0ded7018e86 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -337,6 +337,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
InitializeFastPathLocks();
+ RegisterShmemStructs();
+
CreateSharedMemoryAndSemaphores();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 926fd6f2700..8f638118cdf 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -49,6 +49,7 @@
#include "replication/walreceiver.h"
#include "storage/dsm.h"
#include "storage/io_worker.h"
+#include "storage/ipc.h"
#include "storage/pg_shmem.h"
#include "tcop/backend_startup.h"
#include "utils/memutils.h"
@@ -104,12 +105,10 @@ typedef struct
char **LWLockTrancheNames;
int *LWLockCounter;
LWLockPadded *MainLWLockArray;
- slock_t *ProcStructLock;
PROC_HDR *ProcGlobal;
PGPROC *AuxiliaryProcs;
PGPROC *PreparedXactProcs;
volatile PMSignalData *PMSignalState;
- ProcSignalHeader *ProcSignal;
pid_t PostmasterPid;
TimestampTz PgStartTime;
TimestampTz PgReloadTime;
@@ -678,8 +677,12 @@ SubPostmasterMain(int argc, char *argv[])
/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ RegisterShmemStructs();
+ }
+
/*
* Run the appropriate Main function
*/
@@ -735,12 +738,10 @@ save_backend_variables(BackendParameters *param,
param->LWLockTrancheNames = LWLockTrancheNames;
param->LWLockCounter = LWLockCounter;
param->MainLWLockArray = MainLWLockArray;
- param->ProcStructLock = ProcStructLock;
param->ProcGlobal = ProcGlobal;
param->AuxiliaryProcs = AuxiliaryProcs;
param->PreparedXactProcs = PreparedXactProcs;
param->PMSignalState = PMSignalState;
- param->ProcSignal = ProcSignal;
param->PostmasterPid = PostmasterPid;
param->PgStartTime = PgStartTime;
@@ -995,12 +996,10 @@ restore_backend_variables(BackendParameters *param)
LWLockTrancheNames = param->LWLockTrancheNames;
LWLockCounter = param->LWLockCounter;
MainLWLockArray = param->MainLWLockArray;
- ProcStructLock = param->ProcStructLock;
ProcGlobal = param->ProcGlobal;
AuxiliaryProcs = param->AuxiliaryProcs;
PreparedXactProcs = param->PreparedXactProcs;
PMSignalState = param->PMSignalState;
- ProcSignal = param->ProcSignal;
PostmasterPid = param->PostmasterPid;
PgStartTime = param->PgStartTime;
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d6133bfebc6..f6d3369f917 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -968,6 +968,8 @@ PostmasterMain(int argc, char *argv[])
* shared memory, determine the value of any runtime-computed GUCs that
* depend on the amount of shared memory required.
*/
+ RegisterShmemStructs();
+
InitializeShmemGUCs();
/*
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 6a5b16392f7..55f46c7687e 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -108,7 +108,15 @@ static inline bool is_main_region_dsm_handle(dsm_handle handle);
static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
-static void *dsm_main_space_begin = NULL;
+static void dsm_main_space_init(void *);
+
+static ShmemStructDesc dsm_main_space_shmem_desc = {
+ .name = "Preallocated DSM",
+ .size = 0, /* dynamic */
+ .init_fn = dsm_main_space_init,
+};
+
+#define dsm_main_space_begin (dsm_main_space_shmem_desc.ptr)
/*
* List of dynamic shared memory segments used by this backend.
@@ -479,27 +487,29 @@ void
dsm_shmem_init(void)
{
size_t size = dsm_estimate_size();
- bool found;
if (size == 0)
return;
- dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, &found);
- if (!found)
- {
- FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
- size_t first_page = 0;
- size_t pages;
-
- /* Reserve space for the FreePageManager. */
- while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
- ++first_page;
-
- /* Initialize it and give it all the rest of the space. */
- FreePageManagerInitialize(fpm, dsm_main_space_begin);
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ ShmemRegisterStruct(&dsm_main_space_shmem_desc);
+}
+
+static void
+dsm_main_space_init(void *arg)
+{
+ size_t size = dsm_main_space_shmem_desc.size;
+ FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
+ size_t first_page = 0;
+ size_t pages;
+
+ /* Reserve space for the FreePageManager. */
+ while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
+ ++first_page;
+
+ /* Initialize it and give it all the rest of the space. */
+ FreePageManagerInitialize(fpm, dsm_main_space_begin);
+ pages = (size / FPM_PAGE_SIZE) - first_page;
+ FreePageManagerPut(fpm, first_page, pages);
}
/*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 068c1577b12..882af83b7b2 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -54,7 +54,15 @@ typedef struct DSMRegistryCtxStruct
dshash_table_handle dshh;
} DSMRegistryCtxStruct;
-static DSMRegistryCtxStruct *DSMRegistryCtx;
+static void DSMRegistryCtxShmemInit(void *arg);
+
+static ShmemStructDesc DSMRegistryCtxShmemDesc = {
+ .name = "DSM Registry Data",
+ .size = sizeof(DSMRegistryCtxStruct),
+ .init_fn = DSMRegistryCtxShmemInit,
+};
+
+#define DSMRegistryCtx ((DSMRegistryCtxStruct *) DSMRegistryCtxShmemDesc.ptr)
typedef struct NamedDSMState
{
@@ -113,27 +121,17 @@ static const dshash_parameters dsh_params = {
static dsa_area *dsm_registry_dsa;
static dshash_table *dsm_registry_table;
-Size
-DSMRegistryShmemSize(void)
+void
+DSMRegistryShmemRegister(void)
{
- return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+ ShmemRegisterStruct(&DSMRegistryCtxShmemDesc);
}
-void
-DSMRegistryShmemInit(void)
+static void
+DSMRegistryCtxShmemInit(void *arg)
{
- bool found;
-
- DSMRegistryCtx = (DSMRegistryCtxStruct *)
- ShmemInitStruct("DSM Registry Data",
- DSMRegistryShmemSize(),
- &found);
-
- if (!found)
- {
- DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
- DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
- }
+ DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+ DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 1f7e933d500..952988645d0 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -101,13 +101,14 @@ CalculateShmemSize(void)
size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
sizeof(ShmemIndexEnt)));
size = add_size(size, dsm_estimate_size());
- size = add_size(size, DSMRegistryShmemSize());
+
+ size = add_size(size, ShmemRegisteredSize());
+
+ /* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
- size = add_size(size, ProcGlobalShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, VarsupShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
size = add_size(size, CLOGShmemSize());
@@ -117,11 +118,7 @@ CalculateShmemSize(void)
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
size = add_size(size, LWLockShmemSize());
- size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, SharedInvalShmemSize());
- size = add_size(size, PMSignalShmemSize());
- size = add_size(size, ProcSignalShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
size = add_size(size, ReplicationSlotsShmemSize());
@@ -217,6 +214,10 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
+ /* Reserve space for semaphores. */
+ if (!IsUnderPostmaster)
+ PGReserveSemaphores(ProcGlobalSemas());
+
/* Initialize subsystems */
CreateOrAttachShmemStructs();
@@ -230,6 +231,19 @@ CreateSharedMemoryAndSemaphores(void)
shmem_startup_hook();
}
+void
+RegisterShmemStructs(void)
+{
+ DSMRegistryShmemRegister();
+
+ ProcGlobalShmemRegister();
+ VarsupShmemRegister();
+ ProcArrayShmemRegister();
+ SharedInvalShmemRegister();
+ PMSignalShmemRegister();
+ ProcSignalShmemRegister();
+}
+
/*
* Initialize various subsystems, setting up their data structures in
* shared memory.
@@ -259,14 +273,23 @@ CreateOrAttachShmemStructs(void)
*/
InitShmemIndex();
+#ifdef EXEC_BACKEND
+ if (IsUnderPostmaster)
+ ShmemAttachRegistered();
+ else
+#endif
+ {
+ ShmemInitRegistered();
+ }
+
dsm_shmem_init();
- DSMRegistryShmemInit();
+ //DSMRegistryShmemInit();
/*
* Set up xlog, clog, and buffers
*/
- VarsupShmemInit();
XLOGShmemInit();
+
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
CLOGShmemInit();
@@ -288,23 +311,13 @@ CreateOrAttachShmemStructs(void)
/*
* Set up process table
*/
- if (!IsUnderPostmaster)
- InitProcGlobal();
- ProcArrayShmemInit();
BackendStatusShmemInit();
TwoPhaseShmemInit();
BackgroundWorkerShmemInit();
- /*
- * Set up shared-inval messaging
- */
- SharedInvalShmemInit();
-
/*
* Set up interprocess signaling mechanisms
*/
- PMSignalShmemInit();
- ProcSignalShmemInit();
CheckpointerShmemInit();
AutoVacuumShmemInit();
ReplicationSlotsShmemInit();
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 4618820b337..23752500d16 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -80,9 +80,24 @@ struct PMSignalData
sig_atomic_t PMChildFlags[FLEXIBLE_ARRAY_MEMBER];
};
-/* PMSignalState pointer is valid in both postmaster and child processes */
+static void PMSignalShmemInit(void *);
+
+static ShmemStructDesc PMSignalShmemDesc = {
+ .name = "PMSignalState",
+ .size = 0, /* dynamic */
+ .init_fn = PMSignalShmemInit,
+};
+
+/*
+ * PMSignalState pointer is valid in both postmaster and child processes
+ *
+ * This is a stand-alone variable rather than just a #define over
+ * PMSignalShmemDesc.ptr because it is needed early at backend startup and
+ * passed as a backend parameter in EXEC_BACKEND mode
+ */
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
* postmaster. Postmaster keeps a local copy so that it doesn't need to
@@ -123,39 +138,28 @@ postmaster_death_handler(SIGNAL_ARGS)
static void MarkPostmasterChildInactive(int code, Datum arg);
/*
- * PMSignalShmemSize
- * Compute space needed for pmsignal.c's shared memory
+ * PMSignalShmemRegister - Register our shared memory
*/
-Size
-PMSignalShmemSize(void)
+void
+PMSignalShmemRegister(void)
{
Size size;
size = offsetof(PMSignalData, PMChildFlags);
size = add_size(size, mul_size(MaxLivePostmasterChildren(),
sizeof(sig_atomic_t)));
-
- return size;
+ PMSignalShmemDesc.size = size;
+ ShmemRegisterStruct(&PMSignalShmemDesc);
}
-/*
- * PMSignalShmemInit - initialize during shared-memory creation
- */
-void
-PMSignalShmemInit(void)
+static void
+PMSignalShmemInit(void *arg)
{
- bool found;
-
- PMSignalState = (PMSignalData *)
- ShmemInitStruct("PMSignalState", PMSignalShmemSize(), &found);
-
- if (!found)
- {
- /* initialize all flags to zeroes */
- MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemSize());
- num_child_flags = MaxLivePostmasterChildren();
- PMSignalState->num_child_flags = num_child_flags;
- }
+ /* initialize all flags to zeroes */
+ PMSignalState = PMSignalShmemDesc.ptr;
+ MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemDesc.size);
+ num_child_flags = MaxLivePostmasterChildren();
+ PMSignalState->num_child_flags = num_child_flags;
}
/*
@@ -291,6 +295,7 @@ RegisterPostmasterChildActive(void)
{
int slot = MyPMChildSlot;
+ Assert(PMSignalState);
Assert(slot > 0 && slot <= PMSignalState->num_child_flags);
slot--;
Assert(PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED);
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 301f54fb5a8..08c63bcb2a7 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -101,6 +101,18 @@ typedef struct ProcArrayStruct
int pgprocnos[FLEXIBLE_ARRAY_MEMBER];
} ProcArrayStruct;
+static void ProcArrayShmemInit(void *arg);
+static void ProcArrayShmemAttach(void *arg);
+
+static ShmemStructDesc ProcArrayShmemDesc = {
+ .name = "Proc Array",
+ .size = 0, /* dynamic */
+ .init_fn = ProcArrayShmemInit,
+ .attach_fn = ProcArrayShmemAttach,
+};
+
+#define procArray ((ProcArrayStruct *) ProcArrayShmemDesc.ptr)
+
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -267,9 +279,6 @@ typedef enum KAXCompressReason
KAX_STARTUP_PROCESS_IDLE, /* startup process is about to sleep */
} KAXCompressReason;
-
-static ProcArrayStruct *procArray;
-
static PGPROC *allProcs;
/*
@@ -280,8 +289,23 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
-static TransactionId *KnownAssignedXids;
-static bool *KnownAssignedXidsValid;
+
+static ShmemStructDesc KnownAssignedXidsShmemDesc = {
+ .name = "KnownAssignedXids",
+ .size = 0, /* dynamic */
+ .init_fn = NULL,
+};
+
+#define KnownAssignedXids ((TransactionId *) KnownAssignedXidsShmemDesc.ptr)
+
+static ShmemStructDesc KnownAssignedXidsValidShmemDesc = {
+ .name = "KnownAssignedXidsValid",
+ .size = 0, /* dynamic */
+ .init_fn = NULL,
+};
+
+#define KnownAssignedXidsValid ((bool *) KnownAssignedXidsValidShmemDesc.ptr)
+
static TransactionId latestObservedXid = InvalidTransactionId;
/*
@@ -372,18 +396,19 @@ static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
static void GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons);
/*
- * Report shared-memory space needed by ProcArrayShmemInit
+ * Register the shared PGPROC array during postmaster startup.
*/
-Size
-ProcArrayShmemSize(void)
+void
+ProcArrayShmemRegister(void)
{
- Size size;
-
- /* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, pgprocnos);
- size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
+ /* Create or attach to the ProcArray shared structure */
+ ProcArrayShmemDesc.size =
+ add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int),
+ PROCARRAY_MAXPROCS));
+ ShmemRegisterStruct(&ProcArrayShmemDesc);
/*
* During Hot Standby processing we have a data structure called
@@ -403,64 +428,38 @@ ProcArrayShmemSize(void)
if (EnableHotStandby)
{
- size = add_size(size,
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS));
- size = add_size(size,
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS));
+ KnownAssignedXidsShmemDesc.size =
+ mul_size(sizeof(TransactionId),
+ TOTAL_MAX_CACHED_SUBXIDS);
+ ShmemRegisterStruct(&KnownAssignedXidsShmemDesc);
+
+ KnownAssignedXidsValidShmemDesc.size =
+ mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS);
+ ShmemRegisterStruct(&KnownAssignedXidsValidShmemDesc);
}
-
- return size;
}
-/*
- * Initialize the shared PGPROC array during postmaster startup.
- */
-void
-ProcArrayShmemInit(void)
+static void
+ProcArrayShmemInit(void *arg)
{
- bool found;
-
- /* Create or attach to the ProcArray shared structure */
- procArray = (ProcArrayStruct *)
- ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, pgprocnos),
- mul_size(sizeof(int),
- PROCARRAY_MAXPROCS)),
- &found);
-
- if (!found)
- {
- /*
- * We're the first - initialize.
- */
- procArray->numProcs = 0;
- procArray->maxProcs = PROCARRAY_MAXPROCS;
- procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
- procArray->numKnownAssignedXids = 0;
- procArray->tailKnownAssignedXids = 0;
- procArray->headKnownAssignedXids = 0;
- procArray->lastOverflowedXid = InvalidTransactionId;
- procArray->replication_slot_xmin = InvalidTransactionId;
- procArray->replication_slot_catalog_xmin = InvalidTransactionId;
- TransamVariables->xactCompletionCount = 1;
- }
+ procArray->numProcs = 0;
+ procArray->maxProcs = PROCARRAY_MAXPROCS;
+ procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
+ procArray->numKnownAssignedXids = 0;
+ procArray->tailKnownAssignedXids = 0;
+ procArray->headKnownAssignedXids = 0;
+ procArray->lastOverflowedXid = InvalidTransactionId;
+ procArray->replication_slot_xmin = InvalidTransactionId;
+ procArray->replication_slot_catalog_xmin = InvalidTransactionId;
+ TransamVariables->xactCompletionCount = 1;
allProcs = ProcGlobal->allProcs;
+}
- /* Create or attach to the KnownAssignedXids arrays too, if needed */
- if (EnableHotStandby)
- {
- KnownAssignedXids = (TransactionId *)
- ShmemInitStruct("KnownAssignedXids",
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- KnownAssignedXidsValid = (bool *)
- ShmemInitStruct("KnownAssignedXidsValid",
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- }
+static void
+ProcArrayShmemAttach(void *arg)
+{
+ allProcs = ProcGlobal->allProcs;
}
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 8e56922dcea..5743f088324 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -102,7 +102,16 @@ struct ProcSignalHeader
#define BARRIER_CLEAR_BIT(flags, type) \
((flags) &= ~(((uint32) 1) << (uint32) (type)))
-NON_EXEC_STATIC ProcSignalHeader *ProcSignal = NULL;
+static void ProcSignalShmemInit(void *arg);
+
+static ShmemStructDesc ProcSignalShmemDesc = {
+ .name = "ProcSignal",
+ .size = 0, /* dynamic */
+ .init_fn = ProcSignalShmemInit,
+};
+
+#define ProcSignal ((ProcSignalHeader *) ProcSignalShmemDesc.ptr)
+
static ProcSignalSlot *MyProcSignalSlot = NULL;
static bool CheckProcSignal(ProcSignalReason reason);
@@ -110,51 +119,37 @@ static void CleanupProcSignalState(int status, Datum arg);
static void ResetProcSignalBarrierBits(uint32 flags);
/*
- * ProcSignalShmemSize
- * Compute space needed for ProcSignal's shared memory
+ * ProcSignalShmemRegister
+ * Register ProcSignal's shared memory needs at postmaster startup
*/
-Size
-ProcSignalShmemSize(void)
+void
+ProcSignalShmemRegister(void)
{
Size size;
size = mul_size(NumProcSignalSlots, sizeof(ProcSignalSlot));
size = add_size(size, offsetof(ProcSignalHeader, psh_slot));
- return size;
+
+ ProcSignalShmemDesc.size = size;
+ ShmemRegisterStruct(&ProcSignalShmemDesc);
}
-/*
- * ProcSignalShmemInit
- * Allocate and initialize ProcSignal's shared memory
- */
-void
-ProcSignalShmemInit(void)
+static void
+ProcSignalShmemInit(void *arg)
{
- Size size = ProcSignalShmemSize();
- bool found;
+ pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
- ProcSignal = (ProcSignalHeader *)
- ShmemInitStruct("ProcSignal", size, &found);
-
- /* If we're first, initialize. */
- if (!found)
+ for (int i = 0; i < NumProcSignalSlots; ++i)
{
- int i;
-
- pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
+ ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
- for (i = 0; i < NumProcSignalSlots; ++i)
- {
- ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
-
- SpinLockInit(&slot->pss_mutex);
- pg_atomic_init_u32(&slot->pss_pid, 0);
- slot->pss_cancel_key_len = 0;
- MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
- pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
- pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
- ConditionVariableInit(&slot->pss_barrierCV);
- }
+ SpinLockInit(&slot->pss_mutex);
+ pg_atomic_init_u32(&slot->pss_pid, 0);
+ slot->pss_cancel_key_len = 0;
+ MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
+ pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
+ pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
+ ConditionVariableInit(&slot->pss_barrierCV);
}
}
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 9f362ce8641..faa0fcbd21e 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,6 +19,8 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
+ * FIXME: NOTES below are outdated
+ *
* NOTES:
* (a) There are three kinds of shared memory data structures
* available to POSTGRES: fixed-size structures, queues and hash
@@ -76,6 +78,16 @@
#include "storage/spin.h"
#include "utils/builtins.h"
+/* size constants for the shmem index table */
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE (48)
+ /* estimated size of the shmem index table (not a hard limit) */
+#define SHMEM_INDEX_SIZE (64)
+
+/* these are in postmaster private memory */
+static ShmemStructDesc *registry[SHMEM_INDEX_SIZE];
+static int num_registrations = 0;
+
/*
* This is the first data structure stored in the shared memory segment, at
* the offset that PGShmemHeader->content_offset points to. Allocations by
@@ -95,6 +107,9 @@ typedef struct ShmemAllocatorData
static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void shmem_hash_init(void *arg);
+static void shmem_hash_attach(void *arg);
+
/* shared memory global variables */
static PGShmemHeader *ShmemSegHdr; /* shared mem segment header */
@@ -103,13 +118,137 @@ static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
slock_t *ShmemLock; /* points to ShmemAllocator->shmem_lock */
-static HTAB *ShmemIndex = NULL; /* primary index hashtable for shmem */
+
+
+static ShmemHashDesc ShmemIndexHashDesc = {
+ .name = "ShmemIndex",
+ .init_size = SHMEM_INDEX_SIZE,
+ .max_size = SHMEM_INDEX_SIZE,
+};
+
+ /* primary index hashtable for shmem */
+#define ShmemIndex (ShmemIndexHashDesc.ptr)
+
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
Datum pg_numa_available(PG_FUNCTION_ARGS);
+
+void
+ShmemRegisterStruct(ShmemStructDesc *desc)
+{
+ elog(DEBUG2, "REGISTER: %s with size %zu", desc->name, desc->size);
+
+ registry[num_registrations++] = desc;
+}
+
+size_t
+ShmemRegisteredSize(void)
+{
+ size_t size;
+
+ size = 0;
+ for (int i = 0; i < num_registrations; i++)
+ {
+ size = add_size(size, registry[i]->size);
+ size = add_size(size, registry[i]->extra_size);
+ }
+
+ elog(DEBUG2, "SIZE: total %zu", size);
+
+ return size;
+}
+
+void
+ShmemInitRegistered(void)
+{
+ /* Should be called only by the postmaster or a standalone backend. */
+ Assert(!IsUnderPostmaster);
+
+ for (int i = 0; i < num_registrations; i++)
+ {
+ size_t allocated_size;
+ void *structPtr;
+ bool found;
+ ShmemIndexEnt *result;
+
+ elog(DEBUG2, "INIT [%d/%d]: %s", i, num_registrations, registry[i]->name);
+
+ /* look it up in the shmem index */
+ result = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, registry[i]->name, HASH_ENTER_NULL, &found);
+ if (!result)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ registry[i]->name)));
+ }
+ if (found)
+ elog(ERROR, "shmem struct \"%s\" is already initialized", registry[i]->name);
+
+ /* allocate and initialize it */
+ structPtr = ShmemAllocRaw(registry[i]->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, registry[i]->name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ registry[i]->name, registry[i]->size)));
+ }
+ result->size = registry[i]->size;
+ result->allocated_size = allocated_size;
+ result->location = structPtr;
+
+ registry[i]->ptr = structPtr;
+ if (registry[i]->init_fn)
+ registry[i]->init_fn(registry[i]->init_fn_arg);
+ }
+}
+
+#ifdef EXEC_BACKEND
+void
+ShmemAttachRegistered(void)
+{
+ /* Must be initializing a (non-standalone) backend */
+ Assert(IsUnderPostmaster);
+ Assert(ShmemAllocator->index != NULL);
+
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+
+ for (int i = 0; i < num_registrations; i++)
+ {
+ bool found;
+ ShmemIndexEnt *result;
+
+ elog(LOG, "ATTACH [%d/%d]: %s", i, num_registrations, registry[i]->name);
+
+ /* look it up in the shmem index */
+ result = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, registry[i]->name, HASH_FIND, &found);
+ if (!found)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg("could not find ShmemIndex entry for data structure \"%s\"",
+ registry[i]->name)));
+ }
+
+ registry[i]->ptr = result->location;
+
+ if (registry[i]->attach_fn)
+ registry[i]->attach_fn(registry[i]->attach_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+}
+#endif
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
@@ -292,6 +431,98 @@ InitShmemIndex(void)
HASH_ELEM | HASH_STRINGS);
}
+/*
+ * ShmemRegisterHash -- Register a shared memory hash table, to be created
+ * and initialized at startup, or attached to by child processes.
+ *
+ * We assume caller is doing some kind of synchronization
+ * so that two processes don't try to create/initialize the same
+ * table at once. (In practice, all creations are done in the postmaster
+ * process; child processes should always be attaching to existing tables.)
+ *
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ *
+ * init_size is the number of hashtable entries to preallocate. For a table
+ * whose maximum size is certain, this should be equal to max_size; that
+ * ensures that no run-time out-of-shared-memory failures can occur.
+ *
+ * *infoP and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values specific
+ * to shared-memory hash tables are added here, except that callers may
+ * choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
+ *
+ * Note: before Postgres 9.0, this function returned NULL for some failure
+ * cases. Now, it always throws error instead, so callers need not check
+ * for NULL.
+ */
+void
+ShmemRegisterHash(ShmemHashDesc *desc, /* configuration */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ /*
+ * Hash tables allocated in shared memory have a fixed directory; it can't
+ * grow or other backends wouldn't be able to find it. So, make sure we
+ * make it big enough to start with.
+ *
+ * The shared memory allocator must be specified too.
+ */
+ infoP->dsize = infoP->max_dsize = hash_select_dirsize(desc->max_size);
+ infoP->alloc = ShmemAllocNoError;
+ hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+
+ /* set up the underlying struct descriptor */
+ memset(&desc->base_desc, 0, sizeof(desc->base_desc));
+ desc->base_desc.name = desc->name;
+ desc->base_desc.size = hash_get_shared_size(infoP, hash_flags);
+ desc->base_desc.init_fn = shmem_hash_init;
+ desc->base_desc.init_fn_arg = desc;
+ desc->base_desc.attach_fn = shmem_hash_attach;
+ desc->base_desc.attach_fn_arg = desc;
+
+ desc->base_desc.extra_size = hash_estimate_size(desc->max_size, infoP->entrysize) - desc->base_desc.size;
+
+ desc->hash_flags = hash_flags;
+ desc->infoP = MemoryContextAlloc(TopMemoryContext, sizeof(HASHCTL));
+ memcpy(desc->infoP, infoP, sizeof(HASHCTL));
+
+ ShmemRegisterStruct(&desc->base_desc);
+}
+
+static void
+shmem_hash_init(void *arg)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) arg;
+ int hash_flags = desc->hash_flags;
+
+ /* Pass location of hashtable header to hash_create */
+ desc->ptr = desc->base_desc.ptr;
+ desc->infoP->hctl = (HASHHDR *) desc->ptr;
+
+ desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
+}
+
+static void
+shmem_hash_attach(void *arg)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) arg;
+ int hash_flags = desc->hash_flags;
+
+ /*
+ * if it already exists, attach to it rather than allocate and initialize
+ * new space
+ */
+ hash_flags |= HASH_ATTACH;
+
+ /* Pass location of hashtable header to hash_create */
+ desc->infoP->hctl = (HASHHDR *) desc->ptr;
+
+ desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
+}
+
/*
* ShmemInitHash -- Create and initialize, or attach to, a
* shared memory hash table.
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index a7a7cc4f0a9..0fe0f256971 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -203,7 +203,16 @@ typedef struct SISeg
*/
#define NumProcStateSlots (MaxBackends + NUM_AUXILIARY_PROCS)
-static SISeg *shmInvalBuffer; /* pointer to the shared inval buffer */
+static void SharedInvalShmemInit(void *arg);
+
+static ShmemStructDesc SharedInvalShmemDesc = {
+ .name = "shmInvalBuffer",
+ .size = 0, /* dynamic */
+ .init_fn = SharedInvalShmemInit,
+};
+
+/* pointer to the shared inval buffer */
+#define shmInvalBuffer ((SISeg *) SharedInvalShmemDesc.ptr)
static LocalTransactionId nextLocalTransactionId;
@@ -212,10 +221,11 @@ static void CleanupInvalidationState(int status, Datum arg);
/*
- * SharedInvalShmemSize --- return shared-memory space needed
+ * SharedInvalShmemRegister
+ * Register shared memory needs for the SI message buffer
*/
-Size
-SharedInvalShmemSize(void)
+void
+SharedInvalShmemRegister(void)
{
Size size;
@@ -223,26 +233,17 @@ SharedInvalShmemSize(void)
size = add_size(size, mul_size(sizeof(ProcState), NumProcStateSlots)); /* procState */
size = add_size(size, mul_size(sizeof(int), NumProcStateSlots)); /* pgprocnos */
- return size;
+ /* Register the space needed in shared memory */
+ SharedInvalShmemDesc.size = size;
+ ShmemRegisterStruct(&SharedInvalShmemDesc);
}
-/*
- * SharedInvalShmemInit
- * Create and initialize the SI message buffer
- */
-void
-SharedInvalShmemInit(void)
+static void
+SharedInvalShmemInit(void *arg)
{
int i;
- bool found;
-
- /* Allocate space in shared memory */
- shmInvalBuffer = (SISeg *)
- ShmemInitStruct("shmInvalBuffer", SharedInvalShmemSize(), &found);
- if (found)
- return;
- /* Clear message counters, save size of procState array, init spinlock */
+ /* Clear message counters, save size of procState array FIXME, init spinlock */
shmInvalBuffer->minMsgNum = 0;
shmInvalBuffer->maxMsgNum = 0;
shmInvalBuffer->nextThreshold = CLEANUP_MIN;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index c7a001b3b79..85375b5195e 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -73,13 +73,33 @@ PGPROC *MyProc = NULL;
* relatively infrequently (only at backend startup or shutdown) and not for
* very long, so a spinlock is okay.
*/
-NON_EXEC_STATIC slock_t *ProcStructLock = NULL;
+#define ProcStructLock (&ProcGlobal->freeProcsLock)
+
+static void ProcGlobalShmemInit(void *arg);
+
+static ShmemStructDesc ProcGlobalShmemDesc = {
+ .name = "Proc Header",
+ .size = sizeof(PROC_HDR),
+ .init_fn = ProcGlobalShmemInit,
+};
+
+static ShmemStructDesc ProcGlobalAllProcsShmemDesc = {
+ .name = "PGPROC structures",
+ .size = 0, /* dynamic */
+};
+
+static ShmemStructDesc FastPathLockArrayShmemDesc = {
+ .name = "Fast-Path Lock Array",
+ .size = 0, /* dynamic */
+};
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
PGPROC *PreparedXactProcs = NULL;
+static uint32 TotalProcs;
+
static DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
/* Is a deadlock check pending? */
@@ -91,24 +111,6 @@ static void AuxiliaryProcKill(int code, Datum arg);
static void CheckDeadLock(void);
-/*
- * Report shared-memory space needed by PGPROC.
- */
-static Size
-PGProcShmemSize(void)
-{
- Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
-
- size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
-
- return size;
-}
-
/*
* Report shared-memory space needed by Fast-Path locks.
*/
@@ -116,8 +118,6 @@ static Size
FastPathLockShmemSize(void)
{
Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
Size fpLockBitsSize,
fpRelIdSize;
@@ -133,25 +133,6 @@ FastPathLockShmemSize(void)
return size;
}
-/*
- * Report shared-memory space needed by InitProcGlobal.
- */
-Size
-ProcGlobalShmemSize(void)
-{
- Size size = 0;
-
- /* ProcGlobal */
- size = add_size(size, sizeof(PROC_HDR));
- size = add_size(size, sizeof(slock_t));
-
- size = add_size(size, PGSemaphoreShmemSize(ProcGlobalSemas()));
- size = add_size(size, PGProcShmemSize());
- size = add_size(size, FastPathLockShmemSize());
-
- return size;
-}
-
/*
* Report number of semaphores needed by InitProcGlobal.
*/
@@ -186,35 +167,63 @@ ProcGlobalSemas(void)
* implementation typically requires us to create semaphores in the
* postmaster, not in backends.
*
- * Note: this is NOT called by individual backends under a postmaster,
+ * Note: this is NOT called by individual backends under a postmaster, XXX
* not even in the EXEC_BACKEND case. The ProcGlobal and AuxiliaryProcs
* pointers must be propagated specially for EXEC_BACKEND operation.
*/
void
-InitProcGlobal(void)
+ProcGlobalShmemRegister(void)
+{
+ Size size = 0;
+
+ /*
+ * Reserve all the PGPROC structures we'll need. There are
+ * six separate consumers: (1) normal backends, (2) autovacuum workers and
+ * special workers, (3) background workers, (4) walsenders, (5) auxiliary
+ * processes, and (6) prepared transactions. (For largely-historical
+ * reasons, we combine autovacuum and special workers into one category
+ * with a single freelist.) Each PGPROC structure is dedicated to exactly
+ * one of these purposes, and they do not move between groups.
+ */
+ TotalProcs =
+ add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
+
+ size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
+
+ /* FIXME: the sizeofs look dangerous because ProcGlobal is not initialized yet */
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
+
+ ProcGlobalAllProcsShmemDesc.size = size;
+ ShmemRegisterStruct(&ProcGlobalAllProcsShmemDesc);
+
+ FastPathLockArrayShmemDesc.size = FastPathLockShmemSize();
+ ShmemRegisterStruct(&FastPathLockArrayShmemDesc);
+
+ /*
+ * Register the ProcGlobal shared structure last. Its init callback
+ * initializes the others too.
+ */
+ ShmemRegisterStruct(&ProcGlobalShmemDesc);
+}
+
+static void
+ProcGlobalShmemInit(void *arg)
{
+ char *ptr;
+ size_t requestSize;
PGPROC *procs;
int i,
j;
- bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
-
/* Used for setup of per-backend fast-path slots. */
char *fpPtr,
*fpEndPtr PG_USED_FOR_ASSERTS_ONLY;
Size fpLockBitsSize,
fpRelIdSize;
- Size requestSize;
- char *ptr;
- /* Create the ProcGlobal shared structure */
- ProcGlobal = (PROC_HDR *)
- ShmemInitStruct("Proc Header", sizeof(PROC_HDR), &found);
- Assert(!found);
+ ProcGlobal = ProcGlobalShmemDesc.ptr;
- /*
- * Initialize the data structures.
- */
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
dlist_init(&ProcGlobal->freeProcs);
dlist_init(&ProcGlobal->autovacFreeProcs);
@@ -225,23 +234,11 @@ InitProcGlobal(void)
ProcGlobal->checkpointerProc = INVALID_PROC_NUMBER;
pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PROC_NUMBER);
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
+ SpinLockInit(ProcStructLock);
- /*
- * Create and initialize all the PGPROC structures we'll need. There are
- * six separate consumers: (1) normal backends, (2) autovacuum workers and
- * special workers, (3) background workers, (4) walsenders, (5) auxiliary
- * processes, and (6) prepared transactions. (For largely-historical
- * reasons, we combine autovacuum and special workers into one category
- * with a single freelist.) Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
- */
- requestSize = PGProcShmemSize();
-
- ptr = ShmemInitStruct("PGPROC structures",
- requestSize,
- &found);
-
- MemSet(ptr, 0, requestSize);
+ ptr = ProcGlobalAllProcsShmemDesc.ptr;
+ requestSize = ProcGlobalAllProcsShmemDesc.size;
+ memset(ptr, 0, requestSize);
procs = (PGPROC *) ptr;
ptr = ptr + TotalProcs * sizeof(PGPROC);
@@ -277,20 +274,13 @@ InitProcGlobal(void)
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- requestSize = FastPathLockShmemSize();
-
- fpPtr = ShmemInitStruct("Fast-Path Lock Array",
- requestSize,
- &found);
-
- MemSet(fpPtr, 0, requestSize);
+ fpPtr = FastPathLockArrayShmemDesc.ptr;
+ requestSize = FastPathLockArrayShmemDesc.size;
+ memset(fpPtr, 0, requestSize);
/* For asserts checking we did not overflow. */
fpEndPtr = fpPtr + requestSize;
- /* Reserve space for semaphores. */
- PGReserveSemaphores(ProcGlobalSemas());
-
for (i = 0; i < TotalProcs; i++)
{
PGPROC *proc = &procs[i];
@@ -380,12 +370,6 @@ InitProcGlobal(void)
*/
AuxiliaryProcs = &procs[MaxBackends];
PreparedXactProcs = &procs[MaxBackends + NUM_AUXILIARY_PROCS];
-
- /* Create ProcStructLock spinlock, too */
- ProcStructLock = (slock_t *) ShmemInitStruct("ProcStructLock spinlock",
- sizeof(slock_t),
- &found);
- SpinLockInit(ProcStructLock);
}
/*
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 02e9aaa6bca..eed188416ee 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4117,6 +4117,8 @@ PostgresSingleUserMain(int argc, char *argv[],
* shared memory, determine the value of any runtime-computed GUCs that
* depend on the amount of shared memory required.
*/
+ RegisterShmemStructs();
+
InitializeShmemGUCs();
/*
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6fa91bfcdc0..49d476e9d5c 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -15,7 +15,9 @@
#define TRANSAM_H
#include "access/xlogdefs.h"
-
+#ifndef FRONTEND
+#include "storage/shmem.h"
+#endif
/* ----------------
* Special transaction ID values
@@ -330,7 +332,10 @@ TransactionIdFollowsOrEquals(TransactionId id1, TransactionId id2)
extern bool TransactionStartedDuringRecovery(void);
/* in transam/varsup.c */
-extern PGDLLIMPORT TransamVariablesData *TransamVariables;
+#ifndef FRONTEND
+extern PGDLLIMPORT struct ShmemStructDesc TransamVariablesShmemDesc;
+#define TransamVariables ((TransamVariablesData *) TransamVariablesShmemDesc.ptr)
+#endif
/*
* prototypes for functions in transam/transam.c
@@ -345,8 +350,7 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
-extern Size VarsupShmemSize(void);
-extern void VarsupShmemInit(void);
+extern void VarsupShmemRegister(void);
extern FullTransactionId GetNewTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
diff --git a/src/include/storage/dsm_registry.h b/src/include/storage/dsm_registry.h
index 506fae2c9ca..9a1b4d982af 100644
--- a/src/include/storage/dsm_registry.h
+++ b/src/include/storage/dsm_registry.h
@@ -22,7 +22,6 @@ extern dsa_area *GetNamedDSA(const char *name, bool *found);
extern dshash_table *GetNamedDSHash(const char *name,
const dshash_parameters *params,
bool *found);
-extern Size DSMRegistryShmemSize(void);
-extern void DSMRegistryShmemInit(void);
+extern void DSMRegistryShmemRegister(void);
#endif /* DSM_REGISTRY_H */
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index da32787ab51..8a3b71ad5d3 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
/* ipci.c */
extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
+extern void RegisterShmemStructs(void);
extern Size CalculateShmemSize(void);
extern void CreateSharedMemoryAndSemaphores(void);
#ifdef EXEC_BACKEND
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index 206fb78f8a5..7cdc4852334 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -66,8 +66,7 @@ extern PGDLLIMPORT volatile PMSignalData *PMSignalState;
/*
* prototypes for functions in pmsignal.c
*/
-extern Size PMSignalShmemSize(void);
-extern void PMSignalShmemInit(void);
+extern void PMSignalShmemRegister(void);
extern void SendPostmasterSignal(PMSignalReason reason);
extern bool CheckPostmasterSignal(PMSignalReason reason);
extern void SetQuitSignalReason(QuitSignalReason reason);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 679f0624f92..37023e1a93f 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -418,6 +418,9 @@ typedef struct PROC_HDR
dlist_head bgworkerFreeProcs;
/* Head of list of walsender free PGPROC structures */
dlist_head walsenderFreeProcs;
+
+ slock_t freeProcsLock;
+
/* First pgproc waiting for group XID clear */
pg_atomic_uint32 procArrayGroupFirst;
/* First pgproc waiting for group transaction status update */
@@ -488,7 +491,7 @@ extern PGDLLIMPORT PGPROC *AuxiliaryProcs;
* Function Prototypes
*/
extern int ProcGlobalSemas(void);
-extern Size ProcGlobalShmemSize(void);
+extern void ProcGlobalShmemRegister(void);
extern void InitProcGlobal(void);
extern void InitProcess(void);
extern void InitProcessPhase2(void);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index 3a8593f87ba..41753c3a630 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -20,8 +20,7 @@
#include "utils/snapshot.h"
-extern Size ProcArrayShmemSize(void);
-extern void ProcArrayShmemInit(void);
+extern void ProcArrayShmemRegister(void);
extern void ProcArrayAdd(PGPROC *proc);
extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index e52b8eb7697..f2df1f30c5f 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -71,8 +71,7 @@ typedef enum
/*
* prototypes for functions in procsignal.c
*/
-extern Size ProcSignalShmemSize(void);
-extern void ProcSignalShmemInit(void);
+extern void ProcSignalShmemRegister(void);
extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 89d45287c17..40e2fc17056 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -24,6 +24,53 @@
#include "storage/spin.h"
#include "utils/hsearch.h"
+typedef void (*ShmemInitCallback) (void *arg);
+typedef void (*ShmemAttachCallback) (void *arg);
+
+/*
+ * Descriptor for a named area or struct in shared memory
+ */
+typedef struct ShmemStructDesc
+{
+ /* Name of the shared memory area. Must be unique across the system */
+ const char *name;
+
+ size_t size;
+
+ size_t alignment;
+ ShmemInitCallback init_fn;
+ ShmemAttachCallback attach_fn;
+ void *init_fn_arg;
+ void *attach_fn_arg;
+
+ /*
+ * Extra space to allocate in the shared memory segment that is not
+ * part of the struct itself. This is used for shared memory hash tables
+ * that can grow beyond the initial size when more buckets are allocated.
+ */
+ size_t extra_size;
+
+ /* Pointer to the shared memory area, when it's allocated. */
+ void *ptr;
+} ShmemStructDesc;
+
+/*
+ * Descriptor for shared memory hash table
+ */
+typedef struct ShmemHashDesc
+{
+ const char *name;
+
+ int hash_flags;
+
+ size_t init_size; /* initial number of entries */
+ size_t max_size; /* max number of entries */
+ HASHCTL *infoP;
+
+ HTAB *ptr;
+
+ ShmemStructDesc base_desc;
+} ShmemHashDesc;
/* shmem.c */
extern PGDLLIMPORT slock_t *ShmemLock;
@@ -34,9 +81,19 @@ extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
extern bool ShmemAddrIsValid(const void *addr);
extern void InitShmemIndex(void);
+
+extern void ShmemRegisterHash(ShmemHashDesc *desc, HASHCTL *infoP, int hash_flags);
+extern void ShmemRegisterStruct(ShmemStructDesc *desc);
+
+/* Legacy functions */
extern HTAB *ShmemInitHash(const char *name, int64 init_size, int64 max_size,
HASHCTL *infoP, int hash_flags);
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
+
+extern size_t ShmemRegisteredSize(void);
+extern void ShmemInitRegistered(void);
+extern void ShmemAttachRegistered(void);
+
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
diff --git a/src/include/storage/sinvaladt.h b/src/include/storage/sinvaladt.h
index a1694500a85..4edba2936e6 100644
--- a/src/include/storage/sinvaladt.h
+++ b/src/include/storage/sinvaladt.h
@@ -28,8 +28,7 @@
/*
* prototypes for functions in sinvaladt.c
*/
-extern Size SharedInvalShmemSize(void);
-extern void SharedInvalShmemInit(void);
+extern void SharedInvalShmemRegister(void);
extern void SharedInvalBackendInit(bool sendOnly);
extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
base-commit: c67bef3f3252a3a38bf347f9f119944176a796ce
--
2.34.1
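
For readers skimming the 0001 patch, the register / size / init lifecycle it introduces can be modelled outside PostgreSQL roughly as follows. This is only a sketch under stated assumptions: every `Demo*` / `demo_*` name is hypothetical, `calloc()` stands in for the real shared memory segment, and the bounds check in `demo_register_struct()` is an addition of the sketch, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the patch's ShmemStructDesc. */
typedef struct DemoStructDesc
{
	const char *name;
	size_t		size;
	void		(*init_fn) (void *arg);
	void	   *init_fn_arg;
	void	   *ptr;			/* filled in once the area is allocated */
} DemoStructDesc;

#define DEMO_MAX_REGISTRATIONS 64

static DemoStructDesc *demo_registry[DEMO_MAX_REGISTRATIONS];
static int	demo_num_registrations = 0;

/* Round a size up so that consecutive areas stay suitably aligned. */
static size_t
demo_align(size_t size)
{
	size_t		a = _Alignof(max_align_t);

	return (size + a - 1) & ~(a - 1);
}

/* Phase 1: remember the descriptor; nothing is allocated yet. */
static int
demo_register_struct(DemoStructDesc *desc)
{
	assert(desc != NULL);
	if (demo_num_registrations >= DEMO_MAX_REGISTRATIONS)
		return -1;				/* registry full (check added by the sketch) */
	demo_registry[demo_num_registrations++] = desc;
	return 0;
}

/* Phase 2: total space needed by everything registered so far. */
static size_t
demo_registered_size(void)
{
	size_t		total = 0;

	for (int i = 0; i < demo_num_registrations; i++)
		total += demo_align(demo_registry[i]->size);
	return total;
}

/* Phase 3: carve each area out of one block and run its init callback. */
static void *
demo_init_registered(void)
{
	size_t		total = demo_registered_size();
	char	   *base = calloc(1, total > 0 ? total : 1);
	char	   *next = base;

	if (base == NULL)
		return NULL;
	for (int i = 0; i < demo_num_registrations; i++)
	{
		DemoStructDesc *desc = demo_registry[i];

		desc->ptr = next;
		next += demo_align(desc->size);
		if (desc->init_fn)
			desc->init_fn(desc->init_fn_arg);
	}
	return base;
}

/* A hypothetical subsystem using the descriptor, in the style of 0001. */
typedef struct DemoCounters
{
	int			numProcs;
	int			maxProcs;
} DemoCounters;

static void demo_counters_init(void *arg);

static DemoStructDesc demo_counters_desc = {
	.name = "Demo Counters",
	.size = sizeof(DemoCounters),
	.init_fn = demo_counters_init,
};

/* 0001-style accessor: a macro over the descriptor's ptr field. */
#define DemoCountersPtr ((DemoCounters *) demo_counters_desc.ptr)

static void
demo_counters_init(void *arg)
{
	(void) arg;
	DemoCountersPtr->numProcs = 0;
	DemoCountersPtr->maxProcs = 8;
}
```

A caller would invoke `demo_register_struct(&demo_counters_desc)` first, then `demo_registered_size()` to size the segment, and finally `demo_init_registered()`, after which `DemoCountersPtr` is usable. The point of the split is that sizes are known before any allocation happens, which is what lets the size calculation and the initialization share one descriptor.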
[text/x-patch] 0002-Get-rid-of-global-shared-memory-pointer-mac-20260223.patch (15.0K, 3-0002-Get-rid-of-global-shared-memory-pointer-mac-20260223.patch)
From 395a95e9934286869b1fe8d45dc5a155ea9be030 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Tue, 10 Feb 2026 20:26:31 +0530
Subject: [PATCH 2/3] Get rid of global shared memory pointer macro
declarations
---
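
Note for readers: the change in this patch, as far as the hunks below show, is that a descriptor's `.ptr` no longer holds the area's address itself but the address of an ordinary typed global, which the shmem code fills in. A minimal sketch of that pattern, assuming `.ptr` becomes a `void **` (the shmem.h hunk is what confirms the actual type); all `Demo*` / `demo_*` names are hypothetical and `calloc()` stands in for shared memory:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical descriptor: ptr is the address of the caller's global. */
typedef struct DemoPtrDesc
{
	const char *name;
	size_t		size;
	void	  **ptr;
} DemoPtrDesc;

typedef struct DemoState
{
	int			generation;
} DemoState;

/* An ordinary typed pointer, usable directly and visible to debuggers. */
static DemoState *demo_state = NULL;

static DemoPtrDesc demo_state_desc = {
	.name = "Demo State",
	.size = sizeof(DemoState),
	/*
	 * Same cast style as the patch.  This relies on all object pointers
	 * sharing one representation; PostgreSQL builds with
	 * -fno-strict-aliasing.
	 */
	.ptr = (void **) &demo_state,
};

/* Allocate the area and publish it through the descriptor. */
static int
demo_attach_area(DemoPtrDesc *desc)
{
	void	   *area;

	assert(desc != NULL && desc->ptr != NULL);
	area = calloc(1, desc->size);
	if (area == NULL)
		return -1;
	*desc->ptr = area;			/* writes into demo_state */
	return 0;
}
```

After `demo_attach_area(&demo_state_desc)` succeeds, `demo_state->generation` can be read without going through a macro. The trade-off against the macro approach in 0001 is one extra indirection inside the registry code in exchange for plain variables at the use sites.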
.../pg_stat_statements/pg_stat_statements.c | 10 ++++---
src/backend/access/transam/varsup.c | 5 ++++
src/backend/storage/ipc/dsm.c | 5 ++--
src/backend/storage/ipc/dsm_registry.c | 4 ++-
src/backend/storage/ipc/pmsignal.c | 18 +++++-------
src/backend/storage/ipc/procarray.c | 13 +++++----
src/backend/storage/ipc/procsignal.c | 4 ++-
src/backend/storage/ipc/shmem.c | 29 ++++++++++++-------
src/backend/storage/ipc/sinvaladt.c | 7 +++--
src/backend/storage/lmgr/proc.c | 25 ++++++++++------
src/include/access/transam.h | 2 +-
src/include/storage/shmem.h | 6 ++--
12 files changed, 77 insertions(+), 51 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 71debc8b47f..73fdf561419 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -258,24 +258,26 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+/* Links to shared memory state */
+static pgssSharedState *pgss = NULL;
+static HTAB *pgss_hash = NULL;
+
static void pgss_shmem_init(void *arg);
static ShmemStructDesc pgssSharedStateShmemDesc = {
.name = "pg_stat_statements",
.size = sizeof(pgssSharedState),
.init_fn = pgss_shmem_init,
+ .ptr = (void **) &pgss,
};
static ShmemHashDesc pgssSharedHashDesc = {
.name = "pg_stat_statements hash",
.init_size = 0, /* set from 'pgss_max' */
.max_size = 0, /* set from 'pgss_max' */
+ .ptr = &pgss_hash,
};
-/* Links to shared memory state */
-#define pgss ((pgssSharedState *) pgssSharedStateShmemDesc.ptr)
-#define pgss_hash (pgssSharedHashDesc.ptr)
-
/*---- Local variables ----*/
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 11ad90e7372..3dfda875e80 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -32,10 +32,14 @@
static void VarsupShmemInit(void *arg);
+/* pointer to variables struct in shared memory */
+TransamVariablesData *TransamVariables = NULL;
+
ShmemStructDesc TransamVariablesShmemDesc = {
.name = "TransamVariables",
.size = sizeof(TransamVariablesData),
.init_fn = VarsupShmemInit,
+ .ptr = (void **) &TransamVariables,
};
/*
@@ -49,6 +53,7 @@ VarsupShmemRegister(void)
static void
VarsupShmemInit(void *arg)
+
{
memset(TransamVariables, 0, sizeof(TransamVariablesData));
}
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 55f46c7687e..73644ec3bbb 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -110,14 +110,15 @@ static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
static void dsm_main_space_init(void *);
+static void *dsm_main_space_begin = NULL;
+
static ShmemStructDesc dsm_main_space_shmem_desc = {
.name = "Preallocated DSM",
.size = 0, /* dynamic */
.init_fn = dsm_main_space_init,
+ .ptr = &dsm_main_space_begin,
};
-#define dsm_main_space_begin (dsm_main_space_shmem_desc.ptr)
-
/*
* List of dynamic shared memory segments used by this backend.
*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 882af83b7b2..1659e1dd71d 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -56,13 +56,15 @@ typedef struct DSMRegistryCtxStruct
static void DSMRegistryCtxShmemInit(void *arg);
+DSMRegistryCtxStruct *DSMRegistryCtx = NULL;
+
static ShmemStructDesc DSMRegistryCtxShmemDesc = {
.name = "DSM Registry Data",
.size = sizeof(DSMRegistryCtxStruct),
.init_fn = DSMRegistryCtxShmemInit,
+ .ptr = (void **) &DSMRegistryCtx,
};
-#define DSMRegistryCtx ((DSMRegistryCtxStruct *) DSMRegistryCtxShmemDesc.ptr)
typedef struct NamedDSMState
{
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 23752500d16..3aa0380eadd 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -82,21 +82,17 @@ struct PMSignalData
static void PMSignalShmemInit(void *);
-static ShmemStructDesc PMSignalShmemDesc = {
- .name = "PMSignalState",
- .size = 0, /* dynamic */
- .init_fn = PMSignalShmemInit,
-};
-
/*
* PMSignalState pointer is valid in both postmaster and child processes
- *
- * This is a stand-alone variable rather than just a #define over
- * PMSignalShmemDesc.ptr because it is needed early at backend startup and
- * passed as a backend parameter in EXEC_BACKEND mode
*/
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+static ShmemStructDesc PMSignalShmemDesc = {
+ .name = "PMSignalState",
+ .size = 0, /* dynamic */
+ .init_fn = PMSignalShmemInit,
+ .ptr = (void **) &PMSignalState,
+};
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
@@ -156,7 +152,7 @@ static void
PMSignalShmemInit(void *arg)
{
/* initialize all flags to zeroes */
- PMSignalState = PMSignalShmemDesc.ptr;
+ Assert(PMSignalState);
MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemDesc.size);
num_child_flags = MaxLivePostmasterChildren();
PMSignalState->num_child_flags = num_child_flags;
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 08c63bcb2a7..736504d3a3e 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -104,15 +104,16 @@ typedef struct ProcArrayStruct
static void ProcArrayShmemInit(void *arg);
static void ProcArrayShmemAttach(void *arg);
+ProcArrayStruct *procArray = NULL;
+
static ShmemStructDesc ProcArrayShmemDesc = {
.name = "Proc Array",
.size = 0, /* dynamic */
.init_fn = ProcArrayShmemInit,
.attach_fn = ProcArrayShmemAttach,
+ .ptr = (void **) &procArray,
};
-#define procArray ((ProcArrayStruct *) ProcArrayShmemDesc.ptr)
-
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -290,22 +291,24 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
* Bookkeeping for tracking emulated transactions in recovery
*/
+TransactionId *KnownAssignedXids = NULL;
+
static ShmemStructDesc KnownAssignedXidsShmemDesc = {
.name = "KnownAssignedXids",
.size = 0, /* dynamic */
.init_fn = NULL,
+ .ptr = (void **) &KnownAssignedXids,
};
-#define KnownAssignedXids ((TransactionId *) KnownAssignedXidsShmemDesc.ptr)
+bool *KnownAssignedXidsValid = NULL;
static ShmemStructDesc KnownAssignedXidsValidShmemDesc = {
.name = "KnownAssignedXidsValid",
.size = 0, /* dynamic */
.init_fn = NULL,
+ .ptr = (void **) &KnownAssignedXidsValid,
};
-#define KnownAssignedXidsValid ((bool *) KnownAssignedXidsValidShmemDesc.ptr)
-
static TransactionId latestObservedXid = InvalidTransactionId;
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 5743f088324..eec04eae3f4 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -104,13 +104,15 @@ struct ProcSignalHeader
static void ProcSignalShmemInit(void *arg);
+ProcSignalHeader *ProcSignal = NULL;
+
static ShmemStructDesc ProcSignalShmemDesc = {
.name = "ProcSignal",
.size = 0, /* dynamic */
.init_fn = ProcSignalShmemInit,
+ .ptr = (void **) &ProcSignal,
};
-#define ProcSignal ((ProcSignalHeader *) ProcSignalShmemDesc.ptr)
static ProcSignalSlot *MyProcSignalSlot = NULL;
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index faa0fcbd21e..e73ac489b2b 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -118,17 +118,18 @@ static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
slock_t *ShmemLock; /* points to ShmemAllocator->shmem_lock */
+ /* primary index hashtable for shmem */
+HTAB *ShmemIndex = NULL;
+
static ShmemHashDesc ShmemIndexHashDesc = {
.name = "ShmemIndex",
.init_size = SHMEM_INDEX_SIZE,
.max_size = SHMEM_INDEX_SIZE,
+ .ptr = &ShmemIndex,
};
- /* primary index hashtable for shmem */
-#define ShmemIndex (ShmemIndexHashDesc.ptr)
-
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
@@ -205,7 +206,7 @@ ShmemInitRegistered(void)
result->allocated_size = allocated_size;
result->location = structPtr;
- registry[i]->ptr = structPtr;
+ *(registry[i]->ptr) = structPtr;
if (registry[i]->init_fn)
registry[i]->init_fn(registry[i]->init_fn_arg);
}
@@ -239,7 +240,7 @@ ShmemAttachRegistered(void)
registry[i]->name)));
}
- registry[i]->ptr = result->location;
+ *registry[i]->ptr = result->location;
if (registry[i]->attach_fn)
registry[i]->attach_fn(registry[i]->attach_fn_arg);
@@ -425,10 +426,11 @@ InitShmemIndex(void)
info.keysize = SHMEM_INDEX_KEYSIZE;
info.entrysize = sizeof(ShmemIndexEnt);
- ShmemIndex = ShmemInitHash("ShmemIndex",
+ *ShmemIndexHashDesc.ptr = ShmemInitHash("ShmemIndex",
SHMEM_INDEX_SIZE, SHMEM_INDEX_SIZE,
&info,
HASH_ELEM | HASH_STRINGS);
+ Assert(ShmemIndex != NULL && *ShmemIndexHashDesc.ptr == ShmemIndex);
}
/*
@@ -482,6 +484,12 @@ ShmemRegisterHash(ShmemHashDesc *desc, /* configuration */
desc->base_desc.init_fn_arg = desc;
desc->base_desc.attach_fn = shmem_hash_attach;
desc->base_desc.attach_fn_arg = desc;
+ /*
+ * We need a stable pointer to hold the pointer to the shared memory. Use
+ * the one passed in the descriptor now. It will be replaced with the hash
+ * table header by init or attach function.
+ */
+ desc->base_desc.ptr = (void **) desc->ptr;
desc->base_desc.extra_size = hash_estimate_size(desc->max_size, infoP->entrysize) - desc->base_desc.size;
@@ -499,10 +507,9 @@ shmem_hash_init(void *arg)
int hash_flags = desc->hash_flags;
/* Pass location of hashtable header to hash_create */
- desc->ptr = desc->base_desc.ptr;
- desc->infoP->hctl = (HASHHDR *) desc->ptr;
+ desc->infoP->hctl = (HASHHDR *) *desc->base_desc.ptr;
- desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
+ *desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
}
static void
@@ -518,9 +525,9 @@ shmem_hash_attach(void *arg)
hash_flags |= HASH_ATTACH;
/* Pass location of hashtable header to hash_create */
- desc->infoP->hctl = (HASHHDR *) desc->ptr;
+ desc->infoP->hctl = (HASHHDR *) *desc->base_desc.ptr;
- desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
+ *desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
}
/*
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index 0fe0f256971..8321bd9b52d 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -205,15 +205,16 @@ typedef struct SISeg
static void SharedInvalShmemInit(void *arg);
+/* pointer to the shared inval buffer */
+SISeg *shmInvalBuffer = NULL;
+
static ShmemStructDesc SharedInvalShmemDesc = {
.name = "shmInvalBuffer",
.size = 0, /* dynamic */
.init_fn = SharedInvalShmemInit,
+ .ptr = (void **) &shmInvalBuffer,
};
-/* pointer to the shared inval buffer */
-#define shmInvalBuffer ((SISeg *) SharedInvalShmemDesc.ptr)
-
static LocalTransactionId nextLocalTransactionId;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 85375b5195e..a3d6557aa9d 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -77,27 +77,33 @@ PGPROC *MyProc = NULL;
static void ProcGlobalShmemInit(void *arg);
+/* Pointers to shared-memory structures */
+PROC_HDR *ProcGlobal = NULL;
+void *tmpAllProcs = NULL;
+void *tmpFastPathLockArray = NULL;
+NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
+PGPROC *PreparedXactProcs = NULL;
+
+
static ShmemStructDesc ProcGlobalShmemDesc = {
.name = "Proc Header",
.size = sizeof(PROC_HDR),
.init_fn = ProcGlobalShmemInit,
+ .ptr = (void **) &ProcGlobal,
};
static ShmemStructDesc ProcGlobalAllProcsShmemDesc = {
.name = "PGPROC structures",
.size = 0, /* dynamic */
+ .ptr = (void **) &tmpAllProcs,
};
static ShmemStructDesc FastPathLockArrayShmemDesc = {
.name = "Fast-Path Lock Array",
.size = 0, /* dynamic */
+ .ptr = (void **) &tmpFastPathLockArray,
};
-/* Pointers to shared-memory structures */
-PROC_HDR *ProcGlobal = NULL;
-NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
-PGPROC *PreparedXactProcs = NULL;
-
static uint32 TotalProcs;
static DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
@@ -222,8 +228,7 @@ ProcGlobalShmemInit(void *arg)
Size fpLockBitsSize,
fpRelIdSize;
- ProcGlobal = ProcGlobalShmemDesc.ptr;
-
+ Assert(ProcGlobal);
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
dlist_init(&ProcGlobal->freeProcs);
dlist_init(&ProcGlobal->autovacFreeProcs);
@@ -236,7 +241,8 @@ ProcGlobalShmemInit(void *arg)
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
SpinLockInit(ProcStructLock);
- ptr = ProcGlobalAllProcsShmemDesc.ptr;
+ Assert(tmpAllProcs);
+ ptr = tmpAllProcs;
requestSize = ProcGlobalAllProcsShmemDesc.size;
memset(ptr, 0, requestSize);
@@ -274,7 +280,8 @@ ProcGlobalShmemInit(void *arg)
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- fpPtr = FastPathLockArrayShmemDesc.ptr;
+ Assert(tmpFastPathLockArray);
+ fpPtr = tmpFastPathLockArray;
requestSize = FastPathLockArrayShmemDesc.size;
memset(fpPtr, 0, requestSize);
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 49d476e9d5c..6e5a546f411 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -334,7 +334,7 @@ extern bool TransactionStartedDuringRecovery(void);
/* in transam/varsup.c */
#ifndef FRONTEND
extern PGDLLIMPORT struct ShmemStructDesc TransamVariablesShmemDesc;
-#define TransamVariables ((TransamVariablesData *) TransamVariablesShmemDesc.ptr)
+extern PGDLLIMPORT TransamVariablesData *TransamVariables;
#endif
/*
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 40e2fc17056..cbd4ef8d03f 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -50,8 +50,8 @@ typedef struct ShmemStructDesc
*/
size_t extra_size;
- /* Pointer to the shared memory area, when it's allocated. */
- void *ptr;
+ /* Pointer to the variable to which pointer to this shared memory area is assigned after allocation. */
+ void **ptr;
} ShmemStructDesc;
/*
@@ -67,7 +67,7 @@ typedef struct ShmemHashDesc
size_t max_size; /* max number of entries */
HASHCTL *infoP;
- HTAB *ptr;
+ HTAB **ptr;
ShmemStructDesc base_desc;
} ShmemHashDesc;
--
2.34.1
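
The patch above changes the descriptor's ptr field from holding the shared memory address itself to holding the address of a caller-owned, typed pointer variable, which the registry fills in after allocation. A minimal standalone sketch of that pattern follows. It is not the PostgreSQL code: the DemoStructDesc type, the demo_* functions and the use of malloc() in place of shared memory are all hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ShmemStructDesc; not the PostgreSQL definition. */
typedef struct DemoStructDesc
{
	const char *name;
	size_t		size;
	void		(*init_fn) (void *arg);
	void	  **ptr;			/* address of the caller's typed pointer */
} DemoStructDesc;

#define DEMO_MAX_REGISTRATIONS 8
DemoStructDesc *demo_registry[DEMO_MAX_REGISTRATIONS];
int			demo_num_registrations = 0;

/* Example structure plus the ordinary typed pointer that callers use. */
typedef struct DemoCounters
{
	int			hits;
	int			misses;
} DemoCounters;

DemoCounters *demo_counters = NULL;

/* By the time init_fn runs, the registry has already set demo_counters. */
void
demo_counters_init(void *arg)
{
	(void) arg;
	assert(demo_counters != NULL);
	memset(demo_counters, 0, sizeof(DemoCounters));
}

/*
 * The cast mirrors the one used in the patch; like the patch, it assumes
 * that all object pointers share one representation.
 */
DemoStructDesc demo_counters_desc = {
	.name = "demo counters",
	.size = sizeof(DemoCounters),
	.init_fn = demo_counters_init,
	.ptr = (void **) &demo_counters,
};

/* Remember the descriptor; nothing is allocated yet. */
void
demo_register(DemoStructDesc *desc)
{
	assert(demo_num_registrations < DEMO_MAX_REGISTRATIONS);
	demo_registry[demo_num_registrations++] = desc;
}

/*
 * Allocate every registered area (malloc stands in for shared memory here),
 * store its address through desc->ptr, then run the init callback.
 * Returns 0 on success, -1 if an allocation fails.
 */
int
demo_init_registered(void)
{
	for (int i = 0; i < demo_num_registrations; i++)
	{
		DemoStructDesc *desc = demo_registry[i];
		void	   *area = malloc(desc->size);

		if (area == NULL)
			return -1;
		*desc->ptr = area;
		if (desc->init_fn)
			desc->init_fn(NULL);
	}
	return 0;
}
```

Calling demo_register(&demo_counters_desc) followed by demo_init_registered() leaves demo_counters pointing at the zeroed area. Code can then use the plain variable directly instead of a #define over the descriptor, which is the substitution the patch makes for pgss, procArray, ProcSignal and the other pointers.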
[text/x-patch] 0003-WIP-resizable-shared-memory-structures-20260223.patch (38.9K, 4-0003-WIP-resizable-shared-memory-structures-20260223.patch)
From 1f88d0bab7bb6b3ae6d9ecc573c7d6a621f03d2c Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Tue, 17 Feb 2026 16:51:20 +0530
Subject: [PATCH 3/3] WIP: resizable shared memory structures
---
doc/src/sgml/system-views.sgml | 20 +-
src/backend/port/sysv_shmem.c | 69 +++++
src/backend/port/win32_shmem.c | 49 +++
src/backend/storage/ipc/shmem.c | 156 ++++++++--
src/include/catalog/pg_proc.dat | 4 +-
src/include/storage/pg_shmem.h | 3 +
src/include/storage/shmem.h | 10 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/resizable_shmem/Makefile | 23 ++
src/test/modules/resizable_shmem/meson.build | 36 +++
.../resizable_shmem/resizable_shmem--1.0.sql | 37 +++
.../modules/resizable_shmem/resizable_shmem.c | 281 ++++++++++++++++++
.../resizable_shmem/resizable_shmem.control | 5 +
.../resizable_shmem/t/001_resizable_shmem.pl | 118 ++++++++
src/test/regress/expected/rules.out | 5 +-
16 files changed, 789 insertions(+), 29 deletions(-)
create mode 100644 src/test/modules/resizable_shmem/Makefile
create mode 100644 src/test/modules/resizable_shmem/meson.build
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.c
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.control
create mode 100644 src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8b4abef8c68..533eff3d6cb 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4243,8 +4243,24 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
Size of the allocation in bytes including padding. For anonymous
allocations, no information about padding is available, so the
<literal>size</literal> and <literal>allocated_size</literal> columns
- will always be equal. Padding is not meaningful for free memory, so
- the columns will be equal in that case also.
+ will always be equal. Padding is not meaningful for free memory, so the
+ columns will be equal in that case also. For resizable allocations which
+ may span multiple memory pages, the padding includes the padding due to
+ page alignment.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>maximum_size</structfield> <type>int8</type>
+ </para>
+ <para>
+ Maximum size in bytes, including padding, to which a resizable
+ allocation can grow. For anonymous allocations, no
+ information about maximum size is available, so the
+ <literal>size</literal> and <literal>maximum_size</literal> columns will
+ always be equal. Maximum size is not meaningful for free memory, so the
+ columns will be equal in that case also.
</para></entry>
</row>
</tbody>
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 2e3886cf9fe..67a39e97007 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -589,6 +589,27 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
return true;
}
+/*
+ * Get the page size being used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been set up.
+ */
+Size
+GetOSPageSize(void)
+{
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ os_page_size = sysconf(_SC_PAGESIZE);
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
+
/*
* Creates an anonymous mmap()ed shared memory segment.
*
@@ -991,3 +1012,51 @@ PGSharedMemoryDetach(void)
AnonymousShmem = NULL;
}
}
+
+/*
+ * Make sure that the memory of given size from the given address is released.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureFreed(void *addr, Size size)
+{
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("releasing shared memory is supported only in anonymous mappings")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+ Assert(size > 0);
+
+ if (madvise(addr, size, MADV_REMOVE) == -1)
+ ereport(ERROR,
+ (errmsg("could not release shared memory: %m")));
+}
+
+/*
+ * Make sure that the memory of given size from the given address is allocated.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureAllocated(void *addr, Size size)
+{
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("allocating shared memory is supported only in anonymous mappings")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+ Assert(size > 0);
+
+ if (madvise(addr, size, MADV_POPULATE_WRITE) == -1)
+ ereport(ERROR,
+ (errmsg("could not allocate shared memory: %m")));
+}
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index 794e4fcb2ad..afbbd0da8da 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -621,6 +621,32 @@ pgwin32_ReserveSharedMemoryRegion(HANDLE hChild)
return true;
}
+/*
+ * Make sure that the memory of given size from the given address is released.
+ *
+ * Not supported on Windows currently.
+ */
+void
+PGSharedMemoryEnsureFreed(void *addr, Size size)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("releasing part of shared memory is not supported on Windows")));
+}
+
+/*
+ * Make sure that the memory of given size from the given address is allocated.
+ *
+ * Not supported on Windows currently.
+ */
+void
+PGSharedMemoryEnsureAllocated(void *addr, Size size)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("allocating shared memory is not supported on Windows")));
+}
+
/*
* This function is provided for consistency with sysv_shmem.c and does not
* provide any useful information for Windows. To obtain the large page size,
@@ -648,3 +674,26 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
}
return true;
}
+
+/*
+ * Get the page size used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been set up.
+ */
+Size
+GetOSPageSize(void)
+{
+ SYSTEM_INFO sysinfo;
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ GetSystemInfo(&sysinfo);
+ os_page_size = sysinfo.dwPageSize;
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index e73ac489b2b..4ecb354fd06 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -106,6 +106,7 @@ typedef struct ShmemAllocatorData
} ShmemAllocatorData;
static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void ShmemSetAllocatedSize(ShmemIndexEnt *entry);
static void shmem_hash_init(void *arg);
static void shmem_hash_attach(void *arg);
@@ -143,8 +144,20 @@ ShmemRegisterStruct(ShmemStructDesc *desc)
elog(DEBUG2, "REGISTER: %s with size %zd", desc->name, desc->size);
registry[num_registrations++] = desc;
+
+ if (desc->max_size > 0)
+ elog(DEBUG2, "RESIZABLE structure: %s has max_size %zu", desc->name, desc->max_size);
}
+/*
+ * Calculate the total size of shared memory required for the registered
+ * structures.
+ *
+ * A resizable structure needs contiguous address space for its specified
+ * maximum size, so count max_size for it. That much address space is expected
+ * to be reserved; the memory actually allocated at the beginning covers only
+ * the total of the initial sizes of all the structures.
+ */
size_t
ShmemRegisteredSize(void)
{
@@ -153,7 +166,7 @@ ShmemRegisteredSize(void)
size = 0;
for (int i = 0; i < num_registrations; i++)
{
- size = add_size(size, registry[i]->size);
+ size = add_size(size, registry[i]->max_size > 0 ? registry[i]->max_size : registry[i]->size);
size = add_size(size, registry[i]->extra_size);
}
@@ -162,6 +175,43 @@ ShmemRegisteredSize(void)
return size;
}
+/*
+ * Set the allocated_size of given structure.
+ */
+static void
+ShmemSetAllocatedSize(ShmemIndexEnt *entry)
+{
+ Size page_size = GetOSPageSize();
+
+ char *align_end = (char *) TYPEALIGN(page_size, (char *) entry->location + entry->size);
+ char *floor_max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) entry->location + entry->maximum_size);
+
+ if (align_end >= floor_max_end)
+ {
+ /*
+ * A fixed-size structure or a resizable structure whose maximal size
+ * ends on the same page as its initial size. In either case, the
+ * structure will be allocated in its entirety at the beginning and there
+ * is no need to allocate additional memory for it when it grows. So, set
+ * allocated_size to maximum_size.
+ */
+ entry->allocated_size = entry->maximum_size;
+ }
+ else
+ {
+ /*
+ * The structure at its maximum size spans multiple pages. Initially only
+ * the pages up to the one where its current size ends, and the page where
+ * the next structure starts, will be allocated.
+ */
+ entry->allocated_size = entry->maximum_size - (floor_max_end - align_end);
+ }
+}
+
+
+/*
+ * Allocate memory for the registered shared structures and initialize them.
+ */
void
ShmemInitRegistered(void)
{
@@ -170,10 +220,11 @@ ShmemInitRegistered(void)
for (int i = 0; i < num_registrations; i++)
{
- size_t allocated_size;
+ Size max_alloc_size;
void *structPtr;
bool found;
ShmemIndexEnt *result;
+ Size struct_size;
elog(DEBUG2, "INIT [%d/%d]: %s", i, num_registrations, registry[i]->name);
@@ -190,8 +241,14 @@ ShmemInitRegistered(void)
if (found)
elog(ERROR, "shmem struct \"%s\" is already initialized", registry[i]->name);
- /* allocate and initialize it */
- structPtr = ShmemAllocRaw(registry[i]->size, &allocated_size);
+ /*
+ * Allocate space for the structure in the shared memory. The memory
+ * allocation happens as the corresponding pages are written to. For a
+ * resizable structure allocate enough space for it to grow to its
+ * maximum size, not just worth its initial size.
+ */
+ struct_size = registry[i]->max_size > 0 ? registry[i]->max_size : registry[i]->size;
+ structPtr = ShmemAllocRaw(struct_size, &max_alloc_size);
if (structPtr == NULL)
{
/* out of memory; remove the failed ShmemIndex entry */
@@ -202,9 +259,10 @@ ShmemInitRegistered(void)
" \"%s\" (%zu bytes requested)",
registry[i]->name, registry[i]->size)));
}
- result->size = registry[i]->size;
- result->allocated_size = allocated_size;
result->location = structPtr;
+ result->size = registry[i]->size;
+ result->maximum_size = max_alloc_size;
+ ShmemSetAllocatedSize(result);
*(registry[i]->ptr) = structPtr;
if (registry[i]->init_fn)
@@ -212,6 +270,62 @@ ShmemInitRegistered(void)
}
}
+void
+ShmemResizeRegistered(const char *name, Size new_size)
+{
+ ShmemIndexEnt *result;
+ bool found;
+ Size page_size = GetOSPageSize();
+ char *new_end;
+
+ Assert(new_size > 0);
+
+ /* look it up in the shmem index */
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ result = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, name, HASH_FIND, &found);
+ if (!found)
+ elog(ERROR, "shmem struct \"%s\" is not initialized", name);
+
+ Assert(result);
+
+ if (result->maximum_size < new_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+ errmsg("not enough address space is reserved for resizing structure \"%s\"", name)));
+
+
+ /*
+ * When shrinking, the memory from the page-aligned new end to the start of
+ * the page containing the end of the reserved space is no longer required.
+ * When expanding, the memory from the start of the page containing the start
+ * of the structure to the page-aligned new end is required.
+ */
+ new_end = (char *) TYPEALIGN(page_size, (char *) result->location + new_size);
+ if (new_size < result->size)
+ {
+ char *max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location + result->maximum_size);
+ Size free_size = max_end - new_end;
+
+ if (free_size > 0)
+ PGSharedMemoryEnsureFreed(new_end, free_size);
+ }
+ else if (new_size > result->size)
+ {
+ char *struct_start = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location);
+ Size alloc_size = new_end - struct_start;
+
+ if (alloc_size > 0)
+ PGSharedMemoryEnsureAllocated(struct_start, alloc_size);
+ }
+
+ /* Update shmem index entry. */
+ result->size = new_size;
+ ShmemSetAllocatedSize(result);
+
+ LWLockRelease(ShmemIndexLock);
+}
+
#ifdef EXEC_BACKEND
void
ShmemAttachRegistered(void)
@@ -701,6 +815,7 @@ ShmemInitStruct(const char *name, Size size, bool *foundPtr)
result->size = size;
result->allocated_size = allocated_size;
result->location = structPtr;
+ result->maximum_size = allocated_size;
}
LWLockRelease(ShmemIndexLock);
@@ -747,7 +862,7 @@ mul_size(Size s1, Size s2)
Datum
pg_get_shmem_allocations(PG_FUNCTION_ARGS)
{
-#define PG_GET_SHMEM_SIZES_COLS 4
+#define PG_GET_SHMEM_SIZES_COLS 5
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
HASH_SEQ_STATUS hstat;
ShmemIndexEnt *ent;
@@ -769,7 +884,14 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
values[1] = Int64GetDatum((char *) ent->location - (char *) ShmemSegHdr);
values[2] = Int64GetDatum(ent->size);
values[3] = Int64GetDatum(ent->allocated_size);
- named_allocated += ent->allocated_size;
+ values[4] = Int64GetDatum(ent->maximum_size);
+
+ /*
+ * Resizable structures are allocated address space up to their maximum
+ * size, and that is what we count here: allocated address space. For
+ * fixed-size structures, allocated_size is the same as maximum_size.
+ */
+ named_allocated += ent->maximum_size;
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
values, nulls);
@@ -780,6 +902,7 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = true;
values[2] = Int64GetDatum(ShmemAllocator->free_offset - named_allocated);
values[3] = values[2];
+ values[4] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
/* output as-of-yet unused shared memory */
@@ -788,6 +911,7 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = false;
values[2] = Int64GetDatum(ShmemSegHdr->totalsize - ShmemAllocator->free_offset);
values[3] = values[2];
+ values[4] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
LWLockRelease(ShmemIndexLock);
@@ -975,23 +1099,9 @@ pg_get_shmem_allocations_numa(PG_FUNCTION_ARGS)
Size
pg_get_shmem_pagesize(void)
{
- Size os_page_size;
-#ifdef WIN32
- SYSTEM_INFO sysinfo;
-
- GetSystemInfo(&sysinfo);
- os_page_size = sysinfo.dwPageSize;
-#else
- os_page_size = sysconf(_SC_PAGESIZE);
-#endif
-
Assert(IsUnderPostmaster);
- Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
-
- if (huge_pages_status == HUGE_PAGES_ON)
- GetHugePageSize(&os_page_size, NULL);
- return os_page_size;
+ return GetOSPageSize();
}
Datum
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 83f6501df38..fbf5749bca7 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8592,8 +8592,8 @@
{ oid => '5052', descr => 'allocations from the main shared memory segment',
proname => 'pg_get_shmem_allocations', prorows => '50', proretset => 't',
provolatile => 'v', prorettype => 'record', proargtypes => '',
- proallargtypes => '{text,int8,int8,int8}', proargmodes => '{o,o,o,o}',
- proargnames => '{name,off,size,allocated_size}',
+ proallargtypes => '{text,int8,int8,int8,int8}', proargmodes => '{o,o,o,o,o}',
+ proargnames => '{name,off,size,allocated_size,maximum_size}',
prosrc => 'pg_get_shmem_allocations' },
{ oid => '4099', descr => 'Is NUMA support available?',
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index 10c7b065861..f0efbf2aec1 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -89,6 +89,9 @@ extern PGShmemHeader *PGSharedMemoryCreate(Size size,
PGShmemHeader **shim);
extern bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2);
extern void PGSharedMemoryDetach(void);
+extern void PGSharedMemoryEnsureFreed(void *addr, Size size);
+extern void PGSharedMemoryEnsureAllocated(void *addr, Size size);
extern void GetHugePageSize(Size *hugepagesize, int *mmap_flags);
+extern Size GetOSPageSize(void);
#endif /* PG_SHMEM_H */
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index cbd4ef8d03f..f25b60b5f42 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -50,6 +50,14 @@ typedef struct ShmemStructDesc
*/
size_t extra_size;
+ /*
+ * Maximum size this structure can grow to in the future. The memory is not
+ * allocated right away but the corresponding address space is reserved so
+ * that memory can be mapped to it when the structure grows. Typically
+ * should be used for resizable structures which need contiguous memory.
+ */
+ size_t max_size;
+
/* Pointer to the variable to which pointer to this shared memory area is assigned after allocation. */
void **ptr;
} ShmemStructDesc;
@@ -84,6 +92,7 @@ extern void InitShmemIndex(void);
extern void ShmemRegisterHash(ShmemHashDesc *desc, HASHCTL *infoP, int hash_flags);
extern void ShmemRegisterStruct(ShmemStructDesc *desc);
+extern void ShmemResizeRegistered(const char *name, Size new_size);
/* Legacy functions */
extern HTAB *ShmemInitHash(const char *name, int64 init_size, int64 max_size,
@@ -115,6 +124,7 @@ typedef struct
void *location; /* location in shared mem */
Size size; /* # bytes requested for the structure */
Size allocated_size; /* # bytes actually allocated */
+ Size maximum_size; /* maximum size this structure can grow to */
} ShmemIndexEnt;
#endif /* SHMEM_H */
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 44c7163c1cd..a5df6edae18 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -14,6 +14,7 @@ SUBDIRS = \
libpq_pipeline \
oauth_validator \
plsample \
+ resizable_shmem \
spgist_name_ops \
test_aio \
test_binaryheap \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..961bb62759d 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -13,6 +13,7 @@ subdir('libpq_pipeline')
subdir('nbtree')
subdir('oauth_validator')
subdir('plsample')
+subdir('resizable_shmem')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
diff --git a/src/test/modules/resizable_shmem/Makefile b/src/test/modules/resizable_shmem/Makefile
new file mode 100644
index 00000000000..f3bd8ac0c7f
--- /dev/null
+++ b/src/test/modules/resizable_shmem/Makefile
@@ -0,0 +1,23 @@
+# src/test/modules/resizable_shmem/Makefile
+
+MODULES = resizable_shmem
+TAP_TESTS = 1
+
+EXTENSION = resizable_shmem
+DATA = resizable_shmem--1.0.sql
+PGFILEDESC = "resizable_shmem - test module for resizable shared memory"
+
+# This test requires the library to be loaded at server start, so disable
+# installcheck
+NO_INSTALLCHECK = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/resizable_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/src/makefiles/pgxs.mk
+endif
diff --git a/src/test/modules/resizable_shmem/meson.build b/src/test/modules/resizable_shmem/meson.build
new file mode 100644
index 00000000000..493bbbc95c3
--- /dev/null
+++ b/src/test/modules/resizable_shmem/meson.build
@@ -0,0 +1,36 @@
+# src/test/modules/resizable_shmem/meson.build
+
+resizable_shmem_sources = files(
+ 'resizable_shmem.c',
+)
+
+if host_system == 'windows'
+ resizable_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'resizable_shmem',
+ '--FILEDESC', 'resizable_shmem - test module for resizable shared memory',])
+endif
+
+resizable_shmem = shared_module('resizable_shmem',
+ resizable_shmem_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += resizable_shmem
+
+test_install_data += files(
+ 'resizable_shmem.control',
+ 'resizable_shmem--1.0.sql',
+)
+
+tests += {
+ 'name': 'resizable_shmem',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_resizable_shmem.pl',
+ ],
+ # This test requires the library to be loaded at server start, so disable
+ # installcheck
+ 'runningcheck': false,
+ },
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
new file mode 100644
index 00000000000..c1bcb6117b6
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
@@ -0,0 +1,37 @@
+/* src/test/modules/resizable_shmem/resizable_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION resizable_shmem" to load this file. \quit
+
+-- Function to resize the test structure in the shared memory
+CREATE FUNCTION resizable_shmem_resize(new_entries integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to write data to all entries in the test structure in shared memory
+-- Writing all the entries makes sure that the memory is actually allocated and
+-- mapped to the process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_write(entry_value integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to verify that specified number of initial entries have expected value.
+-- Reading all the entries makes sure that the memory is actually mapped to the
+-- process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_read(entry_count integer, entry_value integer)
+RETURNS boolean
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to report memory usage statistics of the calling backend
+CREATE FUNCTION resizable_shmem_usage(OUT rss_anon bigint, OUT rss_file bigint, OUT rss_shmem bigint, OUT vm_size bigint)
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to get the shared memory page size
+CREATE FUNCTION resizable_shmem_pagesize()
+RETURNS integer
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.c b/src/test/modules/resizable_shmem/resizable_shmem.c
new file mode 100644
index 00000000000..15f02e3f8ff
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.c
@@ -0,0 +1,281 @@
+/* -------------------------------------------------------------------------
+ *
+ * resizable_shmem.c
+ * Test module for PostgreSQL's resizable shared memory functionality
+ *
+ * This module demonstrates and tests the resizable shared memory API
+ * provided by shmem.c/shmem.h.
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+#include "storage/spin.h"
+#include "utils/builtins.h"
+#include "utils/guc.h"
+#include "utils/memutils.h"
+#include "utils/timestamp.h"
+#include "access/htup_details.h"
+
+#include <stdio.h>
+
+PG_MODULE_MAGIC;
+
+/*
+ * Default amount of shared buffers and hence the amount of shared memory
+ * allocated by default is in hundreds of MBs. The memory allocated to the test
+ * structure will be noticeable only when it's in the same order.
+ */
+#define TEST_INITIAL_ENTRIES (25 * 1024 * 1024) /* Initial number of entries (100MB) */
+#define TEST_MAX_ENTRIES (100 * 1024 * 1024) /* Maximum number of entries (400MB, 4x initial) */
+#define TEST_ENTRY_SIZE sizeof(int32) /* Size of each entry */
+
+/*
+ * Resizable test data structure stored in shared memory.
+ *
+ * We do not use any locks. To keep the code and the test simple, the test
+ * never performs resizes, reads or writes concurrently.
+ */
+typedef struct TestResizableShmemStruct
+{
+ /* Metadata */
+ int32 num_entries; /* Number of entries that can fit */
+
+ /* Data area - variable size */
+ int32 data[FLEXIBLE_ARRAY_MEMBER];
+} TestResizableShmemStruct;
+
+/* Global pointer to our shared memory structure */
+static TestResizableShmemStruct *resizable_shmem = NULL;
+
+static void resizable_shmem_shmem_init(void *arg);
+
+static ShmemStructDesc testShmemDesc = {
+ .name = "resizable_shmem",
+ .size = offsetof(TestResizableShmemStruct, data) + (TEST_INITIAL_ENTRIES * TEST_ENTRY_SIZE),
+ .max_size = offsetof(TestResizableShmemStruct, data) + (TEST_MAX_ENTRIES * TEST_ENTRY_SIZE),
+ .alignment = MAXIMUM_ALIGNOF,
+ .init_fn = resizable_shmem_shmem_init,
+ .ptr = (void **) &resizable_shmem,
+};
+
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+
+static void resizable_shmem_request(void);
+
+/* SQL-callable functions */
+PG_FUNCTION_INFO_V1(resizable_shmem_resize);
+PG_FUNCTION_INFO_V1(resizable_shmem_write);
+PG_FUNCTION_INFO_V1(resizable_shmem_read);
+PG_FUNCTION_INFO_V1(resizable_shmem_usage);
+PG_FUNCTION_INFO_V1(resizable_shmem_pagesize);
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+ /*
+ * The module needs to be loaded via shared_preload_libraries to register
+ * shared memory structure. But if that's not the case, don't throw an error.
+ * The SQL functions check for existence of the shared memory data structure.
+ */
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+#ifdef EXEC_BACKEND
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable_shmem is not supported in EXEC_BACKEND builds")));
+#endif
+
+ /* Install hook to register shared memory structure. */
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = resizable_shmem_request;
+}
+
+/*
+ * Request shared memory resources
+ */
+static void
+resizable_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ /* Register our resizable shared memory structure */
+ ShmemRegisterStruct(&testShmemDesc);
+}
+
+/*
+ * Initialize shared memory structure
+ */
+static void
+resizable_shmem_shmem_init(void *arg)
+{
+ /*
+ * Shared memory structure should have been allocated with the requested
+ * size. Initialize the metadata.
+ */
+ Assert(resizable_shmem != NULL);
+ resizable_shmem->num_entries = TEST_INITIAL_ENTRIES;
+ memset(resizable_shmem->data, 0, TEST_INITIAL_ENTRIES * TEST_ENTRY_SIZE);
+}
+
+/*
+ * Resize the shared memory structure to accommodate the specified number of
+ * entries.
+ */
+Datum
+resizable_shmem_resize(PG_FUNCTION_ARGS)
+{
+#ifndef EXEC_BACKEND
+ int32 new_entries = PG_GETARG_INT32(0);
+ Size new_size;
+
+ if (!resizable_shmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ new_size = offsetof(TestResizableShmemStruct, data) + (new_entries * TEST_ENTRY_SIZE);
+ ShmemResizeRegistered(testShmemDesc.name, new_size);
+ resizable_shmem->num_entries = new_entries;
+
+ PG_RETURN_VOID();
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizing shared memory is not supported in EXEC_BACKEND builds")));
+#endif
+}
+
+/*
+ * Write the given integer value to all entries in the data array.
+ */
+Datum
+resizable_shmem_write(PG_FUNCTION_ARGS)
+{
+ int32 entry_value = PG_GETARG_INT32(0);
+ int32 i;
+
+ if (!resizable_shmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ /* Write the value to all current entries */
+ for (i = 0; i < resizable_shmem->num_entries; i++)
+ resizable_shmem->data[i] = entry_value;
+
+ PG_RETURN_VOID();
+}
+
+/*
+ * Check whether the first 'entry_count' entries all have the expected 'entry_value'.
+ * Returns true if all match, false otherwise.
+ */
+Datum
+resizable_shmem_read(PG_FUNCTION_ARGS)
+{
+ int32 entry_count = PG_GETARG_INT32(0);
+ int32 entry_value = PG_GETARG_INT32(1);
+ int32 i;
+
+ if (resizable_shmem == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ /* Validate entry_count */
+ if (entry_count < 0 || entry_count > resizable_shmem->num_entries)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("entry_count %d is out of range (0..%d)", entry_count, resizable_shmem->num_entries)));
+
+ /* Check if first entry_count entries have the expected value */
+ for (i = 0; i < entry_count; i++)
+ {
+ if (resizable_shmem->data[i] != entry_value)
+ PG_RETURN_BOOL(false);
+ }
+
+ PG_RETURN_BOOL(true);
+}
+
+/*
+ * Report multiple memory usage statistics of the calling backend process
+ * as reported by the kernel.
+ * Returns RssAnon, RssFile, RssShmem, VmSize from /proc/self/status as a record.
+ *
+ * TODO: See TODO note in SQL definition of this function.
+ */
+Datum
+resizable_shmem_usage(PG_FUNCTION_ARGS)
+{
+ FILE *f;
+ char line[256];
+ int64 rss_anon_kb = -1;
+ int64 rss_file_kb = -1;
+ int64 rss_shmem_kb = -1;
+ int64 vm_size_kb = -1;
+ int found = 0;
+ TupleDesc tupdesc;
+ Datum values[4];
+ bool nulls[4];
+ HeapTuple tuple;
+
+ /* Open /proc/self/status to read memory information */
+ f = fopen("/proc/self/status", "r");
+ if (f == NULL)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open /proc/self/status: %m")));
+
+ /* Look for the memory usage lines */
+ while (fgets(line, sizeof(line), f) != NULL && found < 4)
+ {
+ if (rss_anon_kb == -1 && sscanf(line, "RssAnon: %ld kB", &rss_anon_kb) == 1)
+ found++;
+ else if (rss_file_kb == -1 && sscanf(line, "RssFile: %ld kB", &rss_file_kb) == 1)
+ found++;
+ else if (rss_shmem_kb == -1 && sscanf(line, "RssShmem: %ld kB", &rss_shmem_kb) == 1)
+ found++;
+ else if (vm_size_kb == -1 && sscanf(line, "VmSize: %ld kB", &vm_size_kb) == 1)
+ found++;
+ }
+
+ fclose(f);
+
+ /* Build tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept a record")));
+
+ /* Build the result tuple */
+ values[0] = Int64GetDatum(rss_anon_kb >= 0 ? rss_anon_kb * 1024 : 0);
+ values[1] = Int64GetDatum(rss_file_kb >= 0 ? rss_file_kb * 1024 : 0);
+ values[2] = Int64GetDatum(rss_shmem_kb >= 0 ? rss_shmem_kb * 1024 : 0);
+ values[3] = Int64GetDatum(vm_size_kb >= 0 ? vm_size_kb * 1024 : 0);
+
+ nulls[0] = nulls[1] = nulls[2] = nulls[3] = false;
+
+ tuple = heap_form_tuple(tupdesc, values, nulls);
+ PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
+}
+
+/*
+ * resizable_shmem_pagesize() - Get the shared memory page size
+ */
+Datum
+resizable_shmem_pagesize(PG_FUNCTION_ARGS)
+{
+ PG_RETURN_INT32(pg_get_shmem_pagesize());
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.control b/src/test/modules/resizable_shmem/resizable_shmem.control
new file mode 100644
index 00000000000..1ce2c5ea21a
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.control
@@ -0,0 +1,5 @@
+# resizable_shmem extension test module
+comment = 'test module for testing resizable shared memory structure functionality'
+default_version = '1.0'
+module_pathname = '$libdir/resizable_shmem'
+relocatable = true
diff --git a/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
new file mode 100644
index 00000000000..d0a4b504d8e
--- /dev/null
+++ b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
@@ -0,0 +1,118 @@
+#!/usr/bin/perl
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Test resizable shared memory functionality
+# This converts the isolation test resizable_shmem.spec into a TAP test
+
+my $node = PostgreSQL::Test::Cluster->new('resizable_shmem');
+
+# Need to configure for resizable_shmem
+$node->init;
+$node->append_conf('postgresql.conf', 'shared_preload_libraries = resizable_shmem');
+$node->start;
+
+# Create extension
+$node->safe_psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+
+# Query string variables for reuse
+my $rss_usage_query = 'SELECT rss_shmem FROM resizable_shmem_usage();';
+my $alloc_size_query = "SELECT allocated_size FROM pg_shmem_allocations WHERE name = 'resizable_shmem';";
+# Currently only one structure is resizable
+my $fixed_struct_query = "SELECT count(*) FROM pg_shmem_allocations WHERE name <> 'resizable_shmem' and allocated_size <> maximum_size;";
+
+my $page_size = $node->safe_psql('postgres', "SELECT resizable_shmem_pagesize();");
+
+# Create background sessions for testing
+my $session1 = $node->background_psql('postgres');
+my $session2 = $node->background_psql('postgres');
+
+my $num_entries = 25 * 1024 * 1024; # Initial number of entries in resizable shared memory
+my $max_entries = 100 * 1024 * 1024; # Maximum number of entries allowed
+my $entry_size = 4; # each entry is int32
+my $prev_shmem_usage1 = $session1->query_safe($rss_usage_query, verbose => 0);
+my $prev_shmem_usage2 = $session2->query_safe($rss_usage_query, verbose => 0);
+my $prev_alloc_size;
+
+# We need to make sure that the changes to shared memory allocated are
+# proportionate to the changes in the resizable shared memory structure. But
+# there is no way to know the shared memory allocated at the given address in a
+# given process. We can only know the size of shared memory accessed by a
+# given process. In case of PostgreSQL, that includes the memory allocated to
+# other shared memory structures as well. Instead, we just note the changes in
+# the function below to help in debugging overallocation issues.
+sub note_shmem_changes
+{
+ my ($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size) = @_;
+
+ my $shmem_usage1 = $session1->query_safe($rss_usage_query, verbose => 0);
+ my $shmem_usage2 = $session2->query_safe($rss_usage_query, verbose => 0);
+ my $alloc_size = $node->safe_psql('postgres', $alloc_size_query, verbose => 0);
+
+ note "changes in allocated size: " . ($alloc_size - $prev_alloc_size);
+ note "Session 1: changes in rss_shmem usage: " . ($shmem_usage1 - $prev_shmem_usage1);
+ note "Session 1: difference in rss_shmem change and allocated size change: " . (($shmem_usage1 - $prev_shmem_usage1) - ($alloc_size - $prev_alloc_size));
+ note "Session 2: changes in rss_shmem usage: " . ($shmem_usage2 - $prev_shmem_usage2);
+ note "Session 2: difference in rss_shmem change and allocated size change: " . (($shmem_usage2 - $prev_shmem_usage2) - ($alloc_size - $prev_alloc_size));
+
+ return ($shmem_usage1, $shmem_usage2, $alloc_size);
+}
+
+my $value = 100;
+# Write and read the initial set of entries.
+$session1->query_safe("SELECT resizable_shmem_write($value);", verbose => 0);
+is($session2->query_safe("SELECT resizable_shmem_read($num_entries, $value);", verbose => 0), 't', 'data read after write successful');
+($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size) = note_shmem_changes($prev_shmem_usage1, $prev_shmem_usage2, 0);
+is($node->safe_psql('postgres', $fixed_struct_query), '0', 'initial fixed sized structures');
+
+# Resize to maximum
+my $old_num_entries = $num_entries;
+$num_entries = $max_entries;
+$session1->query_safe("SELECT resizable_shmem_resize($num_entries);", verbose => 0);
+# Old data after resize should still be intact
+is($session1->query_safe("SELECT resizable_shmem_read($old_num_entries, $value);", verbose => 0), 't', 'initial data readable after resize');
+$value = 500;
+$session2->query_safe("SELECT resizable_shmem_write($value);", verbose => 0);
+is($session1->query_safe("SELECT resizable_shmem_read($num_entries, $value);", verbose => 0), 't', 'enlarged area data read successful');
+($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size) = note_shmem_changes($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size);
+is($node->safe_psql('postgres', $fixed_struct_query), '0', 'fixed sized structures after resize to maximum');
+
+# Shrink to a smaller size
+$old_num_entries = $num_entries;
+$num_entries = 75 * 1024 * 1024;
+$session2->query_safe("SELECT resizable_shmem_resize($num_entries);", verbose => 0);
+# Old values should remain intact in the shrunk area
+is($session1->query_safe("SELECT resizable_shmem_read($num_entries, $value);", verbose => 0), 't', 'data readable after shrinking');
+$value = 999;
+$session1->query_safe("SELECT resizable_shmem_write($value);", verbose => 0);
+is($session2->query_safe("SELECT resizable_shmem_read($num_entries, $value);", verbose => 0), 't', 'new data readable in shrunken area');
+($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size) = note_shmem_changes($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size);
+is($node->safe_psql('postgres', $fixed_struct_query), '0', 'fixed sized structures after shrinking');
+
+# Resize to the same size
+$session2->query_safe("SELECT resizable_shmem_resize($num_entries);", verbose => 0);
+# Old values should remain intact after a same-size resize
+is($session1->query_safe("SELECT resizable_shmem_read($num_entries, $value);", verbose => 0), 't', 'data readable after resizing to the same size');
+$value = 1999;
+$session1->query_safe("SELECT resizable_shmem_write($value);", verbose => 0);
+is($session2->query_safe("SELECT resizable_shmem_read($num_entries, $value);", verbose => 0), 't', 'new data readable after resizing to the same size');
+($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size) = note_shmem_changes($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size);
+is($node->safe_psql('postgres', $fixed_struct_query), '0', 'fixed sized structures at the end');
+
+# Test resize failure (attempt to resize beyond max - should fail)
+my ($ret, $stdout, $stderr) = $node->psql('postgres', "SELECT resizable_shmem_resize(" . ($max_entries * 2) . ");");
+ok($ret != 0 || $stderr =~ /ERROR/, 'Resize beyond maximum fails');
+
+# Cleanup sessions
+$session1->quit;
+$session2->quit;
+
+# Cleanup
+$node->stop;
+
+done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f4ee2bd7459..0942cc2f771 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1770,8 +1770,9 @@ pg_shadow| SELECT pg_authid.rolname AS usename,
pg_shmem_allocations| SELECT name,
off,
size,
- allocated_size
- FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size);
+ allocated_size,
+ maximum_size
+ FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size, maximum_size);
pg_shmem_allocations_numa| SELECT name,
numa_node,
size
--
2.34.1
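A side note on resizable_shmem_usage() in the patch above: it scans int64 variables with "%ld", which is only correct where int64 happens to be long. Below is a minimal standalone sketch of a more portable way to parse one line, assuming int64 corresponds to int64_t. The helper name parse_status_kb() is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: parse one line in the
 * /proc/self/status format, e.g. "RssShmem:\t    2048 kB".
 *
 * Returns 1 and stores the value converted to bytes in *bytes_out when the
 * line starts with "<key>:" followed by a number; returns 0 otherwise and
 * leaves *bytes_out untouched.
 */
static int
parse_status_kb(const char *line, const char *key, int64_t *bytes_out)
{
	size_t		keylen;
	int64_t		kb;

	assert(line != NULL && key != NULL && bytes_out != NULL);
	keylen = strlen(key);

	/* The key must match exactly and be followed by a colon. */
	if (strncmp(line, key, keylen) != 0 || line[keylen] != ':')
		return 0;

	/* SCNd64 is the scanf conversion that matches int64_t. */
	if (sscanf(line + keylen + 1, "%" SCNd64, &kb) != 1)
		return 0;

	*bytes_out = kb * 1024;
	return 1;
}
```

Matching the key separately from the number also avoids repeating the field name inside each format string.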
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-12 17:03 Robert Haas <[email protected]>
parent: Ashutosh Bapat <[email protected]>
1 sibling, 1 reply; 75+ messages in thread
From: Robert Haas @ 2026-03-12 17:03 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Fri, Mar 6, 2026 at 9:13 AM Heikki Linnakangas <[email protected]> wrote:
> 1. _PG_init() gets called early at postmaster startup, if the library is
> in shared_preload_libraries. If it's not in shared_preload_libraries, it
> gets called whenever the module is loaded.
>
> 2. The library can install a shmem_request_hook, which gets called early
> at postmaster startup, but after initializing the MaxBackends GUC. It
> only gets called when the library is loaded via shared_preload_libraries.
>
> 3. The library can install a shmem_startup_hook. It gets called later at
> postmaster startup, after the shared memory segment has been allocated.
> In EXEC_BACKEND mode it also gets called at backend startup. It does not
> get called if the library is not listed in shared_preload_libraries.
>
> None of these is quite the right moment to call the new
> ShmemRegisterStruct() function. _PG_init() is too early if the extension
> needs MaxBackends for sizing the shared memory area. shmem_request_hook
> is otherwise good, but in EXEC_BACKEND mode, the ShmemRegisterStruct()
> function also needs to be called at backend startup, and shmem_request_hook
> is not called at backend startup. shmem_startup_hook() is too late.
I believe that the design goal of
4f2400cb3f10aa79f99fba680c198237da28dd38 was to make it so that people
who had working extensions already didn't need to change their code,
but those for whom the restrictions of doing things in _PG_init were
annoying would have a workable alternative. I think that's a pretty
good goal, although I don't feel we absolutely have to stick to it. It
could easily be worth breaking that if we get something cool out of
it. But is there a reason we can't make it so that this new mechanism
can be used either from _PG_init() or shmem_startup_hook()? (I assume
there is or you likely would have done it already, but it's not clear
to me what that reason is.)
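
To make the ordering in the quoted list concrete, here is a toy model in plain C. It is not PostgreSQL code: every toy_* name is made up for illustration, and the 64 bytes per backend is an arbitrary number. It only shows that a value like MaxBackends is not yet known when _PG_init() runs, but is known by the time a chained request hook runs.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for server state; only the ordering mirrors the thread. */
typedef void (*toy_request_hook_type) (void);

static int	toy_MaxBackends = 0;	/* 0 means "not computed yet" */
static toy_request_hook_type toy_request_hook = NULL;
static size_t toy_requested_bytes = 0;

/* Extension side: remember the previous hook so it can be chained. */
static toy_request_hook_type toy_prev_request_hook = NULL;
static int	toy_max_backends_seen_in_init = -1;

static void
toy_extension_request(void)
{
	if (toy_prev_request_hook)
		toy_prev_request_hook();

	/* MaxBackends is known here, so per-backend sizing is possible. */
	toy_requested_bytes += (size_t) toy_MaxBackends * 64;
}

static void
toy_PG_init(void)
{
	/* Too early for sizing: MaxBackends has not been computed yet. */
	toy_max_backends_seen_in_init = toy_MaxBackends;

	toy_prev_request_hook = toy_request_hook;
	toy_request_hook = toy_extension_request;
}

/* Postmaster side: returns the total size requested by all hooks. */
static size_t
toy_postmaster_startup(int max_backends)
{
	toy_PG_init();					/* 1. preloaded libraries are initialized */
	toy_MaxBackends = max_backends; /* the GUC-derived value becomes final */
	if (toy_request_hook)
		toy_request_hook();			/* 2. request hooks run */
	return toy_requested_bytes;		/* 3. segment is sized; startup hooks follow */
}
```

Calling toy_postmaster_startup(100) in this model requests 6400 bytes, while toy_PG_init() only ever saw a MaxBackends of 0.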
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-12 18:56 Robert Haas <[email protected]>
parent: Robert Haas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Robert Haas @ 2026-03-12 18:56 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Thu, Mar 12, 2026 at 2:41 PM Heikki Linnakangas <[email protected]> wrote:
> shmem_startup_hook() is too late. The shmem structs need to be
> registered at postmaster startup before the shmem segment is allocated,
> so that we can calculate the total size needed.
Sorry, I meant shmem_request_hook.
> I'm currently leaning towards _PG_init(), except for allocations that
> depend on MaxBackends. For those, you can install a shmem_request_hook
> that sets the size in the descriptor. In other words, you can leave the
> 'size' as empty in _PG_init(), but set it later in the shmem_request_hook.
Why can't you just do the whole thing later?
> Another option is to add a new bespoke callback in the descriptor for
> such size adjustments, which would get called at the same time as
> shmem_request_hook. That might be a little more ergonomic, there would
> no longer be any need for extensions to use the old
> shmem_request/startup_hooks with the new ShmemRegisterStruct() mechanism.
Yeah, worth considering.
> Except that you'd still need them for RequestNamedLWLockTranche(). I
> wonder if we should recommend extensions to embed the LWLock struct into
> their shared memory struct and use the LWLockInitialize() and
> LWLockNewTrancheId() functions instead. That fits the new
> ShmemRegisterStruct() API a little better than RequestNamedLWLockTranche().
Yeah, I think RequestNamedLWLockTranche() might be fine if you just
need LWLocks, but if you need a bunch of resources, putting them all
into the same chunk of memory seems cleaner.
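
As a rough illustration of "the same chunk of memory", here is a toy sketch in plain C. ToyLock and the toy_* functions are hypothetical stand-ins, not the real LWLock API; the point is only that a single init callback can set up both the lock and the data it protects when the lock is embedded in the struct.

```c
#include <assert.h>
#include <string.h>

/* Toy lock type standing in for an LWLock; not the real definition. */
typedef struct ToyLock
{
	int			tranche_id;
	int			held;
} ToyLock;

static int	toy_next_tranche_id = 1;

/* Hypothetical analogue of allocating a new tranche ID. */
static int
toy_new_tranche_id(void)
{
	return toy_next_tranche_id++;
}

/* Hypothetical analogue of initializing a lock in place. */
static void
toy_lock_initialize(ToyLock *lock, int tranche_id)
{
	lock->tranche_id = tranche_id;
	lock->held = 0;
}

/* The lock lives in the same chunk of memory as the data it protects. */
typedef struct ToySharedState
{
	ToyLock		lock;
	int			counter;
} ToySharedState;

/* Init callback: one place sets up both the lock and the data. */
static void
toy_shared_state_init(ToySharedState *state)
{
	assert(state != NULL);
	memset(state, 0, sizeof(*state));
	toy_lock_initialize(&state->lock, toy_new_tranche_id());
}
```

With this layout there is one size to request and one pointer to keep, rather than a struct plus a separately requested lock.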
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-12 19:21 Heikki Linnakangas <[email protected]>
parent: Robert Haas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Heikki Linnakangas @ 2026-03-12 19:21 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 12/03/2026 20:56, Robert Haas wrote:
> On Thu, Mar 12, 2026 at 2:41 PM Heikki Linnakangas <[email protected]> wrote:
>> shmem_startup_hook() is too late. The shmem structs need to be
>> registered at postmaster startup before the shmem segment is allocated,
>> so that we can calculate the total size needed.
>
> Sorry, I meant shmem_request_hook.
Ah ok
>> I'm currently leaning towards _PG_init(), except for allocations that
>> depend on MaxBackends. For those, you can install a shmem_request_hook
>> that sets the size in the descriptor. In other words, you can leave the
>> 'size' as empty in _PG_init(), but set it later in the shmem_request_hook.
>
> Why can't you just do the whole thing later?
shmem_request_hook won't work in EXEC_BACKEND mode, because in
EXEC_BACKEND mode, ShmemRegisterStruct() also needs to be called at
backend startup.
One of my design goals is to avoid EXEC_BACKEND specific steps so that
if you write your extension oblivious to EXEC_BACKEND mode, it will
still usually work with EXEC_BACKEND. For example, if it was necessary
to call a separate AttachShmem() function for every shmem struct in
EXEC_BACKEND mode, but which was not needed on Unix, that would be bad.
>> Except that you'd still need them for RequestNamedLWLockTranche(). I
>> wonder if we should recommend extensions to embed the LWLock struct into
>> their shared memory struct and use the LWLockInitialize() and
>> LWLockNewTrancheId() functions instead. That fits the new
>> ShmemRegisterStruct() API a little better than RequestNamedLWLockTranche().
>
> Yeah, I think RequestNamedLWLockTranche() might be fine if you just
> need LWLocks, but if you need a bunch of resources, putting them all
> into the same chunk of memory seems cleaner.
Agreed. Then again, how often do you need just a LWLock (or multiple
LWLocks)? Surely you have a struct you want to protect with the lock. I
guess having a shmem hash table but no struct would be pretty common, though.
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-12 20:05 Robert Haas <[email protected]>
parent: Heikki Linnakangas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Robert Haas @ 2026-03-12 20:05 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Thu, Mar 12, 2026 at 3:21 PM Heikki Linnakangas <[email protected]> wrote:
> >> I'm currently leaning towards _PG_init(), except for allocations that
> >> depend on MaxBackends. For those, you can install a shmem_request_hook
> >> that sets the size in the descriptor. In other words, you can leave the
> >> 'size' as empty in _PG_init(), but set it later in the shmem_request_hook.
> >
> > Why can't you just do the whole thing later?
>
> shmem_request_hook won't work in EXEC_BACKEND mode, because in
> EXEC_BACKEND mode, ShmemRegisterStruct() also needs to be called at
> backend startup.
>
> One of my design goals is to avoid EXEC_BACKEND specific steps so that
> if you write your extension oblivious to EXEC_BACKEND mode, it will
> still usually work with EXEC_BACKEND. For example, if it was necessary
> to call a separate AttachShmem() function for every shmem struct in
> EXEC_BACKEND mode, but which was not needed on Unix, that would be bad.
That's *definitely* a good goal. A less important but still valuable
goal is to maximize the notational simplicity of the mechanism. Your
callback idea is elegant in theory but in practice it seems like it
might make it harder for people to get started quickly on a new
module, and having to create the object in one place and then fill in
the size in another sort of has the same problem. I don't really know
what to do about that, but it's something to think about. The
complexity of getting the details right is annoyingly high in this
area.
> > Yeah, I think RequestNamedLWLockTranche() might be fine if you just
> > need LWLocks, but if you need a bunch of resources, putting them all
> > into the same chunk of memory seems cleaner.
>
> Agreed. Then again, how often do you need just a LWLock (or multiple
> LWLocks)? Surely you have a struct you want to protect with the lock. I
> guess having shmem hash table but no struct would be pretty common, though.
Yeah, we've developed an annoying number of different ways to do this
stuff. I don't entirely know how to fix that.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-13 11:41 Heikki Linnakangas <[email protected]>
parent: Robert Haas <[email protected]>
0 siblings, 2 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-03-13 11:41 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 12/03/2026 22:05, Robert Haas wrote:
> On Thu, Mar 12, 2026 at 3:21 PM Heikki Linnakangas <[email protected]> wrote:
>>>> I'm currently leaning towards _PG_init(), except for allocations that
>>>> depend on MaxBackends. For those, you can install a shmem_request_hook
>>>> that sets the size in the descriptor. In other words, you can leave the
>>>> 'size' as empty in _PG_init(), but set it later in the shmem_request_hook.
>>>
>>> Why can't you just do the whole thing later?
>>
>> shmem_request_hook won't work in EXEC_BACKEND mode, because in
>> EXEC_BACKEND mode, ShmemRegisterStruct() also needs to be called at
>> backend startup.
>>
>> One of my design goals is to avoid EXEC_BACKEND specific steps so that
>> if you write your extension oblivious to EXEC_BACKEND mode, it will
>> still usually work with EXEC_BACKEND. For example, if it was necessary
>> to call a separate AttachShmem() function for every shmem struct in
>> EXEC_BACKEND mode, but which was not needed on Unix, that would be bad.
>
> That's *definitely* a good goal. A less important but still valuable
> goal is to maximize the notational simplicity of the mechanism. Your
> callback idea is elegant in theory but in practice it seems like it
> might make it harder for people to get started quickly on a new
> module, and having to create the object in one place and then fill in
> the size in another sort of has the same problem. I don't really know
> what to do about that, but it's something to think about. The
> complexity of getting the details right is annoyingly high in this
> area.
Yeah. IMHO the existing shmem_request/startup_hook mechanism is pretty
awkward too, and in most cases, the new mechanism is more convenient. It
might be slightly less convenient for some things, but mostly it's
better. Would you agree with that, or do you actually like the old hooks
and ShmemInitStruct() better?
One such wrinkle with ShmemRegisterStruct() in the patch now is that
it's harder to do initialization that touches multiple structs or hash
tables. Currently the callbacks are called in the same order that the
structs are registered, so you can do all the initialization in the last
struct's callback. The single pair of shmem_request/startup_hooks per
module was clearer in that respect. Fortunately, that kind of
cross-struct dependency is pretty rare, so I think it's fine. (The
order in which the callbacks are called needs to be documented explicitly, though.)
If we want to improve on that, one idea would be to introduce a
ShmemRegisterCallbacks() function to register callbacks that are not
tied to any particular struct and are called after all the per-struct
callbacks.
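
To sketch what that could look like, here is a toy registry in plain C. It is not the code in the patch: ToyShmemDesc and all toy_* functions are hypothetical, and the real descriptor has more fields. It models per-struct callbacks running in registration order, followed by callbacks that are not tied to any struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ENTRIES 8

typedef void (*toy_init_fn) (void);

/* Toy descriptor; the real descriptor in the patch has more fields. */
typedef struct ToyShmemDesc
{
	const char *name;
	size_t		size;
	toy_init_fn init_fn;		/* may be NULL */
} ToyShmemDesc;

static const ToyShmemDesc *toy_structs[TOY_MAX_ENTRIES];
static int	toy_nstructs = 0;
static toy_init_fn toy_late_callbacks[TOY_MAX_ENTRIES];
static int	toy_nlate = 0;

static void
toy_register_struct(const ToyShmemDesc *desc)
{
	assert(desc != NULL && toy_nstructs < TOY_MAX_ENTRIES);
	toy_structs[toy_nstructs++] = desc;
}

/* Hypothetical: a callback that is not tied to any particular struct. */
static void
toy_register_callback(toy_init_fn fn)
{
	assert(fn != NULL && toy_nlate < TOY_MAX_ENTRIES);
	toy_late_callbacks[toy_nlate++] = fn;
}

/* Total size to allocate, known before any init callback runs. */
static size_t
toy_total_size(void)
{
	size_t		total = 0;

	for (int i = 0; i < toy_nstructs; i++)
		total += toy_structs[i]->size;
	return total;
}

/* Per-struct callbacks in registration order, then the late callbacks. */
static void
toy_run_init(void)
{
	for (int i = 0; i < toy_nstructs; i++)
		if (toy_structs[i]->init_fn)
			toy_structs[i]->init_fn();
	for (int i = 0; i < toy_nlate; i++)
		toy_late_callbacks[i]();
}

/* Demo callbacks that record the order in which they were called. */
static char toy_call_log[TOY_MAX_ENTRIES * 2 + 1] = "";

static void
toy_log(char c)
{
	size_t		n = strlen(toy_call_log);

	assert(n + 1 < sizeof(toy_call_log));
	toy_call_log[n] = c;
	toy_call_log[n + 1] = '\0';
}

static void toy_init_a(void) { toy_log('A'); }
static void toy_init_b(void) { toy_log('B'); }
static void toy_init_cross(void) { toy_log('X'); }
```

In this model, registering toy_init_cross before the structs "a" and "b" still runs it last, so cross-struct initialization no longer has to live in whichever struct happens to be registered last.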
>>> Yeah, I think RequestNamedLWLockTranche() might be fine if you just
>>> need LWLocks, but if you need a bunch of resources, putting them all
>>> into the same chunk of memory seems cleaner.
>>
>> Agreed. Then again, how often do you need just a LWLock (or multiple
>> LWLocks)? Surely you have a struct you want to protect with the lock. I
>> guess having shmem hash table but no struct would be pretty common, though.
>
> Yeah, we've developed an annoying number of different ways to do this
> stuff. I don't entirely know how to fix that.
Here's a new version that doubles down on the
LWLockNewTrancheId+LWLockInitialize method, by changing the example in
the docs, and contrib/pg_stat_statements, to use that method.
RequestNamedLWLockTranche() still works and is unchanged; it's just not
as convenient to use with ShmemRegisterStruct(). This has
the advantage that we don't introduce yet another way of allocating LWLocks.
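As a rough illustration of that pattern (a standalone model, not the code in the patch): the lock lives inside the shared struct, and the init callback allocates a tranche id by name and initializes the lock with it. The Demo* types and demo_* functions below are hypothetical stand-ins for LWLock, LWLockNewTrancheId() and LWLockInitialize(), and a static variable stands in for the shared memory area.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_TRANCHES 8

/* Stand-in for a lock; a real LWLock carries more state than this. */
typedef struct DemoLock
{
    int         tranche_id;
    int         held;
} DemoLock;

typedef struct DemoShmemData
{
    DemoLock    lock;           /* protects the fields below */
    long        counter;
} DemoShmemData;

static const char *demo_tranche_names[DEMO_MAX_TRANCHES];
static int  demo_num_tranches = 0;

/* Hands out a new tranche id for the name, or -1 if the table is full. */
static int
demo_new_tranche_id(const char *name)
{
    if (demo_num_tranches >= DEMO_MAX_TRANCHES)
        return -1;
    demo_tranche_names[demo_num_tranches] = name;
    return demo_num_tranches++;
}

static void
demo_lock_initialize(DemoLock *lock, int tranche_id)
{
    lock->tranche_id = tranche_id;
    lock->held = 0;
}

static DemoShmemData demo_area_storage; /* stands in for shared memory */
static DemoShmemData *DemoShmem;        /* pointer set before init runs */

/* Init callback: runs once, after the pointer to the area has been set. */
static void
demo_shmem_init(void *arg)
{
    int         tranche_id;

    (void) arg;
    memset(DemoShmem, 0, sizeof(*DemoShmem));

    tranche_id = demo_new_tranche_id("demo tranche");
    demo_lock_initialize(&DemoShmem->lock, tranche_id);
}

/* Plays the role of the startup code that sets the pointer and calls init. */
static int
demo_setup(void)
{
    DemoShmem = &demo_area_storage;
    demo_shmem_init(NULL);
    return DemoShmem->lock.tranche_id;
}
```

The point of the sketch is only that the tranche id obtained in the callback has to be passed on to the lock initializer; nothing else needs to be requested ahead of time.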
P.S. Thanks for chiming in on this. It's pretty subjective how natural a
new API like this feels, so opinions are very welcome.
- Heikki
Attachments:
[text/x-patch] v3-0001-Introduce-a-new-mechanism-for-registering-shared-.patch (51.5K, 2-v3-0001-Introduce-a-new-mechanism-for-registering-shared-.patch)
From 8365fcb792225b76b850e3d1f15878a320fd1e30 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 13 Mar 2026 11:21:37 +0200
Subject: [PATCH v3 1/3] Introduce a new mechanism for registering shared
memory areas
Each shared memory area is registered with a "descriptor struct" that
contains the parameters like name and size. A struct makes it easier
to add optional fields in the future; the additional fields can just
be left as zeros.
This merges the separate [Subsystem]ShmemSize() and
[Subsystem]ShmemInit() phases at postmaster startup. Each subsystem is
now called into just once, before the shared memory segment has been
allocated, to register the subsystem's shared memory areas. The
registration includes the size, which replaces the
[Subsystem]ShmemSize() calls, and a pointer to an initialization
callback function, which replaces the [Subsystem]ShmemInit()
calls. This is more ergonomic, as you only need to calculate the size
once, when you register the struct.
This replaces ShmemInitStruct() and ShmemInitHash(), which become just
backwards-compatibility wrappers around the new functions. In future
commits, I plan to replace all ShmemInitStruct() and ShmemInitHash()
calls with the new functions, although we'll still need to keep them
around for extensions.
---
doc/src/sgml/system-views.sgml | 4 +-
doc/src/sgml/xfunc.sgml | 121 ++--
src/backend/bootstrap/bootstrap.c | 2 +
src/backend/postmaster/launch_backend.c | 4 +
src/backend/postmaster/postmaster.c | 5 +
src/backend/storage/ipc/ipci.c | 57 +-
src/backend/storage/ipc/shmem.c | 821 +++++++++++++++++-------
src/backend/tcop/postgres.c | 3 +
src/include/storage/ipc.h | 1 +
src/include/storage/shmem.h | 136 +++-
10 files changed, 829 insertions(+), 325 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index e5fe423fc61..a0baed339fe 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
Anonymous allocations are allocations that have been made
with <literal>ShmemAlloc()</literal> directly, rather than via
- <literal>ShmemInitStruct()</literal> or
- <literal>ShmemInitHash()</literal>.
+ <literal>ShmemRegisterStruct()</literal> or
+ <literal>ShmemRegisterHash()</literal>.
</para>
<para>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 70e815b8a2c..57320d323fc 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3628,59 +3628,87 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
Add-ins can reserve shared memory on server startup. To do so, the
add-in's shared library must be preloaded by specifying it in
<xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
- The shared library should also register a
- <literal>shmem_request_hook</literal> in its
- <function>_PG_init</function> function. This
- <literal>shmem_request_hook</literal> can reserve shared memory by
- calling:
-<programlisting>
-void RequestAddinShmemSpace(Size size)
-</programlisting>
- Each backend should obtain a pointer to the reserved shared memory by
- calling:
+ The shared library should register the shared memory allocation in
+ its <function>_PG_init</function> function. Here is an example:
<programlisting>
-void *ShmemInitStruct(const char *name, Size size, bool *foundPtr)
-</programlisting>
- If this function sets <literal>foundPtr</literal> to
- <literal>false</literal>, the caller should proceed to initialize the
- contents of the reserved shared memory. If <literal>foundPtr</literal>
- is set to <literal>true</literal>, the shared memory was already
- initialized by another backend, and the caller need not initialize
- further.
- </para>
+typedef struct MyShmemData {
+ LWLock lock; /* protects the fields below */
- <para>
- To avoid race conditions, each backend should use the LWLock
- <function>AddinShmemInitLock</function> when initializing its allocation
- of shared memory, as shown here:
-<programlisting>
-static mystruct *ptr = NULL;
-bool found;
+ ... shared memory contents ...
+} MyShmemData;
+
+static MyShmemData *MyShmem; /* pointer to the struct in shared memory */
+
+static void my_shmem_init(void *arg);
+
+static ShmemStructDesc MyShmemDesc = {
+ .name = "My shmem area",
+ .size = sizeof(MyShmemData),
+ .init_fn = my_shmem_init,
+ .ptr = &MyShmem,
+};
-LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-ptr = ShmemInitStruct("my struct name", size, &found);
-if (!found)
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
{
- ... initialize contents of shared memory ...
- ptr->locks = GetNamedLWLockTranche("my tranche name");
+ /*
+ * In order to create our shared memory area, we have to be loaded via
+ * shared_preload_libraries.
+ */
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ /* Register our shared memory needs */
+ ShmemRegisterStruct(&MyShmemDesc);
}
-LWLockRelease(AddinShmemInitLock);
+
+/* callback to initialize the contents of the MyShmem area at startup */
+static void
+my_shmem_init(void *arg)
+{
+ int tranche_id;
+
+ /* Initialize the lock */
+ tranche_id = LWLockNewTrancheId("my tranche name");
+ LWLockInitialize(&MyShmem->lock, tranche_id);
+
+ ... initialize the rest of MyShmem fields ...
+}
+
</programlisting>
- <literal>shmem_startup_hook</literal> provides a convenient place for the
- initialization code, but it is not strictly required that all such code
- be placed in this hook. On Windows (and anywhere else where
- <literal>EXEC_BACKEND</literal> is defined), each backend executes the
- registered <literal>shmem_startup_hook</literal> shortly after it
- attaches to shared memory, so add-ins should still acquire
- <function>AddinShmemInitLock</function> within this hook, as shown in the
- example above. On other platforms, only the postmaster process executes
- the <literal>shmem_startup_hook</literal>, and each backend automatically
- inherits the pointers to shared memory.
+ The <function>ShmemRegisterStruct()</function> call doesn't immediately
+ allocate or initialize the memory, it merely registers the space to be
+ allocated later in the startup sequence. If the size of the allocation
+ depends on <varname>MaxBackends</varname> or other variables that are
+ not yet initialized when <function>_PG_init()</function> is called, the
+ size can still be adjusted later by registering a
+ <literal>shmem_request_hook</literal> and changing the descriptor there.
+ When the memory is allocated, the registered
+ <function>init_fn</function> callback is called to initialize it.
+ </para>
+ <para>
+ The <function>init_fn()</function> callback is normally called at
+ postmaster startup, when no other processes are running yet and no
+ locking is required. However, if a shared memory area is registered after
+ system start, e.g. in an extension that is not in
+ <xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>,
+ <function>ShmemRegisterStruct()</function> will immediately call
+ the <function>init_fn</function> callback. In that case, it holds a
+ lock internally that prevents concurrent shmem allocations.
+ </para>
+ <para>
+ On Windows, the <function>attach_fn</function> callback is additionally
+ called at every backend startup. It can be used for initializing
+ additional per-backend state related to the shared memory area that is
+ inherited via <function>fork()</function> on other systems. On other
+ platforms, the <function>attach_fn</function> callback is only called
+ for structs that are registered after system startup.
</para>
-
<para>
- An example of a <literal>shmem_request_hook</literal> and
- <literal>shmem_startup_hook</literal> can be found in
+ An example of allocating shared memory can be found in
<filename>contrib/pg_stat_statements/pg_stat_statements.c</filename> in
the <productname>PostgreSQL</productname> source tree.
</para>
@@ -3691,8 +3719,7 @@ LWLockRelease(AddinShmemInitLock);
<para>
There is another, more flexible method of reserving shared memory that
- can be done after server startup and outside a
- <literal>shmem_request_hook</literal>. To do so, each backend that will
+ can be done after server startup. To do so, each backend that will
use the shared memory should obtain a pointer to it by calling:
<programlisting>
void *GetNamedDSMSegment(const char *name, size_t size,
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 17118f2fe76..d9d8bb6a5fd 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -369,6 +369,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
InitializeFastPathLocks();
+ RegisterShmemStructs();
+
CreateSharedMemoryAndSemaphores();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 30357845729..fecae827e5b 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -49,6 +49,7 @@
#include "replication/walreceiver.h"
#include "storage/dsm.h"
#include "storage/io_worker.h"
+#include "storage/ipc.h"
#include "storage/pg_shmem.h"
#include "tcop/backend_startup.h"
#include "utils/memutils.h"
@@ -677,7 +678,10 @@ SubPostmasterMain(int argc, char *argv[])
/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ RegisterShmemStructs();
+ }
/*
* Run the appropriate Main function
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 3fac46c402b..702c646f0d7 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -958,6 +958,9 @@ PostmasterMain(int argc, char *argv[])
*/
InitializeFastPathLocks();
+ /* Register the shared memory needs of all core subsystems. */
+ RegisterShmemStructs();
+
/*
* Give preloaded libraries a chance to request additional shared memory.
*/
@@ -3236,6 +3239,8 @@ PostmasterStateMachine(void)
LocalProcessControlFile(true);
/* re-create shared memory and semaphores */
+ ResetShmemAllocator();
+ RegisterShmemStructs();
CreateSharedMemoryAndSemaphores();
UpdatePMState(PM_STARTUP);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index a4785daf1e5..7c653e27370 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -99,10 +99,12 @@ CalculateShmemSize(void)
* during the actual allocation phase.
*/
size = 100000;
- size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
- sizeof(ShmemIndexEnt)));
+ size = add_size(size, ShmemRegisteredSize());
+
size = add_size(size, dsm_estimate_size());
size = add_size(size, DSMRegistryShmemSize());
+
+ /* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
@@ -218,6 +220,10 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
+ /* Reserve space for semaphores. */
+ if (!IsUnderPostmaster)
+ PGReserveSemaphores(ProcGlobalSemas());
+
/* Initialize subsystems */
CreateOrAttachShmemStructs();
@@ -231,6 +237,22 @@ CreateSharedMemoryAndSemaphores(void)
shmem_startup_hook();
}
+/*
+ * Early initialization of various subsystems, giving them a chance to
+ * register their shared memory needs before the shared memory segment is
+ * allocated.
+ */
+void
+RegisterShmemStructs(void)
+{
+ /*
+ * TODO: Not used in any built-in subsystems yet. In the future, most of
+ * the *ShmemInit() calls in CreateOrAttachShmemStructs(), and
+ * *ShmemSize() calls in CalculateShmemSize() will be replaced by calls
+ * into the subsystems from here.
+ */
+}
+
/*
* Initialize various subsystems, setting up their data structures in
* shared memory.
@@ -249,16 +271,27 @@ CreateSharedMemoryAndSemaphores(void)
static void
CreateOrAttachShmemStructs(void)
{
- /*
- * Now initialize LWLocks, which do shared memory allocation and are
- * needed for InitShmemIndex.
- */
- CreateLWLocks();
-
- /*
- * Set up shmem.c index hashtable
- */
- InitShmemIndex();
+#ifdef EXEC_BACKEND
+ if (IsUnderPostmaster)
+ {
+ /*
+ * ShmemAttachRegistered() uses LWLocks. Fortunately, LWLocks don't
+ * need any special attaching.
+ */
+ ShmemAttachRegistered();
+ }
+ else
+#endif
+ {
+ /*
+ * Initialize LWLocks first, in case any of the shmem init functions
+ * use LWLocks. (Nothing else can be running during startup though,
+ * so it's pretty useless for them to do any locking, but we still
+ * allow it.)
+ */
+ CreateLWLocks();
+ ShmemInitRegistered();
+ }
dsm_shmem_init();
DSMRegistryShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 55e4a5421de..2618cb927c6 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,48 +19,95 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * NOTES:
- * (a) There are three kinds of shared memory data structures
- * available to POSTGRES: fixed-size structures, queues and hash
- * tables. Fixed-size structures contain things like global variables
- * for a module and should never be allocated after the shared memory
- * initialization phase. Hash tables have a fixed maximum size, but
- * their actual size can vary dynamically. When entries are added
- * to the table, more space is allocated. Queues link data structures
- * that have been allocated either within fixed-size structures or as hash
- * buckets. Each shared data structure has a string name to identify
- * it (assigned in the module that declares it).
- *
- * (b) During initialization, each module looks for its
- * shared data structures in a hash table called the "Shmem Index".
- * If the data structure is not present, the caller can allocate
- * a new one and initialize it. If the data structure is present,
- * the caller "attaches" to the structure by initializing a pointer
- * in the local address space.
- * The shmem index has two purposes: first, it gives us
- * a simple model of how the world looks when a backend process
- * initializes. If something is present in the shmem index,
- * it is initialized. If it is not, it is uninitialized. Second,
- * the shmem index allows us to allocate shared memory on demand
- * instead of trying to preallocate structures and hard-wire the
- * sizes and locations in header files. If you are using a lot
- * of shared memory in a lot of different places (and changing
- * things during development), this is important.
- *
- * (c) In standard Unix-ish environments, individual backends do not
- * need to re-establish their local pointers into shared memory, because
- * they inherit correct values of those variables via fork() from the
- * postmaster. However, this does not work in the EXEC_BACKEND case.
- * In ports using EXEC_BACKEND, new backends have to set up their local
- * pointers using the method described in (b) above.
- *
- * (d) memory allocation model: shared memory can never be
- * freed, once allocated. Each hash table has its own free list,
- * so hash buckets can be reused when an item is deleted. However,
- * if one hash table grows very large and then shrinks, its space
- * cannot be redistributed to other tables. We could build a simple
- * hash bucket garbage collector if need be. Right now, it seems
- * unnecessary.
+ * There are two kinds of shared memory data structures: fixed-size structures
+ * and hash tables. Fixed-size structures contain things like global
+ * variables for a module and should never be allocated after the shared
+ * memory initialization phase. Hash tables have a fixed maximum size, but
+ * their actual size can vary dynamically. When entries are added to the
+ * table, more space is allocated. Each shared data structure and hash has a
+ * string name to identify it, specified in the descriptor when it is
+ * registered.
+ *
+ * Shared memory structures (and hash table entries) are mapped to the same address
+ * in each backend process, so you can safely use pointers to other parts of
+ * shared memory in the shared memory structures.
+ *
+ * Shared memory can never be freed, once allocated. Each hash table has its
+ * own free list, so hash buckets can be reused when an item is deleted.
+ * However, if one hash table grows very large and then shrinks, its space
+ * cannot be redistributed to other tables. We could build a simple hash
+ * bucket garbage collector if need be. Right now, it seems unnecessary.
+ *
+ * Usage
+ * -----
+ *
+ * To allocate a shared memory area, fill in the name, size, and any other
+ * options in ShmemStructDesc, and call ShmemRegisterStruct(). Leave any
+ * unused fields as zeros.
+ *
+ * typedef struct MyShmemData {
+ * ...
+ * } MyShmemData;
+ *
+ * static MyShmemData *MyShmem;
+ *
+ * static void my_shmem_init(void *arg);
+ *
+ * static ShmemStructDesc MyShmemDesc = {
+ * .name = "My shmem area",
+ * .size = sizeof(MyShmemData),
+ * .init_fn = my_shmem_init,
+ * .ptr = &MyShmem,
+ * };
+ *
+ * In the subsystem's initialization code, or in _PG_init() or shmem_request_hook
+ * in extensions, call ShmemRegisterStruct():
+ *
+ * ShmemRegisterStruct(&MyShmemDesc)
+ *
+ *
+ * Lifecycle
+ * ---------
+ *
+ * RegisterShmemStructs() is called at postmaster startup, before deciding the
+ * size of the global shared memory segment. To add a new shared memory area,
+ * call its Register function from RegisterShmemStructs().
+ *
+ * Once all the registrations have been done, postmaster calls
+ * ShmemRegisteredSize() to add up the sizes of all the registered areas.
+ *
+ * After allocating the shared memory segment, postmaster calls
+ * ShmemInitRegistered(), which calls the init_fn callbacks of each registered
+ * area, in the order that they were registered.
+ *
+ * In standard Unix-ish environments, individual backends do not need to
+ * re-establish their local pointers into shared memory, because they inherit
+ * correct values of those variables via fork() from the postmaster. However,
+ * this does not work in the EXEC_BACKEND case. In ports using EXEC_BACKEND,
+ * backend startup also calls RegisterShmemStructs(), followed by
+ * ShmemAttachRegistered(), which re-establishes the pointer variables
+ * (*ShmemStructDesc->ptr), and calls the attach_fn callback, if any, for
+ * additional per-backend setup.
+ *
+ * Legacy ShmemInitStruct()/ShmemInitHash() functions
+ * --------------------------------------------------
+ *
+ * ShmemInitStruct()/ShmemInitHash() is another way of registering shmem
+ * areas. It pre-dates the ShmemRegisterStruct()/ShmemRegisterHash()
+ * functions, and should not be used in new code, but as of this writing it is
+ * still widely used in extensions.
+ *
+ * To allocate a shmem area with ShmemInitStruct(), you need to separately
+ * register the size needed for the area by calling RequestAddinShmemSpace()
+ * from the extension's shmem_request_hook, and allocate the area by calling
+ * ShmemInitStruct() from the extension's shmem_startup_hook. There are no
+ * init/attach callbacks; the caller of ShmemInitStruct() must check the
+ * return status of ShmemInitStruct() and initialize the struct if it was not
+ * previously initialized.
+ *
+ *
+ * More legacy: Calling ShmemAlloc() directly
+ * ------------------------------------------
*/
#include "postgres.h"
@@ -76,6 +123,24 @@
#include "storage/spin.h"
#include "utils/builtins.h"
+/*
+ * Array of registered shared memory areas.
+ *
+ * This is in process private memory, although on Unix-like systems, we expect
+ * all the registrations to happen at postmaster startup time, and be
+ * inherited by all the child processes. Extensions may register additional
+ * areas after startup, but only areas registered at postmaster startup are
+ * included in the estimate for the total memory needed for shared memory. If
+ * any non-trivial allocations are made after startup, there might not be
+ * enough shared memory available.
+ */
+static ShmemStructDesc **registered_shmem_areas;
+static int num_registered_shmem_areas = 0;
+static int max_registered_shmem_areas = 0; /* allocated size of the array */
+
+/* estimated size of registered_shmem_areas (not a hard limit) */
+#define INITIAL_REGISTRY_SIZE (64)
+
/*
* This is the first data structure stored in the shared memory segment, at
* the offset that PGShmemHeader->content_offset points to. Allocations by
@@ -95,6 +160,9 @@ typedef struct ShmemAllocatorData
static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void shmem_hash_init(void *arg);
+static void shmem_hash_attach(void *arg);
+
/* shared memory global variables */
static PGShmemHeader *ShmemSegHdr; /* shared mem segment header */
@@ -103,24 +171,328 @@ static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
slock_t *ShmemLock; /* points to ShmemAllocator->shmem_lock */
-static HTAB *ShmemIndex = NULL; /* primary index hashtable for shmem */
+
+
+/*
+ * ShmemIndex is a global directory of shmem areas, itself also stored in the
+ * shared memory.
+ */
+static HTAB *ShmemIndex;
+
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE (48)
+
+/*
+ * # of additional entries to reserve in the shmem index table, for allocations
+ * after postmaster startup (not a hard limit)
+ */
+#define SHMEM_INDEX_ADDITIONAL_SIZE (64)
+
+/* this is a hash bucket in the shmem index table */
+typedef struct
+{
+ char key[SHMEM_INDEX_KEYSIZE]; /* string name */
+ void *location; /* location in shared mem */
+ Size size; /* # bytes requested for the structure */
+ Size allocated_size; /* # bytes actually allocated */
+} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
Datum pg_numa_available(PG_FUNCTION_ARGS);
+static bool shmem_initialized = false;
+
+/*
+ * ShmemRegisterStruct() --- register a shared memory struct
+ *
+ * Subsystems call this to register their shared memory needs. That should be
+ * done early in postmaster startup, before the shared memory segment has been
+ * created, so that the size can be included in the estimate for total amount
+ * of shared memory needed. We set aside a small amount of memory for
+ * allocations that happen later, for the benefit of non-preloaded extensions,
+ * but that should not be relied upon.
+ *
+ * In core subsystems, each subsystem's registration function is called from
+ * RegisterShmemStructs(). In extensions, this should be called from the
+ * _PG_init() initializer or the 'shmem_request_hook'. In EXEC_BACKEND mode,
+ * this also needs to be called in each child process, to reattach and set the
+ * pointer to the shared memory area, usually in a global variable. Calling
+ * this from the _PG_init() initializer or the 'shmem_request_hook' takes
+ * care of that too.
+ *
+ * Returns true if the struct was already initialized in shared memory, and
+ * we merely attached to it.
+ */
+bool
+ShmemRegisterStruct(ShmemStructDesc *desc)
+{
+ bool found;
+
+ /* Check that it's not already registered in this process */
+ for (int i = 0; i < num_registered_shmem_areas; i++)
+ {
+ ShmemStructDesc *existing = registered_shmem_areas[i];
+
+ if (strcmp(existing->name, desc->name) == 0)
+ elog(ERROR, "shared memory struct \"%s\" is already registered",
+ desc->name);
+ }
+
+ /* desc->ptr can be non-NULL when re-initializing after crash */
+ if (desc->ptr)
+ *desc->ptr = NULL;
+
+ /* Add the descriptor to the array, growing the array if needed */
+ if (num_registered_shmem_areas == max_registered_shmem_areas)
+ {
+ int new_size;
+
+ if (registered_shmem_areas)
+ {
+ new_size = max_registered_shmem_areas * 2;
+ registered_shmem_areas = repalloc(registered_shmem_areas,
+ new_size * sizeof(ShmemStructDesc *));
+ }
+ else
+ {
+ new_size = INITIAL_REGISTRY_SIZE;
+ registered_shmem_areas = MemoryContextAlloc(TopMemoryContext,
+ new_size * sizeof(ShmemStructDesc *));
+ }
+ max_registered_shmem_areas = new_size;
+ }
+ registered_shmem_areas[num_registered_shmem_areas++] = desc;
+
+ /*
+ * If called after postmaster startup, we need to immediately also
+ * initialize or attach to the area.
+ */
+ if (shmem_initialized)
+ {
+ ShmemIndexEnt *index_entry;
+
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, desc->name, HASH_ENTER_NULL, &found);
+ if (!index_entry)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+ if (found)
+ {
+ /* Already present, just attach to it */
+ if (index_entry->size != desc->size)
+ elog(ERROR, "shared memory struct \"%s\" is already registered with different size",
+ desc->name);
+ if (desc->ptr)
+ *desc->ptr = index_entry->location;
+ if (desc->attach_fn)
+ desc->attach_fn(desc->attach_fn_arg);
+ }
+ else
+ {
+ /* This is the first time. Initialize it like ShmemInitRegistered() would */
+ size_t allocated_size;
+ void *structPtr;
+
+ structPtr = ShmemAllocRaw(desc->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, desc->name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ desc->name, desc->size)));
+ }
+ index_entry->size = desc->size;
+ index_entry->allocated_size = allocated_size;
+ index_entry->location = structPtr;
+ if (desc->ptr)
+ *desc->ptr = index_entry->location;
+
+ /* XXX: if this errors out, the area is left in a half-initialized state */
+ if (desc->init_fn)
+ desc->init_fn(desc->init_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+ }
+ else
+ found = false;
+
+ return found;
+}
+
+/*
+ * ShmemRegisteredSize() --- estimate the total size of all pre-registered
+ * shared memory structures.
+ *
+ * This runs once at postmaster startup, before the shared memory segment has
+ * been created.
+ */
+size_t
+ShmemRegisteredSize(void)
+{
+ size_t size;
+
+ /* memory needed for the ShmemIndex */
+ size = hash_estimate_size(num_registered_shmem_areas + SHMEM_INDEX_ADDITIONAL_SIZE,
+ sizeof(ShmemIndexEnt));
+
+ /* memory needed for all the registered areas */
+ for (int i = 0; i < num_registered_shmem_areas; i++)
+ {
+ ShmemStructDesc *desc = registered_shmem_areas[i];
+
+ size = add_size(size, desc->size);
+ size = add_size(size, desc->extra_size);
+ }
+
+ return size;
+}
+
+/*
+ * ShmemInitRegistered() --- allocate and initialize pre-registered shared
+ * memory structures.
+ *
+ * This runs once at postmaster startup, after the shared memory segment has
+ * been created.
+ */
+void
+ShmemInitRegistered(void)
+{
+ /* Should be called only by the postmaster or a standalone backend. */
+ Assert(!IsUnderPostmaster);
+ Assert(!shmem_initialized);
+
+ /*
+ * Initialize all the registered memory areas. There are no concurrent
+ * processes yet, so no need for locking.
+ */
+ for (int i = 0; i < num_registered_shmem_areas; i++)
+ {
+ ShmemStructDesc *desc = registered_shmem_areas[i];
+ size_t allocated_size;
+ void *structPtr;
+ bool found;
+ ShmemIndexEnt *index_entry;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, desc->name, HASH_ENTER_NULL, &found);
+ if (!index_entry)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+ if (found)
+ elog(ERROR, "shmem struct \"%s\" is already initialized", desc->name);
+
+ /* allocate and initialize it */
+ structPtr = ShmemAllocRaw(desc->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, desc->name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ desc->name, desc->size)));
+ }
+ index_entry->size = desc->size;
+ index_entry->allocated_size = allocated_size;
+ index_entry->location = structPtr;
+
+ *(desc->ptr) = structPtr;
+ if (desc->init_fn)
+ desc->init_fn(desc->init_fn_arg);
+ }
+
+ shmem_initialized = true;
+}
+
+/*
+ * Call the attach_fn callbacks of all registered areas.
+ */
+#ifdef EXEC_BACKEND
+void
+ShmemAttachRegistered(void)
+{
+ /* Must be initializing a (non-standalone) backend */
+ Assert(IsUnderPostmaster);
+ Assert(ShmemAllocator->index != NULL);
+
+ /* XXX: document this locking with attach_fn */
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+
+ for (int i = 0; i < num_registered_shmem_areas; i++)
+ {
+ ShmemStructDesc *desc = registered_shmem_areas[i];
+ bool found;
+ ShmemIndexEnt *result;
+
+ /* look it up in the shmem index */
+ result = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, desc->name, HASH_FIND, &found);
+ if (!found)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not find ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+
+ if (desc->ptr)
+ *desc->ptr = result->location;
+ if (desc->attach_fn)
+ desc->attach_fn(desc->attach_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+
+ shmem_initialized = true;
+}
+#endif
+
+void
+ResetShmemAllocator(void)
+{
+ shmem_initialized = false;
+ num_registered_shmem_areas = 0;
+
+ /* FIXME: this leaks the allocations in TopMemoryContext */
+}
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
* Called at postmaster or stand-alone backend startup, to initialize the
* allocator's data structure in the shared memory segment. In EXEC_BACKEND,
- * this is also called at backend startup, to set up pointers to the shared
- * memory areas.
+ * also called at backend startup, to set up pointers to the
+ * already-initialized data structure.
*/
void
InitShmemAllocator(PGShmemHeader *seghdr)
{
+ Size offset;
+ int hash_size;
+ HASHCTL info;
+ int hash_flags;
+ size_t size;
+
+ Assert(!shmem_initialized);
Assert(seghdr != NULL);
/*
@@ -134,41 +506,47 @@ InitShmemAllocator(PGShmemHeader *seghdr)
ShmemBase = seghdr;
ShmemEnd = (char *) ShmemBase + seghdr->totalsize;
-#ifndef EXEC_BACKEND
- Assert(!IsUnderPostmaster);
-#endif
- if (IsUnderPostmaster)
- {
- PGShmemHeader *shmhdr = ShmemSegHdr;
-
- ShmemAllocator = (ShmemAllocatorData *) ((char *) shmhdr + shmhdr->content_offset);
- ShmemLock = &ShmemAllocator->shmem_lock;
- }
- else
- {
- Size offset;
-
- /*
- * Allocations after this point should go through ShmemAlloc, which
- * expects to allocate everything on cache line boundaries. Make sure
- * the first allocation begins on a cache line boundary.
- */
- offset = CACHELINEALIGN(seghdr->content_offset + sizeof(ShmemAllocatorData));
- if (offset > seghdr->totalsize)
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("out of shared memory (%zu bytes requested)",
- offset)));
+ /*
+ * Allocations after this point should go through ShmemAlloc, which
+ * expects to allocate everything on cache line boundaries. Make sure
+ * the first allocation begins on a cache line boundary.
+ */
+ offset = CACHELINEALIGN(seghdr->content_offset + sizeof(ShmemAllocatorData));
+ if (offset > seghdr->totalsize)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of shared memory (%zu bytes requested)",
+ offset)));
- ShmemAllocator = (ShmemAllocatorData *) ((char *) seghdr + seghdr->content_offset);
+ ShmemAllocator = (ShmemAllocatorData *) ((char *) seghdr + seghdr->content_offset);
+ ShmemLock = &ShmemAllocator->shmem_lock;
+ if (!IsUnderPostmaster)
+ {
SpinLockInit(&ShmemAllocator->shmem_lock);
- ShmemLock = &ShmemAllocator->shmem_lock;
ShmemAllocator->free_offset = offset;
- /* ShmemIndex can't be set up yet (need LWLocks first) */
- ShmemAllocator->index = NULL;
- ShmemIndex = (HTAB *) NULL;
}
+
+ /*
+ * Create (or attach to) the shared memory index of shmem areas.
+ */
+ hash_size = num_registered_shmem_areas + SHMEM_INDEX_ADDITIONAL_SIZE;
+
+ info.keysize = SHMEM_INDEX_KEYSIZE;
+ info.entrysize = sizeof(ShmemIndexEnt);
+ info.dsize = info.max_dsize = hash_select_dirsize(hash_size);
+ info.alloc = ShmemAllocNoError;
+ hash_flags = HASH_ELEM | HASH_STRINGS | HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+ if (!IsUnderPostmaster)
+ {
+ size = hash_get_shared_size(&info, hash_flags);
+ ShmemAllocator->index = (HASHHDR *) ShmemAlloc(size);
+ }
+ else
+ hash_flags |= HASH_ATTACH;
+ info.hctl = ShmemAllocator->index;
+ ShmemIndex = hash_create("ShmemIndex", hash_size, &info, hash_flags);
+ Assert(ShmemIndex != NULL);
}
/*
@@ -268,67 +646,18 @@ ShmemAddrIsValid(const void *addr)
}
/*
- * InitShmemIndex() --- set up or attach to shmem index table.
- */
-void
-InitShmemIndex(void)
-{
- HASHCTL info;
-
- /*
- * Create the shared memory shmem index.
- *
- * Since ShmemInitHash calls ShmemInitStruct, which expects the ShmemIndex
- * hashtable to exist already, we have a bit of a circularity problem in
- * initializing the ShmemIndex itself. The special "ShmemIndex" hash
- * table name will tell ShmemInitStruct to fake it.
- */
- info.keysize = SHMEM_INDEX_KEYSIZE;
- info.entrysize = sizeof(ShmemIndexEnt);
-
- ShmemIndex = ShmemInitHash("ShmemIndex",
- SHMEM_INDEX_SIZE, SHMEM_INDEX_SIZE,
- &info,
- HASH_ELEM | HASH_STRINGS);
-}
-
-/*
- * ShmemInitHash -- Create and initialize, or attach to, a
- * shared memory hash table.
- *
- * We assume caller is doing some kind of synchronization
- * so that two processes don't try to create/initialize the same
- * table at once. (In practice, all creations are done in the postmaster
- * process; child processes should always be attaching to existing tables.)
- *
- * max_size is the estimated maximum number of hashtable entries. This is
- * not a hard limit, but the access efficiency will degrade if it is
- * exceeded substantially (since it's used to compute directory size and
- * the hash table buckets will get overfull).
- *
- * init_size is the number of hashtable entries to preallocate. For a table
- * whose maximum size is certain, this should be equal to max_size; that
- * ensures that no run-time out-of-shared-memory failures can occur.
+ * ShmemRegisterHash -- Register a shared memory hash table.
*
* *infoP and hash_flags must specify at least the entry sizes and key
* comparison semantics (see hash_create()). Flag bits and values specific
* to shared-memory hash tables are added here, except that callers may
* choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
- *
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
*/
-HTAB *
-ShmemInitHash(const char *name, /* table string name for shmem index */
- int64 init_size, /* initial table size */
- int64 max_size, /* max size of the table */
- HASHCTL *infoP, /* info about key and bucket size */
- int hash_flags) /* info about infoP */
+bool
+ShmemRegisterHash(ShmemHashDesc *desc, /* configuration */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
{
- bool found;
- void *location;
-
/*
* Hash tables allocated in shared memory have a fixed directory; it can't
* grow or other backends wouldn't be able to find it. So, make sure we
@@ -336,145 +665,62 @@ ShmemInitHash(const char *name, /* table string name for shmem index */
*
* The shared memory allocator must be specified too.
*/
- infoP->dsize = infoP->max_dsize = hash_select_dirsize(max_size);
+ infoP->dsize = infoP->max_dsize = hash_select_dirsize(desc->max_size);
infoP->alloc = ShmemAllocNoError;
hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
/* look it up in the shmem index */
- location = ShmemInitStruct(name,
- hash_get_shared_size(infoP, hash_flags),
- &found);
+ memset(&desc->base_desc, 0, sizeof(desc->base_desc));
+ desc->base_desc.name = desc->name;
+ desc->base_desc.size = hash_get_shared_size(infoP, hash_flags);
+ desc->base_desc.init_fn = shmem_hash_init;
+ desc->base_desc.init_fn_arg = desc;
+ desc->base_desc.attach_fn = shmem_hash_attach;
+ desc->base_desc.attach_fn_arg = desc;
/*
- * if it already exists, attach to it rather than allocate and initialize
- * new space
+ * We need a stable pointer to hold the pointer to the shared memory. Use
+ * the one passed in the descriptor now. It will be replaced with the hash
+ * table header by init or attach function.
*/
- if (found)
- hash_flags |= HASH_ATTACH;
+ desc->base_desc.ptr = (void **) desc->ptr;
- /* Pass location of hashtable header to hash_create */
- infoP->hctl = (HASHHDR *) location;
+ desc->base_desc.extra_size = hash_estimate_size(desc->max_size, infoP->entrysize) - desc->base_desc.size;
+
+ desc->hash_flags = hash_flags;
+ desc->infoP = MemoryContextAlloc(TopMemoryContext, sizeof(HASHCTL));
+ memcpy(desc->infoP, infoP, sizeof(HASHCTL));
- return hash_create(name, init_size, infoP, hash_flags);
+ return ShmemRegisterStruct(&desc->base_desc);
}
-/*
- * ShmemInitStruct -- Create/attach to a structure in shared memory.
- *
- * This is called during initialization to find or allocate
- * a data structure in shared memory. If no other process
- * has created the structure, this routine allocates space
- * for it. If it exists already, a pointer to the existing
- * structure is returned.
- *
- * Returns: pointer to the object. *foundPtr is set true if the object was
- * already in the shmem index (hence, already initialized).
- *
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
- */
-void *
-ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+static void
+shmem_hash_init(void *arg)
{
- ShmemIndexEnt *result;
- void *structPtr;
-
- LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ ShmemHashDesc *desc = (ShmemHashDesc *) arg;
+ int hash_flags = desc->hash_flags;
- if (!ShmemIndex)
- {
- /* Must be trying to create/attach to ShmemIndex itself */
- Assert(strcmp(name, "ShmemIndex") == 0);
-
- if (IsUnderPostmaster)
- {
- /* Must be initializing a (non-standalone) backend */
- Assert(ShmemAllocator->index != NULL);
- structPtr = ShmemAllocator->index;
- *foundPtr = true;
- }
- else
- {
- /*
- * If the shmem index doesn't exist, we are bootstrapping: we must
- * be trying to init the shmem index itself.
- *
- * Notice that the ShmemIndexLock is released before the shmem
- * index has been initialized. This should be OK because no other
- * process can be accessing shared memory yet.
- */
- Assert(ShmemAllocator->index == NULL);
- structPtr = ShmemAlloc(size);
- ShmemAllocator->index = structPtr;
- *foundPtr = false;
- }
- LWLockRelease(ShmemIndexLock);
- return structPtr;
- }
-
- /* look it up in the shmem index */
- result = (ShmemIndexEnt *)
- hash_search(ShmemIndex, name, HASH_ENTER_NULL, foundPtr);
-
- if (!result)
- {
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("could not create ShmemIndex entry for data structure \"%s\"",
- name)));
- }
-
- if (*foundPtr)
- {
- /*
- * Structure is in the shmem index so someone else has allocated it
- * already. The size better be the same as the size we are trying to
- * initialize to, or there is a name conflict (or worse).
- */
- if (result->size != size)
- {
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errmsg("ShmemIndex entry size is wrong for data structure"
- " \"%s\": expected %zu, actual %zu",
- name, size, result->size)));
- }
- structPtr = result->location;
- }
- else
- {
- Size allocated_size;
+ /* Pass location of hashtable header to hash_create */
+ desc->infoP->hctl = (HASHHDR *) *desc->base_desc.ptr;
- /* It isn't in the table yet. allocate and initialize it */
- structPtr = ShmemAllocRaw(size, &allocated_size);
- if (structPtr == NULL)
- {
- /* out of memory; remove the failed ShmemIndex entry */
- hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
- " \"%s\" (%zu bytes requested)",
- name, size)));
- }
- result->size = size;
- result->allocated_size = allocated_size;
- result->location = structPtr;
- }
+ *desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
+}
- LWLockRelease(ShmemIndexLock);
+static void
+shmem_hash_attach(void *arg)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) arg;
+ int hash_flags = desc->hash_flags;
- Assert(ShmemAddrIsValid(structPtr));
+ /* attach to it rather than allocate and initialize new space */
+ hash_flags |= HASH_ATTACH;
- Assert(structPtr == (void *) CACHELINEALIGN(structPtr));
+ /* Pass location of hashtable header to hash_create */
+ desc->infoP->hctl = (HASHHDR *) *desc->base_desc.ptr;
- return structPtr;
+ *desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
}
-
/*
* Add two Size values, checking for overflow
*/
@@ -761,3 +1007,82 @@ pg_numa_available(PG_FUNCTION_ARGS)
{
PG_RETURN_BOOL(pg_numa_init() != -1);
}
+
+/*
+ * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ *
+ * This is called during initialization to find or allocate
+ * a data structure in shared memory. If no other process
+ * has created the structure, this routine allocates space
+ * for it. If it exists already, a pointer to the existing
+ * structure is returned.
+ *
+ * Returns: pointer to the object. *foundPtr is set true if the object was
+ * already in the shmem index (hence, already initialized).
+ *
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRegisterStruct() in new code!
+ */
+void *
+ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+{
+ ShmemStructDesc *desc;
+
+ Assert(shmem_initialized);
+
+ desc = MemoryContextAllocZero(TopMemoryContext, sizeof(ShmemStructDesc) + sizeof(void *));
+ desc->name = name;
+ desc->size = size;
+ desc->ptr = (void *) (((char *) desc) + sizeof(ShmemStructDesc));
+
+ *foundPtr = ShmemRegisterStruct(desc);
+ Assert(*desc->ptr != NULL);
+ return *desc->ptr;
+}
+
+/*
+ * ShmemInitHash -- Create and initialize, or attach to, a
+ * shared memory hash table.
+ *
+ * We assume caller is doing some kind of synchronization
+ * so that two processes don't try to create/initialize the same
+ * table at once. (In practice, all creations are done in the postmaster
+ * process; child processes should always be attaching to existing tables.)
+ *
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ *
+ * init_size is the number of hashtable entries to preallocate. For a table
+ * whose maximum size is certain, this should be equal to max_size; that
+ * ensures that no run-time out-of-shared-memory failures can occur.
+ *
+ * *infoP and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values specific
+ * to shared-memory hash tables are added here, except that callers may
+ * choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
+ *
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRegisterHash() in new code!
+ */
+HTAB *
+ShmemInitHash(const char *name, /* table string name for shmem index */
+ int64 init_size, /* initial table size */
+ int64 max_size, /* max size of the table */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ ShmemHashDesc *desc;
+
+ Assert(shmem_initialized);
+
+ desc = MemoryContextAllocZero(TopMemoryContext, sizeof(ShmemHashDesc) + sizeof(HTAB *));
+ desc->name = name;
+ desc->init_size = init_size;
+ desc->max_size = max_size;
+ desc->ptr = (HTAB **) (((char *) desc) + sizeof(ShmemHashDesc));
+
+ ShmemRegisterHash(desc, infoP, hash_flags);
+ return *desc->ptr;
+}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d01a09dd0c4..fa074c419a8 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4162,6 +4162,9 @@ PostgresSingleUserMain(int argc, char *argv[],
/* Initialize size of fast-path lock cache. */
InitializeFastPathLocks();
+ /* Register the shared memory needs of all core subsystems. */
+ RegisterShmemStructs();
+
/*
* Give preloaded libraries a chance to request additional shared memory.
*/
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index da32787ab51..8a3b71ad5d3 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
/* ipci.c */
extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
+extern void RegisterShmemStructs(void);
extern Size CalculateShmemSize(void);
extern void CreateSharedMemoryAndSemaphores(void);
#ifdef EXEC_BACKEND
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 89d45287c17..ea1884a8778 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -24,19 +24,138 @@
#include "storage/spin.h"
#include "utils/hsearch.h"
+typedef void (*ShmemInitCallback) (void *arg);
+typedef void (*ShmemAttachCallback) (void *arg);
+
+/*
+ * ShmemStructDesc describes a named area or struct in shared memory.
+ *
+ * Shared memory is reserved and allocated in a few phases at postmaster
+ * startup, and in EXEC_BACKEND mode, there's some extra work done to "attach"
+ * to them at backend startup. ShmemStructDesc contains all the information
+ * needed to manage the lifecycle.
+ *
+ * 'name', 'size' and the callback functions are filled in by the
+ * ShmemRegisterStruct() caller. After registration, the shmem machinery
+ * reserves the memory for the area, sets *ptr to point to the allocation, and
+ * calls the callbacks at the right moments.
+ */
+typedef struct ShmemStructDesc
+{
+ /* Name of the shared memory area. Must be unique across the system */
+ const char *name;
+
+ /* Size of the shared memory area */
+ size_t size;
+
+ /*
+ * Initialization callback function. This is called when the shared
+ * memory area is allocated, usually at postmaster startup. 'init_fn_arg'
+ * is an opaque argument passed to the callback.
+ */
+ ShmemInitCallback init_fn;
+ void *init_fn_arg;
+
+ /*
+ * Attachment callback function. In EXEC_BACKEND mode, this is called at
+ * startup of each backend. In !EXEC_BACKEND mode, this is only called if
+ * the shared memory area is registered after postmaster startup. We
+ * never do that in core code, but extensions might.
+ */
+ ShmemAttachCallback attach_fn;
+ void *attach_fn_arg;
+
+ /*
+ * Extra space to reserve in the shared memory segment beyond the struct
+ * itself. This is used for shared memory hash tables, which can grow
+ * beyond their initial size as more buckets are allocated.
+ */
+ size_t extra_size;
+
+ /*
+ * When the shmem area is initialized or attached to, pointer to it is
+ * stored in *ptr. It usually points to a global variable, used to access
+ * the shared memory area later. *ptr is set before the init_fn or
+ * attach_fn callback is called.
+ */
+ void **ptr;
+} ShmemStructDesc;
+
+/*
+ * Descriptor for named shared memory hash table.
+ *
+ * Similar to ShmemStructDesc, but describes a shared memory hash table. Each
+ * hash table is backed by an allocated area, described by 'base_desc', but if
+ * 'max_size' is greater than 'init_size', it can also grow beyond the initial
+ * allocated area by allocating more hash entries from the global unreserved
+ * space.
+ */
+typedef struct ShmemHashDesc
+{
+ /* Name of the shared memory area. Must be unique across the system */
+ const char *name;
+
+ /*
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ */
+ size_t max_size;
+
+ /*
+ * init_size is the number of hashtable entries to preallocate. For a table
+ * whose maximum size is certain, this should be equal to max_size; that
+ * ensures that no run-time out-of-shared-memory failures can occur.
+ */
+ size_t init_size;
+
+ /* Hash table options passed to hash_create() */
+ HASHCTL *infoP;
+ int hash_flags;
+
+ /*
+ * When the hash table is initialized or attached to, pointer to it is
+ * stored in *ptr. It usually points to a global variable, used to access
+ * the shared hash table later.
+ */
+ HTAB **ptr;
+
+ /*
+ * Descriptor for the underlying "area". Callers of ShmemRegisterHash()
+ * do not need to touch this, it is filled in by ShmemRegisterHash() based
+ * on the hash table parameters.
+ */
+ ShmemStructDesc base_desc;
+} ShmemHashDesc;
/* shmem.c */
extern PGDLLIMPORT slock_t *ShmemLock;
typedef struct PGShmemHeader PGShmemHeader; /* avoid including
* storage/pg_shmem.h here */
+extern void ResetShmemAllocator(void);
extern void InitShmemAllocator(PGShmemHeader *seghdr);
+#ifdef EXEC_BACKEND
+extern void AttachShmemAllocator(PGShmemHeader *seghdr);
+#endif
extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
extern bool ShmemAddrIsValid(const void *addr);
-extern void InitShmemIndex(void);
+
+extern bool ShmemRegisterHash(ShmemHashDesc *desc, HASHCTL *infoP, int hash_flags);
+extern bool ShmemRegisterStruct(ShmemStructDesc *desc);
+
+/* legacy shmem allocation functions */
extern HTAB *ShmemInitHash(const char *name, int64 init_size, int64 max_size,
HASHCTL *infoP, int hash_flags);
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
+
+extern size_t ShmemRegisteredSize(void);
+extern void ShmemInitRegistered(void);
+#ifdef EXEC_BACKEND
+extern void ShmemAttachRegistered(void);
+#endif
+
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
@@ -45,19 +164,4 @@ extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
-/* size constants for the shmem index table */
- /* max size of data structure string name */
-#define SHMEM_INDEX_KEYSIZE (48)
- /* estimated size of the shmem index table (not a hard limit) */
-#define SHMEM_INDEX_SIZE (64)
-
-/* this is a hash bucket in the shmem index table */
-typedef struct
-{
- char key[SHMEM_INDEX_KEYSIZE]; /* string name */
- void *location; /* location in shared mem */
- Size size; /* # bytes requested for the structure */
- Size allocated_size; /* # bytes actually allocated */
-} ShmemIndexEnt;
-
#endif /* SHMEM_H */
--
2.47.3
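
For readers following along, the register / size / init lifecycle that the ShmemStructDesc comments in this patch describe can be modelled outside the server. The following is only a standalone toy sketch, not code from the patch: every toy_* name is hypothetical, malloc() stands in for the shared memory segment, and the 64-byte alignment is an arbitrary stand-in for cache line alignment. It shows the one ordering the header comment states explicitly, namely that *ptr is set before the init callback runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model of a descriptor-driven shmem lifecycle; not PostgreSQL code. */
typedef void (*toy_callback) (void *arg);

typedef struct toy_desc
{
	const char *name;			/* label for the area */
	size_t		size;			/* bytes requested */
	toy_callback init_fn;		/* run once the area has been carved out */
	void	   *init_fn_arg;	/* opaque argument for init_fn */
	void	  **ptr;			/* where the area's address is published */
} toy_desc;

#define TOY_MAX_AREAS 16
#define TOY_ALIGN(x) (((x) + (size_t) 63) & ~(size_t) 63)

static toy_desc *toy_registered[TOY_MAX_AREAS];
static int	toy_num_registered = 0;

/* Phase 1: remember the descriptor; nothing is allocated yet. */
static int
toy_register(toy_desc *desc)
{
	if (toy_num_registered >= TOY_MAX_AREAS)
		return 0;
	toy_registered[toy_num_registered++] = desc;
	return 1;
}

/* Phase 2: total number of bytes all registered areas need. */
static size_t
toy_registered_size(void)
{
	size_t		total = 0;

	for (int i = 0; i < toy_num_registered; i++)
		total += TOY_ALIGN(toy_registered[i]->size);
	return total;
}

/* Phase 3: carve out each area, publish it via *ptr, then call init_fn. */
static void
toy_init_registered(char *segment)
{
	size_t		offset = 0;

	for (int i = 0; i < toy_num_registered; i++)
	{
		toy_desc   *desc = toy_registered[i];

		assert(desc->ptr != NULL);
		*desc->ptr = segment + offset;	/* set before the callback runs */
		offset += TOY_ALIGN(desc->size);
		if (desc->init_fn)
			desc->init_fn(desc->init_fn_arg);
	}
}

/* A hypothetical subsystem with one small area. */
typedef struct toy_counter
{
	int			value;
} toy_counter;

static void *counter_area = NULL;

static void
counter_init(void *arg)
{
	(void) arg;
	((toy_counter *) counter_area)->value = 42;
}

static toy_desc counter_desc = {
	.name = "toy counter",
	.size = sizeof(toy_counter),
	.init_fn = counter_init,
	.ptr = &counter_area,
};

/* Runs all three phases and returns the value the init callback stored. */
static int
toy_demo(void)
{
	char	   *segment;
	int			result;

	toy_num_registered = 0;
	toy_register(&counter_desc);
	segment = malloc(toy_registered_size());
	if (segment == NULL)
		return -1;
	toy_init_registered(segment);
	result = ((toy_counter *) counter_area)->value;
	free(segment);
	counter_area = NULL;
	return result;
}
```

The point of the split is that the size computation and the initialization both walk the same list of descriptors, so a subsystem cannot request one size and then allocate another. The real patch additionally records each area in ShmemIndex and handles the attach path, which this sketch leaves out.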
[text/x-patch] v3-0002-Convert-pg_stat_statements-to-use-the-new-interfa.patch (10.2K, 3-v3-0002-Convert-pg_stat_statements-to-use-the-new-interfa.patch)
From cf79eb29e3a56549d7880260e76c2d1295ccadd0 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Tue, 24 Feb 2026 15:45:51 +0200
Subject: [PATCH v3 2/3] Convert pg_stat_statements to use the new interface
---
.../pg_stat_statements/pg_stat_statements.c | 163 ++++++++----------
1 file changed, 69 insertions(+), 94 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 4a427533bd8..017387ae955 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -248,7 +248,7 @@ typedef struct pgssEntry
*/
typedef struct pgssSharedState
{
- LWLock *lock; /* protects hashtable search/modification */
+ LWLockPadded lock; /* protects hashtable search/modification */
double cur_median_usage; /* current median usage in hashtable */
Size mean_query_len; /* current mean entry text length */
slock_t mutex; /* protects following fields only: */
@@ -258,13 +258,35 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+/* Links to shared memory state */
+static pgssSharedState *pgss;
+static HTAB *pgss_hash;
+
+static void pgss_shmem_init(void *arg);
+
+static ShmemStructDesc pgssSharedStateShmemDesc =
+{
+ .name = "pg_stat_statements",
+ .size = sizeof(pgssSharedState),
+ .init_fn = pgss_shmem_init,
+ .ptr = (void *) &pgss,
+};
+
+static ShmemHashDesc pgssSharedHashDesc =
+{
+ .name = "pg_stat_statements hash",
+ .init_size = 0, /* set from 'pgss_max' */
+ .max_size = 0, /* set from 'pgss_max' */
+ .ptr = &pgss_hash,
+};
+
+
/*---- Local variables ----*/
/* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
static int nesting_level = 0;
/* Saved hook values */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
static planner_hook_type prev_planner_hook = NULL;
@@ -274,10 +296,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static ProcessUtility_hook_type prev_ProcessUtility = NULL;
-/* Links to shared memory state */
-static pgssSharedState *pgss = NULL;
-static HTAB *pgss_hash = NULL;
-
/*---- GUC variables ----*/
typedef enum
@@ -330,7 +348,6 @@ PG_FUNCTION_INFO_V1(pg_stat_statements_1_13);
PG_FUNCTION_INFO_V1(pg_stat_statements);
PG_FUNCTION_INFO_V1(pg_stat_statements_info);
-static void pgss_shmem_request(void);
static void pgss_shmem_startup(void);
static void pgss_shmem_shutdown(int code, Datum arg);
static void pgss_post_parse_analyze(ParseState *pstate, Query *query,
@@ -365,7 +382,6 @@ static void pgss_store(const char *query, int64 queryId,
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
bool showtext);
-static Size pgss_memsize(void);
static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
int encoding, bool sticky);
static void entry_dealloc(void);
@@ -390,6 +406,8 @@ static int comp_location(const void *a, const void *b);
void
_PG_init(void)
{
+ HASHCTL info;
+
/*
* In order to create our shared memory area, we have to be loaded via
* shared_preload_libraries. If not, fall out without hooking into any of
@@ -470,11 +488,22 @@ _PG_init(void)
MarkGUCPrefixReserved("pg_stat_statements");
+ /*
+ * Register our shared memory needs, including hash table
+ */
+ ShmemRegisterStruct(&pgssSharedStateShmemDesc);
+
+ info.keysize = sizeof(pgssHashKey);
+ info.entrysize = sizeof(pgssEntry);
+ pgssSharedHashDesc.init_size = pgss_max;
+ pgssSharedHashDesc.max_size = pgss_max;
+ ShmemRegisterHash(&pgssSharedHashDesc,
+ &info,
+ HASH_ELEM | HASH_BLOBS);
+
/*
* Install hooks.
*/
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = pgss_shmem_request;
prev_shmem_startup_hook = shmem_startup_hook;
shmem_startup_hook = pgss_shmem_startup;
prev_post_parse_analyze_hook = post_parse_analyze_hook;
@@ -493,31 +522,31 @@ _PG_init(void)
ProcessUtility_hook = pgss_ProcessUtility;
}
-/*
- * shmem_request hook: request additional shared resources. We'll allocate or
- * attach to the shared resources in pgss_shmem_startup().
- */
static void
-pgss_shmem_request(void)
+pgss_shmem_init(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
+ int tranche_id;
- RequestAddinShmemSpace(pgss_memsize());
- RequestNamedLWLockTranche("pg_stat_statements", 1);
+ tranche_id = LWLockNewTrancheId("pg_stat_statements");
+ LWLockInitialize(&pgss->lock.lock, tranche_id);
+ pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
+ pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ SpinLockInit(&pgss->mutex);
+ pgss->extent = 0;
+ pgss->n_writers = 0;
+ pgss->gc_count = 0;
+ pgss->stats.dealloc = 0;
+ pgss->stats.stats_reset = GetCurrentTimestamp();
}
/*
- * shmem_startup hook: allocate or attach to shared memory,
- * then load any pre-existing statistics from file.
- * Also create and load the query-texts file, which is expected to exist
- * (even if empty) while the module is enabled.
+ * shmem_startup hook: Load any pre-existing statistics from file at
+ * postmaster startup. Also create and load the query-texts file, which is
+ * expected to exist (even if empty) while the module is enabled.
*/
static void
pgss_shmem_startup(void)
{
- bool found;
- HASHCTL info;
FILE *file = NULL;
FILE *qfile = NULL;
uint32 header;
@@ -530,54 +559,14 @@ pgss_shmem_startup(void)
if (prev_shmem_startup_hook)
prev_shmem_startup_hook();
- /* reset in case this is a restart within the postmaster */
- pgss = NULL;
- pgss_hash = NULL;
+ if (IsUnderPostmaster)
+ return; /* nothing to do in backends */
/*
- * Create or attach to the shared memory state, including hash table
+ * Set up a shmem exit hook to dump the statistics to disk on postmaster
+ * (or standalone backend) exit.
*/
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- pgss = ShmemInitStruct("pg_stat_statements",
- sizeof(pgssSharedState),
- &found);
-
- if (!found)
- {
- /* First time through ... */
- pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
- pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
- SpinLockInit(&pgss->mutex);
- pgss->extent = 0;
- pgss->n_writers = 0;
- pgss->gc_count = 0;
- pgss->stats.dealloc = 0;
- pgss->stats.stats_reset = GetCurrentTimestamp();
- }
-
- info.keysize = sizeof(pgssHashKey);
- info.entrysize = sizeof(pgssEntry);
- pgss_hash = ShmemInitHash("pg_stat_statements hash",
- pgss_max, pgss_max,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- LWLockRelease(AddinShmemInitLock);
-
- /*
- * If we're in the postmaster (or a standalone backend...), set up a shmem
- * exit hook to dump the statistics to disk.
- */
- if (!IsUnderPostmaster)
- on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
-
- /*
- * Done if some other process already completed our initialization.
- */
- if (found)
- return;
+ on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
/*
* Note: we don't bother with locks here, because there should be no other
@@ -1337,7 +1326,7 @@ pgss_store(const char *query, int64 queryId,
key.toplevel = (nesting_level == 0);
/* Lookup the hash table entry with shared lock. */
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
@@ -1358,11 +1347,11 @@ pgss_store(const char *query, int64 queryId,
*/
if (jstate)
{
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
norm_query = generate_normalized_query(jstate, query,
query_location,
&query_len);
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
}
/* Append new query text to file with only shared lock held */
@@ -1377,8 +1366,8 @@ pgss_store(const char *query, int64 queryId,
do_gc = need_gc_qtexts();
/* Need exclusive lock to make a new hashtable entry - promote */
- LWLockRelease(pgss->lock);
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockRelease(&pgss->lock.lock);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
/*
* A garbage collection may have occurred while we weren't holding the
@@ -1517,7 +1506,7 @@ pgss_store(const char *query, int64 queryId,
}
done:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
/* We postpone this clean-up until we're out of the lock */
if (norm_query)
@@ -1806,7 +1795,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
* we need to partition the hash table to limit the time spent holding any
* one lock.
*/
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
if (showtext)
{
@@ -2043,7 +2032,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
}
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
free(qbuffer);
}
@@ -2082,20 +2071,6 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
-/*
- * Estimate shared memory space needed.
- */
-static Size
-pgss_memsize(void)
-{
- Size size;
-
- size = MAXALIGN(sizeof(pgssSharedState));
- size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
-
- return size;
-}
-
/*
* Allocate a new hashtable entry.
* caller must hold an exclusive lock on pgss->lock
@@ -2725,7 +2700,7 @@ entry_reset(Oid userid, Oid dbid, int64 queryid, bool minmax_only)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("pg_stat_statements must be loaded via \"shared_preload_libraries\"")));
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
num_entries = hash_get_num_entries(pgss_hash);
stats_reset = GetCurrentTimestamp();
@@ -2819,7 +2794,7 @@ done:
record_gc_qtexts();
release_lock:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
return stats_reset;
}
--
2.47.3
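
As an aside on the LWLock change in this patch: pgssSharedState used to hold a pointer to a lock living in a separately requested tranche, and now embeds a padded lock directly, which is why every call site becomes &pgss->lock.lock. The sketch below is a standalone toy model of that layout, assuming a padded-union shape similar to LWLockPadded; the toy_* names and the 128-byte pad size are assumptions for illustration, and the "lock" here does no real locking.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of an embedded, padded lock; not PostgreSQL code. */
#define TOY_CACHE_LINE_SIZE 128

typedef struct toy_lock
{
	int			tranche_id;		/* which tranche the lock belongs to */
	int			held;			/* 1 while "acquired" in this toy model */
} toy_lock;

/* Pads the lock out to a fixed size, in the style of a padded union. */
typedef union toy_lock_padded
{
	toy_lock	lock;
	char		pad[TOY_CACHE_LINE_SIZE];
} toy_lock_padded;

/* The lock is part of the shared struct itself, not a pointer elsewhere. */
typedef struct toy_shared_state
{
	toy_lock_padded lock;
	double		cur_median_usage;
} toy_shared_state;

static void
toy_lock_initialize(toy_lock *lock, int tranche_id)
{
	lock->tranche_id = tranche_id;
	lock->held = 0;
}

static void
toy_lock_acquire(toy_lock *lock)
{
	assert(!lock->held);
	lock->held = 1;
}

static void
toy_lock_release(toy_lock *lock)
{
	assert(lock->held);
	lock->held = 0;
}

/* Initializes and uses the embedded lock; returns its tranche id. */
static int
toy_lock_demo(void)
{
	static toy_shared_state state;

	/* Done once, from what would be the init callback. */
	toy_lock_initialize(&state.lock.lock, 7);

	/* Callers reach the inner lock through the padded wrapper. */
	toy_lock_acquire(&state.lock.lock);
	state.cur_median_usage = 1.0;
	toy_lock_release(&state.lock.lock);

	return state.lock.lock.tranche_id;
}
```

With the lock embedded, a process that has a pointer to the shared struct needs no separate lookup to find the lock, which fits the patch dropping the RequestNamedLWLockTranche() / GetNamedLWLockTranche() pair.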
[text/x-patch] v3-0003-Use-the-new-mechanism-in-a-few-core-subsystems.patch (34.5K, 4-v3-0003-Use-the-new-mechanism-in-a-few-core-subsystems.patch)
From df750303fe44343621956fc6e414240690bfcfca Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 13 Mar 2026 13:00:34 +0200
Subject: [PATCH v3 3/3] Use the new mechanism in a few core subsystems
I chose these subsystems specifically because they have some
complicating properties, making them slightly harder to convert than
most:
- The initialization callbacks of some of these subsystems have
dependencies, i.e. they need to be initialized in the right order.
- The ProcGlobal pointer still needs to be inherited by the
BackendParameters mechanism on EXEC_BACKEND builds, because
ProcGlobal is required by InitProcess() to get a PGPROC entry, and
the PGPROC entry is required to use LWLocks, and usually attaching
to shared memory areas requires the use of LWLocks.
- Similarly, the ProcSignal pointer still needs to be handled by
BackendParameters, because query cancellation connections access it
without calling InitProcess().
I believe converting the rest of the subsystems after this will
be pretty mechanical.
---
src/backend/access/transam/varsup.c | 33 +++--
src/backend/storage/ipc/dsm.c | 45 ++++---
src/backend/storage/ipc/dsm_registry.c | 34 +++---
src/backend/storage/ipc/ipci.c | 37 +++---
src/backend/storage/ipc/pmsignal.c | 54 +++++----
src/backend/storage/ipc/procarray.c | 126 +++++++++----------
src/backend/storage/ipc/procsignal.c | 63 +++++-----
src/backend/storage/ipc/shmem.c | 4 +-
src/backend/storage/ipc/sinvaladt.c | 37 +++---
src/backend/storage/lmgr/proc.c | 162 +++++++++++++------------
src/include/access/transam.h | 10 +-
src/include/storage/dsm_registry.h | 3 +-
src/include/storage/pmsignal.h | 3 +-
src/include/storage/proc.h | 2 +-
src/include/storage/procarray.h | 3 +-
src/include/storage/procsignal.h | 3 +-
src/include/storage/sinvaladt.h | 3 +-
17 files changed, 312 insertions(+), 310 deletions(-)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 3e95d4cfd16..3dfda875e80 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,35 +30,32 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+static void VarsupShmemInit(void *arg);
+
/* pointer to variables struct in shared memory */
TransamVariablesData *TransamVariables = NULL;
+ShmemStructDesc TransamVariablesShmemDesc = {
+ .name = "TransamVariables",
+ .size = sizeof(TransamVariablesData),
+ .init_fn = VarsupShmemInit,
+ .ptr = (void **) &TransamVariables,
+};
/*
* Initialization of shared memory for TransamVariables.
*/
-Size
-VarsupShmemSize(void)
+void
+VarsupShmemRegister(void)
{
- return sizeof(TransamVariablesData);
+ ShmemRegisterStruct(&TransamVariablesShmemDesc);
}
-void
-VarsupShmemInit(void)
-{
- bool found;
+static void
+VarsupShmemInit(void *arg)
- /* Initialize our shared state struct */
- TransamVariables = ShmemInitStruct("TransamVariables",
- sizeof(TransamVariablesData),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- memset(TransamVariables, 0, sizeof(TransamVariablesData));
- }
- else
- Assert(found);
+{
+ memset(TransamVariables, 0, sizeof(TransamVariablesData));
}
/*
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 6a5b16392f7..73644ec3bbb 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -108,8 +108,17 @@ static inline bool is_main_region_dsm_handle(dsm_handle handle);
static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
+static void dsm_main_space_init(void *);
+
static void *dsm_main_space_begin = NULL;
+static ShmemStructDesc dsm_main_space_shmem_desc = {
+ .name = "Preallocated DSM",
+ .size = 0, /* dynamic */
+ .init_fn = dsm_main_space_init,
+ .ptr = &dsm_main_space_begin,
+};
+
/*
* List of dynamic shared memory segments used by this backend.
*
@@ -479,27 +488,29 @@ void
dsm_shmem_init(void)
{
size_t size = dsm_estimate_size();
- bool found;
if (size == 0)
return;
- dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, &found);
- if (!found)
- {
- FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
- size_t first_page = 0;
- size_t pages;
-
- /* Reserve space for the FreePageManager. */
- while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
- ++first_page;
-
- /* Initialize it and give it all the rest of the space. */
- FreePageManagerInitialize(fpm, dsm_main_space_begin);
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ ShmemRegisterStruct(&dsm_main_space_shmem_desc);
+}
+
+static void
+dsm_main_space_init(void *arg)
+{
+ size_t size = dsm_main_space_shmem_desc.size;
+ FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
+ size_t first_page = 0;
+ size_t pages;
+
+ /* Reserve space for the FreePageManager. */
+ while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
+ ++first_page;
+
+ /* Initialize it and give it all the rest of the space. */
+ FreePageManagerInitialize(fpm, dsm_main_space_begin);
+ pages = (size / FPM_PAGE_SIZE) - first_page;
+ FreePageManagerPut(fpm, first_page, pages);
}
/*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 068c1577b12..cc20223b94b 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -54,8 +54,18 @@ typedef struct DSMRegistryCtxStruct
dshash_table_handle dshh;
} DSMRegistryCtxStruct;
+static void DSMRegistryCtxShmemInit(void *arg);
+
static DSMRegistryCtxStruct *DSMRegistryCtx;
+static ShmemStructDesc DSMRegistryCtxShmemDesc = {
+ .name = "DSM Registry Data",
+ .size = sizeof(DSMRegistryCtxStruct),
+ .init_fn = DSMRegistryCtxShmemInit,
+ .ptr = (void **) &DSMRegistryCtx,
+};
+
+
typedef struct NamedDSMState
{
dsm_handle handle;
@@ -113,27 +123,17 @@ static const dshash_parameters dsh_params = {
static dsa_area *dsm_registry_dsa;
static dshash_table *dsm_registry_table;
-Size
-DSMRegistryShmemSize(void)
+void
+DSMRegistryShmemRegister(void)
{
- return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+ ShmemRegisterStruct(&DSMRegistryCtxShmemDesc);
}
-void
-DSMRegistryShmemInit(void)
+static void
+DSMRegistryCtxShmemInit(void *arg)
{
- bool found;
-
- DSMRegistryCtx = (DSMRegistryCtxStruct *)
- ShmemInitStruct("DSM Registry Data",
- DSMRegistryShmemSize(),
- &found);
-
- if (!found)
- {
- DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
- DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
- }
+ DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+ DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 7c653e27370..c1ee422349d 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -102,15 +102,14 @@ CalculateShmemSize(void)
size = add_size(size, ShmemRegisteredSize());
size = add_size(size, dsm_estimate_size());
- size = add_size(size, DSMRegistryShmemSize());
+
+ size = add_size(size, ShmemRegisteredSize());
/* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
- size = add_size(size, ProcGlobalShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, VarsupShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
size = add_size(size, CLOGShmemSize());
@@ -120,11 +119,7 @@ CalculateShmemSize(void)
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
size = add_size(size, LWLockShmemSize());
- size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, SharedInvalShmemSize());
- size = add_size(size, PMSignalShmemSize());
- size = add_size(size, ProcSignalShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
size = add_size(size, ReplicationSlotsShmemSize());
@@ -246,11 +241,18 @@ void
RegisterShmemStructs(void)
{
/*
- * TODO: Not used in any built-in subsystems yet. In the future, most of
- * the calls *ShmemInit() calls in CreateOrAttachShmemStructs(), and
- * *ShmemSize() calls in CalculateShmemSize() will be replaced by calls
- * into the subsystems from here.
+ * TODO: In the future, most of the *ShmemInit() calls in
+ * CreateOrAttachShmemStructs(), and *ShmemSize() calls in
+ * CalculateShmemSize() will be replaced by calls into the subsystems from
+ * here.
*/
+ DSMRegistryShmemRegister();
+ ProcGlobalShmemRegister();
+ VarsupShmemRegister();
+ ProcArrayShmemRegister();
+ SharedInvalShmemRegister();
+ PMSignalShmemRegister();
+ ProcSignalShmemRegister();
}
/*
@@ -294,13 +296,12 @@ CreateOrAttachShmemStructs(void)
}
dsm_shmem_init();
- DSMRegistryShmemInit();
/*
* Set up xlog, clog, and buffers
*/
- VarsupShmemInit();
XLOGShmemInit();
+
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
CLOGShmemInit();
@@ -322,23 +323,13 @@ CreateOrAttachShmemStructs(void)
/*
* Set up process table
*/
- if (!IsUnderPostmaster)
- InitProcGlobal();
- ProcArrayShmemInit();
BackendStatusShmemInit();
TwoPhaseShmemInit();
BackgroundWorkerShmemInit();
- /*
- * Set up shared-inval messaging
- */
- SharedInvalShmemInit();
-
/*
* Set up interprocess signaling mechanisms
*/
- PMSignalShmemInit();
- ProcSignalShmemInit();
CheckpointerShmemInit();
AutoVacuumShmemInit();
ReplicationSlotsShmemInit();
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 4618820b337..c840d5a8fb8 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -80,9 +80,20 @@ struct PMSignalData
sig_atomic_t PMChildFlags[FLEXIBLE_ARRAY_MEMBER];
};
-/* PMSignalState pointer is valid in both postmaster and child processes */
+static void PMSignalShmemInit(void *);
+
+/*
+ * PMSignalState pointer is valid in both postmaster and child processes
+ */
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+static ShmemStructDesc PMSignalShmemDesc = {
+ .name = "PMSignalState",
+ .size = 0, /* dynamic */
+ .init_fn = PMSignalShmemInit,
+ .ptr = (void **) &PMSignalState,
+};
+
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
* postmaster. Postmaster keeps a local copy so that it doesn't need to
@@ -123,39 +134,29 @@ postmaster_death_handler(SIGNAL_ARGS)
static void MarkPostmasterChildInactive(int code, Datum arg);
/*
- * PMSignalShmemSize
- * Compute space needed for pmsignal.c's shared memory
+ * PMSignalShmemRegister - Register our shared memory
*/
-Size
-PMSignalShmemSize(void)
+void
+PMSignalShmemRegister(void)
{
Size size;
- size = offsetof(PMSignalData, PMChildFlags);
- size = add_size(size, mul_size(MaxLivePostmasterChildren(),
- sizeof(sig_atomic_t)));
+ num_child_flags = MaxLivePostmasterChildren();
- return size;
+ size = offsetof(PMSignalData, PMChildFlags);
+ size = add_size(size, mul_size(num_child_flags, sizeof(sig_atomic_t)));
+ PMSignalShmemDesc.size = size;
+ ShmemRegisterStruct(&PMSignalShmemDesc);
}
-/*
- * PMSignalShmemInit - initialize during shared-memory creation
- */
-void
-PMSignalShmemInit(void)
+static void
+PMSignalShmemInit(void *arg)
{
- bool found;
-
- PMSignalState = (PMSignalData *)
- ShmemInitStruct("PMSignalState", PMSignalShmemSize(), &found);
-
- if (!found)
- {
- /* initialize all flags to zeroes */
- MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemSize());
- num_child_flags = MaxLivePostmasterChildren();
- PMSignalState->num_child_flags = num_child_flags;
- }
+ /* initialize all flags to zeroes */
+ Assert(PMSignalState);
+ MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemDesc.size);
+ Assert(num_child_flags > 0);
+ PMSignalState->num_child_flags = num_child_flags;
}
/*
@@ -291,6 +292,7 @@ RegisterPostmasterChildActive(void)
{
int slot = MyPMChildSlot;
+ Assert(PMSignalState);
Assert(slot > 0 && slot <= PMSignalState->num_child_flags);
slot--;
Assert(PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED);
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 0f913897acc..abf1519fcba 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -103,6 +103,19 @@ typedef struct ProcArrayStruct
int pgprocnos[FLEXIBLE_ARRAY_MEMBER];
} ProcArrayStruct;
+static void ProcArrayShmemInit(void *arg);
+static void ProcArrayShmemAttach(void *arg);
+
+static ProcArrayStruct *procArray;
+
+static ShmemStructDesc ProcArrayShmemDesc = {
+ .name = "Proc Array",
+ .size = 0, /* dynamic */
+ .init_fn = ProcArrayShmemInit,
+ .attach_fn = ProcArrayShmemAttach,
+ .ptr = (void **) &procArray,
+};
+
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -269,9 +282,6 @@ typedef enum KAXCompressReason
KAX_STARTUP_PROCESS_IDLE, /* startup process is about to sleep */
} KAXCompressReason;
-
-static ProcArrayStruct *procArray;
-
static PGPROC *allProcs;
/*
@@ -282,8 +292,25 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
+
static TransactionId *KnownAssignedXids;
+
+static ShmemStructDesc KnownAssignedXidsShmemDesc = {
+ .name = "KnownAssignedXids",
+ .size = 0, /* dynamic */
+ .init_fn = NULL,
+ .ptr = (void **) &KnownAssignedXids,
+};
+
static bool *KnownAssignedXidsValid;
+
+static ShmemStructDesc KnownAssignedXidsValidShmemDesc = {
+ .name = "KnownAssignedXidsValid",
+ .size = 0, /* dynamic */
+ .init_fn = NULL,
+ .ptr = (void **) &KnownAssignedXidsValid,
+};
+
static TransactionId latestObservedXid = InvalidTransactionId;
/*
@@ -374,18 +401,19 @@ static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
static void GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons);
/*
- * Report shared-memory space needed by ProcArrayShmemInit
+ * Register the shared PGPROC array during postmaster startup.
*/
-Size
-ProcArrayShmemSize(void)
+void
+ProcArrayShmemRegister(void)
{
- Size size;
-
- /* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, pgprocnos);
- size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
+ /* Create or attach to the ProcArray shared structure */
+ ProcArrayShmemDesc.size =
+ add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int),
+ PROCARRAY_MAXPROCS));
+ ShmemRegisterStruct(&ProcArrayShmemDesc);
/*
* During Hot Standby processing we have a data structure called
@@ -405,64 +433,38 @@ ProcArrayShmemSize(void)
if (EnableHotStandby)
{
- size = add_size(size,
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS));
- size = add_size(size,
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS));
+ KnownAssignedXidsShmemDesc.size =
+ mul_size(sizeof(TransactionId),
+ TOTAL_MAX_CACHED_SUBXIDS);
+ ShmemRegisterStruct(&KnownAssignedXidsShmemDesc);
+
+ KnownAssignedXidsValidShmemDesc.size =
+ mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS);
+ ShmemRegisterStruct(&KnownAssignedXidsValidShmemDesc);
}
-
- return size;
}
-/*
- * Initialize the shared PGPROC array during postmaster startup.
- */
-void
-ProcArrayShmemInit(void)
+static void
+ProcArrayShmemInit(void *arg)
{
- bool found;
-
- /* Create or attach to the ProcArray shared structure */
- procArray = (ProcArrayStruct *)
- ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, pgprocnos),
- mul_size(sizeof(int),
- PROCARRAY_MAXPROCS)),
- &found);
-
- if (!found)
- {
- /*
- * We're the first - initialize.
- */
- procArray->numProcs = 0;
- procArray->maxProcs = PROCARRAY_MAXPROCS;
- procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
- procArray->numKnownAssignedXids = 0;
- procArray->tailKnownAssignedXids = 0;
- procArray->headKnownAssignedXids = 0;
- procArray->lastOverflowedXid = InvalidTransactionId;
- procArray->replication_slot_xmin = InvalidTransactionId;
- procArray->replication_slot_catalog_xmin = InvalidTransactionId;
- TransamVariables->xactCompletionCount = 1;
- }
+ procArray->numProcs = 0;
+ procArray->maxProcs = PROCARRAY_MAXPROCS;
+ procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
+ procArray->numKnownAssignedXids = 0;
+ procArray->tailKnownAssignedXids = 0;
+ procArray->headKnownAssignedXids = 0;
+ procArray->lastOverflowedXid = InvalidTransactionId;
+ procArray->replication_slot_xmin = InvalidTransactionId;
+ procArray->replication_slot_catalog_xmin = InvalidTransactionId;
+ TransamVariables->xactCompletionCount = 1;
allProcs = ProcGlobal->allProcs;
+}
- /* Create or attach to the KnownAssignedXids arrays too, if needed */
- if (EnableHotStandby)
- {
- KnownAssignedXids = (TransactionId *)
- ShmemInitStruct("KnownAssignedXids",
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- KnownAssignedXidsValid = (bool *)
- ShmemInitStruct("KnownAssignedXidsValid",
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- }
+static void
+ProcArrayShmemAttach(void *arg)
+{
+ allProcs = ProcGlobal->allProcs;
}
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7e017c8d53b..ae1c2888b56 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -105,7 +105,18 @@ struct ProcSignalHeader
#define BARRIER_CLEAR_BIT(flags, type) \
((flags) &= ~(((uint32) 1) << (uint32) (type)))
+static void ProcSignalShmemInit(void *arg);
+
NON_EXEC_STATIC ProcSignalHeader *ProcSignal = NULL;
+
+static ShmemStructDesc ProcSignalShmemDesc = {
+ .name = "ProcSignal",
+ .size = 0, /* dynamic */
+ .init_fn = ProcSignalShmemInit,
+ .ptr = (void **) &ProcSignal,
+};
+
+
static ProcSignalSlot *MyProcSignalSlot = NULL;
static bool CheckProcSignal(ProcSignalReason reason);
@@ -113,51 +124,37 @@ static void CleanupProcSignalState(int status, Datum arg);
static void ResetProcSignalBarrierBits(uint32 flags);
/*
- * ProcSignalShmemSize
- * Compute space needed for ProcSignal's shared memory
+ * ProcSignalShmemRegister
+ * Register ProcSignal's shared memory needs at postmaster startup
*/
-Size
-ProcSignalShmemSize(void)
+void
+ProcSignalShmemRegister(void)
{
Size size;
size = mul_size(NumProcSignalSlots, sizeof(ProcSignalSlot));
size = add_size(size, offsetof(ProcSignalHeader, psh_slot));
- return size;
+
+ ProcSignalShmemDesc.size = size;
+ ShmemRegisterStruct(&ProcSignalShmemDesc);
}
-/*
- * ProcSignalShmemInit
- * Allocate and initialize ProcSignal's shared memory
- */
-void
-ProcSignalShmemInit(void)
+static void
+ProcSignalShmemInit(void *arg)
{
- Size size = ProcSignalShmemSize();
- bool found;
+ pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
- ProcSignal = (ProcSignalHeader *)
- ShmemInitStruct("ProcSignal", size, &found);
-
- /* If we're first, initialize. */
- if (!found)
+ for (int i = 0; i < NumProcSignalSlots; ++i)
{
- int i;
-
- pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
+ ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
- for (i = 0; i < NumProcSignalSlots; ++i)
- {
- ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
-
- SpinLockInit(&slot->pss_mutex);
- pg_atomic_init_u32(&slot->pss_pid, 0);
- slot->pss_cancel_key_len = 0;
- MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
- pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
- pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
- ConditionVariableInit(&slot->pss_barrierCV);
- }
+ SpinLockInit(&slot->pss_mutex);
+ pg_atomic_init_u32(&slot->pss_pid, 0);
+ slot->pss_cancel_key_len = 0;
+ MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
+ pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
+ pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
+ ConditionVariableInit(&slot->pss_barrierCV);
}
}
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 2618cb927c6..1e17663a0dd 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -241,7 +241,7 @@ ShmemRegisterStruct(ShmemStructDesc *desc)
}
/* desc->ptr can be non-NULL when re-initializing after crash */
- if (desc->ptr)
+ if (!IsUnderPostmaster && desc->ptr)
*desc->ptr = NULL;
/* Add the descriptor to the array, growing the array if needed */
@@ -319,7 +319,7 @@ ShmemRegisterStruct(ShmemStructDesc *desc)
if (desc->ptr)
*desc->ptr = index_entry->location;
- /* XXX: if this errors out, the areas is left in a half-initialized state */
+ /* XXX: if this errors out, the area is left in a half-initialized state */
if (desc->init_fn)
desc->init_fn(desc->init_fn_arg);
}
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index a7a7cc4f0a9..61c4e3aa93e 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -203,8 +203,17 @@ typedef struct SISeg
*/
#define NumProcStateSlots (MaxBackends + NUM_AUXILIARY_PROCS)
+static void SharedInvalShmemInit(void *arg);
+
static SISeg *shmInvalBuffer; /* pointer to the shared inval buffer */
+static ShmemStructDesc SharedInvalShmemDesc = {
+ .name = "shmInvalBuffer",
+ .size = 0, /* dynamic */
+ .init_fn = SharedInvalShmemInit,
+ .ptr = (void **) &shmInvalBuffer,
+};
+
static LocalTransactionId nextLocalTransactionId;
@@ -212,10 +221,11 @@ static void CleanupInvalidationState(int status, Datum arg);
/*
- * SharedInvalShmemSize --- return shared-memory space needed
+ * SharedInvalShmemRegister
+ * Register shared memory needs for the SI message buffer
*/
-Size
-SharedInvalShmemSize(void)
+void
+SharedInvalShmemRegister(void)
{
Size size;
@@ -223,26 +233,17 @@ SharedInvalShmemSize(void)
size = add_size(size, mul_size(sizeof(ProcState), NumProcStateSlots)); /* procState */
size = add_size(size, mul_size(sizeof(int), NumProcStateSlots)); /* pgprocnos */
- return size;
+ /* Allocate space in shared memory */
+ SharedInvalShmemDesc.size = size;
+ ShmemRegisterStruct(&SharedInvalShmemDesc);
}
-/*
- * SharedInvalShmemInit
- * Create and initialize the SI message buffer
- */
-void
-SharedInvalShmemInit(void)
+static void
+SharedInvalShmemInit(void *arg)
{
int i;
- bool found;
-
- /* Allocate space in shared memory */
- shmInvalBuffer = (SISeg *)
- ShmemInitStruct("shmInvalBuffer", SharedInvalShmemSize(), &found);
- if (found)
- return;
- /* Clear message counters, save size of procState array, init spinlock */
+ /* Clear message counters, save size of procState array FIXME, init spinlock */
shmInvalBuffer->minMsgNum = 0;
shmInvalBuffer->maxMsgNum = 0;
shmInvalBuffer->nextThreshold = CLEANUP_MIN;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index d407725e602..c3a65650cde 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -67,11 +67,37 @@ bool log_lock_waits = true;
/* Pointer to this process's PGPROC struct, if any */
PGPROC *MyProc = NULL;
+static void ProcGlobalShmemInit(void *arg);
+
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
+static void *tmpAllProcs;
+static void *tmpFastPathLockArray;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
PGPROC *PreparedXactProcs = NULL;
+
+static ShmemStructDesc ProcGlobalShmemDesc = {
+ .name = "Proc Header",
+ .size = sizeof(PROC_HDR),
+ .init_fn = ProcGlobalShmemInit,
+ .ptr = (void **) &ProcGlobal,
+};
+
+static ShmemStructDesc ProcGlobalAllProcsShmemDesc = {
+ .name = "PGPROC structures",
+ .size = 0, /* dynamic */
+ .ptr = (void **) &tmpAllProcs,
+};
+
+static ShmemStructDesc FastPathLockArrayShmemDesc = {
+ .name = "Fast-Path Lock Array",
+ .size = 0, /* dynamic */
+ .ptr = (void **) &tmpFastPathLockArray,
+};
+
+static uint32 TotalProcs;
+
/* Is a deadlock check pending? */
static volatile sig_atomic_t got_deadlock_timeout;
@@ -81,24 +107,6 @@ static void AuxiliaryProcKill(int code, Datum arg);
static DeadLockState CheckDeadLock(void);
-/*
- * Report shared-memory space needed by PGPROC.
- */
-static Size
-PGProcShmemSize(void)
-{
- Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
-
- size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
-
- return size;
-}
-
/*
* Report shared-memory space needed by Fast-Path locks.
*/
@@ -106,8 +114,6 @@ static Size
FastPathLockShmemSize(void)
{
Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
Size fpLockBitsSize,
fpRelIdSize;
@@ -123,25 +129,6 @@ FastPathLockShmemSize(void)
return size;
}
-/*
- * Report shared-memory space needed by InitProcGlobal.
- */
-Size
-ProcGlobalShmemSize(void)
-{
- Size size = 0;
-
- /* ProcGlobal */
- size = add_size(size, sizeof(PROC_HDR));
- size = add_size(size, sizeof(slock_t));
-
- size = add_size(size, PGSemaphoreShmemSize(ProcGlobalSemas()));
- size = add_size(size, PGProcShmemSize());
- size = add_size(size, FastPathLockShmemSize());
-
- return size;
-}
-
/*
* Report number of semaphores needed by InitProcGlobal.
*/
@@ -176,35 +163,68 @@ ProcGlobalSemas(void)
* implementation typically requires us to create semaphores in the
* postmaster, not in backends.
*
- * Note: this is NOT called by individual backends under a postmaster,
+ * Note: this is NOT called by individual backends under a postmaster, XXX
* not even in the EXEC_BACKEND case. The ProcGlobal and AuxiliaryProcs
* pointers must be propagated specially for EXEC_BACKEND operation.
*/
void
-InitProcGlobal(void)
+ProcGlobalShmemRegister(void)
{
+ Size size = 0;
+
+ /*
+ * Reserve all the PGPROC structures we'll need. There are
+ * six separate consumers: (1) normal backends, (2) autovacuum workers and
+ * special workers, (3) background workers, (4) walsenders, (5) auxiliary
+ * processes, and (6) prepared transactions. (For largely-historical
+ * reasons, we combine autovacuum and special workers into one category
+ * with a single freelist.) Each PGPROC structure is dedicated to exactly
+ * one of these purposes, and they do not move between groups.
+ */
+ TotalProcs =
+ add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
+
+ size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
+
+ /* FIXME: the sizeofs look dangerous because ProcGlobal is not initialized yet */
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
+
+ ProcGlobalAllProcsShmemDesc.size = size;
+ ShmemRegisterStruct(&ProcGlobalAllProcsShmemDesc);
+
+ FastPathLockArrayShmemDesc.size = FastPathLockShmemSize();
+ ShmemRegisterStruct(&FastPathLockArrayShmemDesc);
+
+ /*
+ * Create the ProcGlobal shared structure last. Its init callback
+ * initializes the others too.
+ */
+ ShmemRegisterStruct(&ProcGlobalShmemDesc);
+
+ if (IsUnderPostmaster)
+ {
+ Assert(ProcGlobal != NULL);
+ AuxiliaryProcs = &ProcGlobal->allProcs[MaxBackends];
+ }
+}
+
+static void
+ProcGlobalShmemInit(void *arg)
+{
+ char *ptr;
+ size_t requestSize;
PGPROC *procs;
int i,
j;
- bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
-
/* Used for setup of per-backend fast-path slots. */
char *fpPtr,
*fpEndPtr PG_USED_FOR_ASSERTS_ONLY;
Size fpLockBitsSize,
fpRelIdSize;
- Size requestSize;
- char *ptr;
-
- /* Create the ProcGlobal shared structure */
- ProcGlobal = (PROC_HDR *)
- ShmemInitStruct("Proc Header", sizeof(PROC_HDR), &found);
- Assert(!found);
- /*
- * Initialize the data structures.
- */
+ Assert(ProcGlobal);
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
SpinLockInit(&ProcGlobal->freeProcsLock);
dlist_init(&ProcGlobal->freeProcs);
@@ -217,22 +237,10 @@ InitProcGlobal(void)
pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PROC_NUMBER);
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
- /*
- * Create and initialize all the PGPROC structures we'll need. There are
- * six separate consumers: (1) normal backends, (2) autovacuum workers and
- * special workers, (3) background workers, (4) walsenders, (5) auxiliary
- * processes, and (6) prepared transactions. (For largely-historical
- * reasons, we combine autovacuum and special workers into one category
- * with a single freelist.) Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
- */
- requestSize = PGProcShmemSize();
-
- ptr = ShmemInitStruct("PGPROC structures",
- requestSize,
- &found);
-
- MemSet(ptr, 0, requestSize);
+ Assert(tmpAllProcs);
+ ptr = tmpAllProcs;
+ requestSize = ProcGlobalAllProcsShmemDesc.size;
+ memset(ptr, 0, requestSize);
procs = (PGPROC *) ptr;
ptr = ptr + TotalProcs * sizeof(PGPROC);
@@ -268,20 +276,14 @@ InitProcGlobal(void)
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- requestSize = FastPathLockShmemSize();
-
- fpPtr = ShmemInitStruct("Fast-Path Lock Array",
- requestSize,
- &found);
-
- MemSet(fpPtr, 0, requestSize);
+ Assert(tmpFastPathLockArray);
+ fpPtr = tmpFastPathLockArray;
+ requestSize = FastPathLockArrayShmemDesc.size;
+ memset(fpPtr, 0, requestSize);
/* For asserts checking we did not overflow. */
fpEndPtr = fpPtr + requestSize;
- /* Reserve space for semaphores. */
- PGReserveSemaphores(ProcGlobalSemas());
-
for (i = 0; i < TotalProcs; i++)
{
PGPROC *proc = &procs[i];
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6fa91bfcdc0..6e5a546f411 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -15,7 +15,9 @@
#define TRANSAM_H
#include "access/xlogdefs.h"
-
+#ifndef FRONTEND
+#include "storage/shmem.h"
+#endif
/* ----------------
* Special transaction ID values
@@ -330,7 +332,10 @@ TransactionIdFollowsOrEquals(TransactionId id1, TransactionId id2)
extern bool TransactionStartedDuringRecovery(void);
/* in transam/varsup.c */
+#ifndef FRONTEND
+extern PGDLLIMPORT struct ShmemStructDesc TransamVariablesShmemDesc;
extern PGDLLIMPORT TransamVariablesData *TransamVariables;
+#endif
/*
* prototypes for functions in transam/transam.c
@@ -345,8 +350,7 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
-extern Size VarsupShmemSize(void);
-extern void VarsupShmemInit(void);
+extern void VarsupShmemRegister(void);
extern FullTransactionId GetNewTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
diff --git a/src/include/storage/dsm_registry.h b/src/include/storage/dsm_registry.h
index 506fae2c9ca..9a1b4d982af 100644
--- a/src/include/storage/dsm_registry.h
+++ b/src/include/storage/dsm_registry.h
@@ -22,7 +22,6 @@ extern dsa_area *GetNamedDSA(const char *name, bool *found);
extern dshash_table *GetNamedDSHash(const char *name,
const dshash_parameters *params,
bool *found);
-extern Size DSMRegistryShmemSize(void);
-extern void DSMRegistryShmemInit(void);
+extern void DSMRegistryShmemRegister(void);
#endif /* DSM_REGISTRY_H */
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index 206fb78f8a5..7cdc4852334 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -66,8 +66,7 @@ extern PGDLLIMPORT volatile PMSignalData *PMSignalState;
/*
* prototypes for functions in pmsignal.c
*/
-extern Size PMSignalShmemSize(void);
-extern void PMSignalShmemInit(void);
+extern void PMSignalShmemRegister(void);
extern void SendPostmasterSignal(PMSignalReason reason);
extern bool CheckPostmasterSignal(PMSignalReason reason);
extern void SetQuitSignalReason(QuitSignalReason reason);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 3f89450c216..1d1e0881af2 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -552,7 +552,7 @@ extern PGDLLIMPORT PGPROC *AuxiliaryProcs;
* Function Prototypes
*/
extern int ProcGlobalSemas(void);
-extern Size ProcGlobalShmemSize(void);
+extern void ProcGlobalShmemRegister(void);
extern void InitProcGlobal(void);
extern void InitProcess(void);
extern void InitProcessPhase2(void);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index c5ab1574fe3..572516c4e21 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -20,8 +20,7 @@
#include "utils/snapshot.h"
-extern Size ProcArrayShmemSize(void);
-extern void ProcArrayShmemInit(void);
+extern void ProcArrayShmemRegister(void);
extern void ProcArrayAdd(PGPROC *proc);
extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 348fba53a93..d2344b1cbb3 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -63,8 +63,7 @@ typedef enum
/*
* prototypes for functions in procsignal.c
*/
-extern Size ProcSignalShmemSize(void);
-extern void ProcSignalShmemInit(void);
+extern void ProcSignalShmemRegister(void);
extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
diff --git a/src/include/storage/sinvaladt.h b/src/include/storage/sinvaladt.h
index a1694500a85..4edba2936e6 100644
--- a/src/include/storage/sinvaladt.h
+++ b/src/include/storage/sinvaladt.h
@@ -28,8 +28,7 @@
/*
* prototypes for functions in sinvaladt.c
*/
-extern Size SharedInvalShmemSize(void);
-extern void SharedInvalShmemInit(void);
+extern void SharedInvalShmemRegister(void);
extern void SharedInvalBackendInit(bool sendOnly);
extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
--
2.47.3
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-13 21:09 Heikki Linnakangas <[email protected]>
parent: Ashutosh Bapat <[email protected]>
1 sibling, 2 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-03-13 21:09 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; +Cc: pgsql-hackers; [email protected]
On 06/03/2026 16:12, Heikki Linnakangas wrote:
> Firstly, I'm not sure what to do with ShmemRegisterHash() and the
> 'HASHCTL *infoP' argument to it. I feel it'd be nicer if the HASHCTL was
> just part of the ShmemHashDesc struct, but I'm not sure if that fits all
> the callers. I'll have to try that out I guess.
I took a stab at that, and it turned out to be straightforward. I'm not
sure why I hesitated on that earlier.
Here's a new version with that change, and a ton of little comment
cleanups and such.
- Heikki
Attachments:
[text/x-patch] v4-0001-Introduce-a-new-mechanism-for-registering-shared-.patch (54.0K, 2-v4-0001-Introduce-a-new-mechanism-for-registering-shared-.patch)
From bf7f1141d02654c80bde1039aa8180f6a961d61c Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 13 Mar 2026 22:59:53 +0200
Subject: [PATCH v4 1/3] Introduce a new mechanism for registering shared
memory areas
Each shared memory area is registered with a "descriptor struct" that
contains parameters like name and size of the area. The descriptor
struct makes it easier to add optional fields in the future; the
additional fields can just be left as zeros.
This merges the separate [Subsystem]ShmemSize() and
[Subsystem]ShmemInit() phases at postmaster startup. Each subsystem is
now called into just once, before the shared memory segment has been
allocated, to register the subsystem's shared memory needs. The
registration includes the size, which replaces the
[Subsystem]ShmemSize() calls, and a pointer to an initialization
callback function, which replaces the [Subsystem]ShmemInit()
calls. This is more ergonomic, as you only need to calculate the size
once, when you register the struct.
This replaces ShmemInitStruct() and ShmemInitHash(), which become just
backwards-compatibility wrappers around the new functions. In future
commits, I plan to replace all ShmemInitStruct() and ShmemInitHash()
calls with the new functions, although we'll still need to keep them
around for extensions.
---
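Not part of the patch: a standalone toy model of the register / size /
init sequence described in the commit message above. The field names
(name, size, init_fn, init_fn_arg, ptr) follow the patch, but every toy_*
identifier, the fixed-size registry, the 8-byte alignment and the "slots"
subsystem are assumptions made up for illustration. It uses malloc()
instead of a real shared memory segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy descriptor. Field names follow the patch; everything else is made up. */
typedef struct ToyShmemDesc
{
	const char *name;
	size_t		size;
	void		(*init_fn) (void *arg);
	void	   *init_fn_arg;
	void	  **ptr;			/* receives the address of the area */
} ToyShmemDesc;

#define TOY_MAX_AREAS 16
#define TOY_ALIGN(sz) (((sz) + 7) & ~(size_t) 7)

static ToyShmemDesc *toy_areas[TOY_MAX_AREAS];
static int	toy_num_areas;

/* Registration only remembers the descriptor; nothing is allocated yet. */
static void
toy_register(ToyShmemDesc *desc)
{
	assert(toy_num_areas < TOY_MAX_AREAS);
	toy_areas[toy_num_areas++] = desc;
}

/* Add up the (aligned) sizes of all registered areas. */
static size_t
toy_registered_size(void)
{
	size_t		total = 0;

	for (int i = 0; i < toy_num_areas; i++)
		total += TOY_ALIGN(toy_areas[i]->size);
	return total;
}

/* Carve the areas out of one segment, calling init_fn in registration order. */
static void
toy_init_registered(char *segment)
{
	size_t		offset = 0;

	for (int i = 0; i < toy_num_areas; i++)
	{
		ToyShmemDesc *desc = toy_areas[i];

		if (desc->ptr)
			*desc->ptr = segment + offset;
		offset += TOY_ALIGN(desc->size);
		if (desc->init_fn)
			desc->init_fn(desc->init_fn_arg);
	}
}

/* Hypothetical subsystem: an int array whose length is known only late. */
static void *SlotsArea;
static int	toy_nslots;

static void
slots_init(void *arg)
{
	int		   *slots = SlotsArea;

	(void) arg;
	for (int i = 0; i < toy_nslots; i++)
		slots[i] = i * 10;
}

static ToyShmemDesc SlotsDesc = {
	.name = "toy slots",
	.size = 0,					/* filled in later, see toy_demo() */
	.init_fn = slots_init,
	.ptr = &SlotsArea,
};

/* Run the whole sequence (nslots >= 1) and return the last slot's value. */
static int
toy_demo(int nslots)
{
	char	   *segment;
	int			last;

	toy_num_areas = 0;
	toy_register(&SlotsDesc);

	/* Plays the role of shmem_request_hook: adjust the size after the fact. */
	toy_nslots = nslots;
	SlotsDesc.size = (size_t) nslots * sizeof(int);

	segment = malloc(toy_registered_size());
	assert(segment != NULL);
	toy_init_registered(segment);
	last = ((int *) SlotsArea)[nslots - 1];
	free(segment);
	return last;
}
```

Because the registry stores a pointer to the descriptor rather than a copy,
the size change made after toy_register() is still seen when the sizes are
added up. That is the property the shmem_request_hook adjustment relies on.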
doc/src/sgml/system-views.sgml | 4 +-
doc/src/sgml/xfunc.sgml | 126 ++--
src/backend/bootstrap/bootstrap.c | 2 +
src/backend/postmaster/launch_backend.c | 4 +
src/backend/postmaster/postmaster.c | 5 +
src/backend/storage/ipc/ipci.c | 56 +-
src/backend/storage/ipc/shmem.c | 845 +++++++++++++++++-------
src/backend/tcop/postgres.c | 3 +
src/include/storage/ipc.h | 1 +
src/include/storage/shmem.h | 144 +++-
src/tools/pgindent/typedefs.list | 4 +-
11 files changed, 858 insertions(+), 336 deletions(-)
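Not part of the patch: a toy sketch of the attach step used in the
EXEC_BACKEND case, where a freshly exec'd process has lost its pointer
variables and finds each area again by name. The linear-scan index, the
ToyIndexEnt/ToyDesc types and all toy_* names are assumptions; the real
code uses the ShmemIndex hash table and takes ShmemIndexLock.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry of the toy "shmem index": name -> location. */
typedef struct ToyIndexEnt
{
	const char *key;
	void	   *location;
	size_t		size;
} ToyIndexEnt;

/* Toy descriptor with just the fields the attach step needs. */
typedef struct ToyDesc
{
	const char *name;
	size_t		size;
	void		(*attach_fn) (void *arg);
	void	   *attach_fn_arg;
	void	  **ptr;
} ToyDesc;

/* Look up an area by name; returns NULL if it is not in the index. */
static ToyIndexEnt *
toy_index_find(ToyIndexEnt *index, int nentries, const char *name)
{
	for (int i = 0; i < nentries; i++)
	{
		if (strcmp(index[i].key, name) == 0)
			return &index[i];
	}
	return NULL;
}

/* Re-establish *ptr for every descriptor; returns -1 if an area is missing. */
static int
toy_attach_registered(ToyIndexEnt *index, int nentries,
					  ToyDesc **descs, int ndescs)
{
	for (int i = 0; i < ndescs; i++)
	{
		ToyIndexEnt *ent = toy_index_find(index, nentries, descs[i]->name);

		if (ent == NULL)
			return -1;
		if (descs[i]->ptr)
			*descs[i]->ptr = ent->location;
		if (descs[i]->attach_fn)
			descs[i]->attach_fn(descs[i]->attach_fn_arg);
	}
	return 0;
}

static int	attach_calls;

static void
count_attach(void *arg)
{
	(void) arg;
	attach_calls++;
}

/* Attach to the area called 'name'; returns its contents, or -1 if absent. */
static int
toy_attach_demo(const char *name)
{
	static int	area = 7;		/* stands in for memory in the segment */
	ToyIndexEnt index[] = {{"toy area", &area, sizeof(area)}};
	void	   *local = NULL;	/* per-process pointer, unset after exec */
	ToyDesc		desc = {
		.name = name,
		.size = sizeof(area),
		.attach_fn = count_attach,
		.ptr = &local,
	};
	ToyDesc    *descs[] = {&desc};

	if (toy_attach_registered(index, 1, descs, 1) != 0)
		return -1;
	return *(int *) local;
}
```

The name is the only thing the new process and the index have in common,
which is why every registered area needs a unique string name.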
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index e5fe423fc61..a0baed339fe 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
Anonymous allocations are allocations that have been made
with <literal>ShmemAlloc()</literal> directly, rather than via
- <literal>ShmemInitStruct()</literal> or
- <literal>ShmemInitHash()</literal>.
+ <literal>ShmemRegisterStruct()</literal> or
+ <literal>ShmemRegisterHash()</literal>.
</para>
<para>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 70e815b8a2c..52439b87d3c 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3628,59 +3628,87 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
Add-ins can reserve shared memory on server startup. To do so, the
add-in's shared library must be preloaded by specifying it in
<xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
- The shared library should also register a
- <literal>shmem_request_hook</literal> in its
- <function>_PG_init</function> function. This
- <literal>shmem_request_hook</literal> can reserve shared memory by
- calling:
-<programlisting>
-void RequestAddinShmemSpace(Size size)
-</programlisting>
- Each backend should obtain a pointer to the reserved shared memory by
- calling:
+ The shared library should register the shared memory allocation in
+ its <function>_PG_init</function> function. Here is an example:
<programlisting>
-void *ShmemInitStruct(const char *name, Size size, bool *foundPtr)
-</programlisting>
- If this function sets <literal>foundPtr</literal> to
- <literal>false</literal>, the caller should proceed to initialize the
- contents of the reserved shared memory. If <literal>foundPtr</literal>
- is set to <literal>true</literal>, the shared memory was already
- initialized by another backend, and the caller need not initialize
- further.
- </para>
+typedef struct MyShmemData {
+ LWLock lock; /* protects the fields below */
- <para>
- To avoid race conditions, each backend should use the LWLock
- <function>AddinShmemInitLock</function> when initializing its allocation
- of shared memory, as shown here:
-<programlisting>
-static mystruct *ptr = NULL;
-bool found;
+ ... shared memory contents ...
+} MyShmemData;
+
+static MyShmemData *MyShmem; /* pointer to the struct in shared memory */
+
+static void my_shmem_init(void *arg);
+
+static ShmemStructDesc MyShmemDesc = {
+ .name = "My shmem area",
+ .size = sizeof(MyShmemData),
+ .init_fn = my_shmem_init,
+ .ptr = (void **) &MyShmem,
+};
-LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-ptr = ShmemInitStruct("my struct name", size, &found);
-if (!found)
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
{
- ... initialize contents of shared memory ...
- ptr->locks = GetNamedLWLockTranche("my tranche name");
+ /*
+ * In order to create our shared memory area, we have to be loaded via
+ * shared_preload_libraries.
+ */
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ /* Register our shared memory needs */
+ ShmemRegisterStruct(&MyShmemDesc);
}
-LWLockRelease(AddinShmemInitLock);
+
+/* callback to initialize the contents of the MyShmem area at startup */
+static void
+my_shmem_init(void *arg)
+{
+ int tranche_id;
+
+ /* Initialize the lock */
+ tranche_id = LWLockNewTrancheId("my tranche name");
+ LWLockInitialize(&MyShmem->lock, tranche_id);
+
+ ... initialize the rest of MyShmem fields ...
+}
+
</programlisting>
- <literal>shmem_startup_hook</literal> provides a convenient place for the
- initialization code, but it is not strictly required that all such code
- be placed in this hook. On Windows (and anywhere else where
- <literal>EXEC_BACKEND</literal> is defined), each backend executes the
- registered <literal>shmem_startup_hook</literal> shortly after it
- attaches to shared memory, so add-ins should still acquire
- <function>AddinShmemInitLock</function> within this hook, as shown in the
- example above. On other platforms, only the postmaster process executes
- the <literal>shmem_startup_hook</literal>, and each backend automatically
- inherits the pointers to shared memory.
+ The <function>ShmemRegisterStruct()</function> call doesn't immediately
+ allocate or initialize the memory; it merely registers the space to be
+ allocated later in the startup sequence. If the size of the allocation
+ depends on <varname>MaxBackends</varname> or other variables that are
+ not yet initialized when <function>_PG_init()</function> is called, the
+ size can still be adjusted later by registering a
+ <literal>shmem_request_hook</literal> and changing the descriptor there.
+ When the memory is allocated, the registered
+ <function>init_fn</function> callback is called to initialize it.
+ </para>
+ <para>
+ The <function>init_fn()</function> callback is normally called at
+ postmaster startup, when no other processes are running yet and no
+ locking is required. However, if a shared memory area is registered
+ after system start, e.g. in an extension that is not in
+ <xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>,
+ <function>ShmemRegisterStruct()</function> will immediately call
+ the <function>init_fn</function> callback. In that case, it holds a
+ lock internally that prevents concurrent shmem allocations.
+ </para>
+ <para>
+ On Windows, the <function>attach_fn</function> callback is additionally
+ called at every backend startup. It can be used for initializing
+ additional per-backend state related to the shared memory area that is
+ inherited via <function>fork()</function> on other systems. On other
+ platforms, the <function>attach_fn</function> callback is only called
+ for structs that are registered after system startup.
</para>
-
<para>
- An example of a <literal>shmem_request_hook</literal> and
- <literal>shmem_startup_hook</literal> can be found in
+ An example of allocating shared memory can be found in
<filename>contrib/pg_stat_statements/pg_stat_statements.c</filename> in
the <productname>PostgreSQL</productname> source tree.
</para>
@@ -3691,8 +3719,7 @@ LWLockRelease(AddinShmemInitLock);
<para>
There is another, more flexible method of reserving shared memory that
- can be done after server startup and outside a
- <literal>shmem_request_hook</literal>. To do so, each backend that will
+ can be done after server startup. To do so, each backend that will
use the shared memory should obtain a pointer to it by calling:
<programlisting>
void *GetNamedDSMSegment(const char *name, size_t size,
@@ -3711,10 +3738,7 @@ void *GetNamedDSMSegment(const char *name, size_t size,
</para>
<para>
- Unlike shared memory reserved at server startup, there is no need to
- acquire <function>AddinShmemInitLock</function> or otherwise take action
- to avoid race conditions when reserving shared memory with
- <function>GetNamedDSMSegment</function>. This function ensures that only
+ <function>GetNamedDSMSegment</function> ensures that only
one backend allocates and initializes the segment and that all other
backends receive a pointer to the fully allocated and initialized
segment.
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 17118f2fe76..d9d8bb6a5fd 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -369,6 +369,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
InitializeFastPathLocks();
+ RegisterShmemStructs();
+
CreateSharedMemoryAndSemaphores();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 30357845729..fecae827e5b 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -49,6 +49,7 @@
#include "replication/walreceiver.h"
#include "storage/dsm.h"
#include "storage/io_worker.h"
+#include "storage/ipc.h"
#include "storage/pg_shmem.h"
#include "tcop/backend_startup.h"
#include "utils/memutils.h"
@@ -677,7 +678,10 @@ SubPostmasterMain(int argc, char *argv[])
/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ RegisterShmemStructs();
+ }
/*
* Run the appropriate Main function
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 3fac46c402b..702c646f0d7 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -958,6 +958,9 @@ PostmasterMain(int argc, char *argv[])
*/
InitializeFastPathLocks();
+ /* Register the shared memory needs of all core subsystems. */
+ RegisterShmemStructs();
+
/*
* Give preloaded libraries a chance to request additional shared memory.
*/
@@ -3236,6 +3239,8 @@ PostmasterStateMachine(void)
LocalProcessControlFile(true);
/* re-create shared memory and semaphores */
+ ResetShmemAllocator();
+ RegisterShmemStructs();
CreateSharedMemoryAndSemaphores();
UpdatePMState(PM_STARTUP);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index a4785daf1e5..405c69655f0 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -99,10 +99,12 @@ CalculateShmemSize(void)
* during the actual allocation phase.
*/
size = 100000;
- size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
- sizeof(ShmemIndexEnt)));
+ size = add_size(size, ShmemRegisteredSize());
+
size = add_size(size, dsm_estimate_size());
size = add_size(size, DSMRegistryShmemSize());
+
+ /* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
@@ -218,6 +220,10 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
+ /* Reserve space for semaphores. */
+ if (!IsUnderPostmaster)
+ PGReserveSemaphores(ProcGlobalSemas());
+
/* Initialize subsystems */
CreateOrAttachShmemStructs();
@@ -231,6 +237,22 @@ CreateSharedMemoryAndSemaphores(void)
shmem_startup_hook();
}
+/*
+ * Early initialization of various subsystems, giving them a chance to
+ * register their shared memory needs before the shared memory segment is
+ * allocated.
+ */
+void
+RegisterShmemStructs(void)
+{
+ /*
+ * TODO: Not used in any built-in subsystems yet. In the future, most of
+ * the *ShmemInit() calls in CreateOrAttachShmemStructs(), and
+ * *ShmemSize() calls in CalculateShmemSize() will be replaced by calls
+ * into the subsystems from here.
+ */
+}
+
/*
* Initialize various subsystems, setting up their data structures in
* shared memory.
@@ -249,16 +271,26 @@ CreateSharedMemoryAndSemaphores(void)
static void
CreateOrAttachShmemStructs(void)
{
- /*
- * Now initialize LWLocks, which do shared memory allocation and are
- * needed for InitShmemIndex.
- */
- CreateLWLocks();
-
- /*
- * Set up shmem.c index hashtable
- */
- InitShmemIndex();
+#ifdef EXEC_BACKEND
+ if (IsUnderPostmaster)
+ {
+ /*
+ * ShmemAttachRegistered() uses LWLocks. Fortunately, LWLocks don't
+ * need any special attaching.
+ */
+ ShmemAttachRegistered();
+ }
+ else
+#endif
+ {
+ /*
+ * Initialize LWLocks first, in case any of the shmem init functions
+ * use LWLocks. (Nothing else can be running during startup, so they
+ * don't need to do any locking yet, but we nevertheless allow it.)
+ */
+ CreateLWLocks();
+ ShmemInitRegistered();
+ }
dsm_shmem_init();
DSMRegistryShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 55e4a5421de..e427a45705c 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,48 +19,95 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * NOTES:
- * (a) There are three kinds of shared memory data structures
- * available to POSTGRES: fixed-size structures, queues and hash
- * tables. Fixed-size structures contain things like global variables
- * for a module and should never be allocated after the shared memory
- * initialization phase. Hash tables have a fixed maximum size, but
- * their actual size can vary dynamically. When entries are added
- * to the table, more space is allocated. Queues link data structures
- * that have been allocated either within fixed-size structures or as hash
- * buckets. Each shared data structure has a string name to identify
- * it (assigned in the module that declares it).
- *
- * (b) During initialization, each module looks for its
- * shared data structures in a hash table called the "Shmem Index".
- * If the data structure is not present, the caller can allocate
- * a new one and initialize it. If the data structure is present,
- * the caller "attaches" to the structure by initializing a pointer
- * in the local address space.
- * The shmem index has two purposes: first, it gives us
- * a simple model of how the world looks when a backend process
- * initializes. If something is present in the shmem index,
- * it is initialized. If it is not, it is uninitialized. Second,
- * the shmem index allows us to allocate shared memory on demand
- * instead of trying to preallocate structures and hard-wire the
- * sizes and locations in header files. If you are using a lot
- * of shared memory in a lot of different places (and changing
- * things during development), this is important.
- *
- * (c) In standard Unix-ish environments, individual backends do not
- * need to re-establish their local pointers into shared memory, because
- * they inherit correct values of those variables via fork() from the
- * postmaster. However, this does not work in the EXEC_BACKEND case.
- * In ports using EXEC_BACKEND, new backends have to set up their local
- * pointers using the method described in (b) above.
- *
- * (d) memory allocation model: shared memory can never be
- * freed, once allocated. Each hash table has its own free list,
- * so hash buckets can be reused when an item is deleted. However,
- * if one hash table grows very large and then shrinks, its space
- * cannot be redistributed to other tables. We could build a simple
- * hash bucket garbage collector if need be. Right now, it seems
- * unnecessary.
+ * Nowadays, there is also another way to allocate shared memory called
+ * Dynamic Shared Memory. See dsm.c for that facility. One big difference
+ * between traditional shared memory handled by this facility and dynamic
+ * shared memory is that traditional shared memory areas are mapped to the
+ * same address in all processes, so you can use normal pointers in shared
+ * memory structs. With Dynamic Shared Memory, you must use offsets or DSA
+ * pointers instead.
+ *
+ * There are two kinds of shared memory data structures: fixed-size structures
+ * and hash tables. Fixed-size structures contain things like global
+ * variables for a module and should never be allocated after the shared
+ * memory initialization phase. Hash tables have a fixed maximum size, but
+ * their actual size can vary dynamically. When entries are added to the
+ * table, more space is allocated. Each shared data structure and hash has a
+ * string name to identify it, specified in the descriptor when it's
+ * registered.
+ *
+ * Shared memory can never be freed, once allocated. Each hash table has its
+ * own free list, so hash buckets can be reused when an item is deleted.
+ * However, if one hash table grows very large and then shrinks, its space
+ * cannot be redistributed to other tables. We could build a simple hash
+ * bucket garbage collector if need be. Right now, it seems unnecessary.
+ *
+ * Usage
+ * -----
+ *
+ * To allocate a shared memory area, fill in the name, size, and any other
+ * options in ShmemStructDesc, and call ShmemRegisterStruct(). Leave any
+ * unused fields as zeros.
+ *
+ * typedef struct MyShmemData {
+ * ...
+ * } MyShmemData;
+ *
+ * static MyShmemData *MyShmem;
+ *
+ * static void my_shmem_init(void *arg);
+ *
+ * static ShmemStructDesc MyShmemDesc = {
+ * .name = "My shmem area",
+ * .size = sizeof(MyShmemData),
+ * .init_fn = my_shmem_init,
+ * .ptr = &MyShmem,
+ * };
+ *
+ * In the subsystem's initialization code (or in _PG_init() in extensions),
+ * call ShmemRegisterStruct(&MyShmemDesc).
+ *
+ * Lifecycle
+ * ---------
+ *
+ * RegisterShmemStructs() is called at postmaster startup before calculating
+ * the size of the global shared memory segment. Once all the registrations
+ * have been done, postmaster calls ShmemRegisteredSize() to add up the sizes
+ * of all the registered areas. After allocating the shared memory segment,
+ * postmaster calls ShmemInitRegistered(), which calls the init_fn callbacks
+ * of each registered area, in the order that they were registered.
+ *
+ * In standard Unix-ish environments, individual backends do not need to
+ * re-establish their local pointers into shared memory, because they inherit
+ * correct values of those variables via fork() from the postmaster. However,
+ * this does not work in the EXEC_BACKEND case. In ports using EXEC_BACKEND,
+ * backend startup also calls RegisterShmemStructs(), followed by
+ * ShmemAttachRegistered(), which re-establishes the pointer variables
+ * (*ShmemStructDesc->ptr), and calls the attach_fn callback, if any, for
+ * additional per-backend setup.
+ *
+ * Legacy ShmemInitStruct()/ShmemInitHash() functions
+ * --------------------------------------------------
+ *
+ * ShmemInitStruct()/ShmemInitHash() is another way of registering shmem areas.
+ * It pre-dates the ShmemRegisterStruct()/ShmemRegisterHash() functions, and
+ * should not be used in new code, but as of this writing it is still widely
+ * used in extensions.
+ *
+ * To allocate a shmem area with ShmemInitStruct(), you need to separately
+ * register the size needed for the area by calling RequestAddinShmemSpace()
+ * from the extension's shmem_request_hook, and allocate the area by calling
+ * ShmemInitStruct() from the extension's shmem_startup_hook. There are no
+ * init/attach callbacks. Instead, the caller of ShmemInitStruct() must check
+ * the return status of ShmemInitStruct() and initialize the struct if it was
+ * not previously initialized.
+ *
+ * Calling ShmemAlloc() directly
+ * -----------------------------
+ *
+ * There's a more low-level way of allocating shared memory too: you can call
+ * ShmemAlloc() directly. It's used to implement the higher level mechanisms,
+ * and should generally not be called directly.
*/
#include "postgres.h"
@@ -76,6 +123,24 @@
#include "storage/spin.h"
#include "utils/builtins.h"
+/*
+ * Array of registered shared memory areas.
+ *
+ * This is in process private memory, although on Unix-like systems, we expect
+ * all the registrations to happen at postmaster startup time and be inherited
+ * by all the child processes via fork(). Extensions may register additional
+ * areas after startup, but only areas registered at postmaster startup are
+ * included in the estimate for the total memory needed for shared memory. If
+ * any non-trivial allocations are made after startup, there might not be
+ * enough shared memory available.
+ */
+static ShmemStructDesc **registered_shmem_areas;
+static int num_registered_shmem_areas = 0;
+static int max_registered_shmem_areas = 0; /* allocated size of the array */
+
+/* initial allocated size of registered_shmem_areas (not a hard limit) */
+#define INITIAL_REGISTRY_SIZE (64)
+
/*
* This is the first data structure stored in the shared memory segment, at
* the offset that PGShmemHeader->content_offset points to. Allocations by
@@ -95,6 +160,9 @@ typedef struct ShmemAllocatorData
static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void shmem_hash_init(void *arg);
+static void shmem_hash_attach(void *arg);
+
/* shared memory global variables */
static PGShmemHeader *ShmemSegHdr; /* shared mem segment header */
@@ -103,24 +171,345 @@ static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
slock_t *ShmemLock; /* points to ShmemAllocator->shmem_lock */
-static HTAB *ShmemIndex = NULL; /* primary index hashtable for shmem */
+
+static bool shmem_initialized = false;
+
+/*
+ * ShmemIndex is a global directory of shmem areas, itself also stored in the
+ * shared memory.
+ */
+static HTAB *ShmemIndex;
+
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE (48)
+
+/*
+ * # of additional entries to reserve in the shmem index table, for allocations
+ * after postmaster startup (not a hard limit)
+ */
+#define SHMEM_INDEX_ADDITIONAL_SIZE (64)
+
+/* this is a hash bucket in the shmem index table */
+typedef struct
+{
+ char key[SHMEM_INDEX_KEYSIZE]; /* string name */
+ void *location; /* location in shared mem */
+ Size size; /* # bytes requested for the structure */
+ Size allocated_size; /* # bytes actually allocated */
+} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
Datum pg_numa_available(PG_FUNCTION_ARGS);
+/*
+ * ShmemRegisterStruct() --- register a shared memory struct
+ *
+ * Subsystems call this to register their shared memory needs. That should be
+ * done early in postmaster startup, before the shared memory segment has been
+ * created, so that the size can be included in the estimate for total amount
+ * of shared memory needed. We set aside a small amount of memory for
+ * allocations that happen later, for the benefit of non-preloaded extensions,
+ * but that should not be relied upon.
+ *
+ * In core subsystems, each subsystem's registration function is called from
+ * RegisterShmemStructs(). In extensions, this should be called from the
+ * _PG_init() function. In EXEC_BACKEND mode, this also needs to be called in
+ * each child process, to reattach and set the pointer to the shared memory
+ * area, usually in a global variable. Calling this from the _PG_init()
+ * initializer takes care of that too.
+ *
+ * When called during postmaster startup, before the shared memory has been
+ * allocated, the function merely remembers the registered descriptor, but the
+ * descriptor may still be changed later, until the shared memory segment has
+ * been allocated. That means that an extension may still modify the
+ * already-registered descriptor in the shmem_request_hook. A common example
+ * of when that's useful is when the size depends on MaxBackends: you can
+ * leave the size empty in the ShmemRegisterStruct() call and fill it later in
+ * the shmem_request_hook.
+ *
+ * Returns true if the struct was already initialized in shared memory and we
+ * merely attached to it.
+ */
+bool
+ShmemRegisterStruct(ShmemStructDesc *desc)
+{
+ bool found;
+
+ /* Check that it's not already registered in this process */
+ for (int i = 0; i < num_registered_shmem_areas; i++)
+ {
+ ShmemStructDesc *existing = registered_shmem_areas[i];
+
+ if (strcmp(existing->name, desc->name) == 0)
+ elog(ERROR, "shared memory struct \"%s\" is already registered",
+ desc->name);
+ }
+
+ /* desc->ptr can be non-NULL when re-initializing after crash */
+ if (!IsUnderPostmaster && desc->ptr)
+ *desc->ptr = NULL;
+
+ /* Add the descriptor to the array, growing the array if needed */
+ if (num_registered_shmem_areas == max_registered_shmem_areas)
+ {
+ int new_size;
+
+ if (registered_shmem_areas)
+ {
+ new_size = max_registered_shmem_areas * 2;
+ registered_shmem_areas = repalloc(registered_shmem_areas,
+ new_size * sizeof(ShmemStructDesc *));
+ }
+ else
+ {
+ new_size = INITIAL_REGISTRY_SIZE;
+ registered_shmem_areas = MemoryContextAlloc(TopMemoryContext,
+ new_size * sizeof(ShmemStructDesc *));
+ }
+ max_registered_shmem_areas = new_size;
+ }
+ registered_shmem_areas[num_registered_shmem_areas++] = desc;
+
+ /*
+ * If called after postmaster startup, we need to immediately also
+ * initialize or attach to the area.
+ */
+ if (shmem_initialized)
+ {
+ ShmemIndexEnt *index_entry;
+
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, desc->name, HASH_ENTER_NULL, &found);
+ if (!index_entry)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+ if (found)
+ {
+ /* Already present, just attach to it */
+ if (index_entry->size != desc->size)
+ elog(ERROR, "shared memory struct \"%s\" is already registered with different size",
+ desc->name);
+ if (desc->ptr)
+ *desc->ptr = index_entry->location;
+ if (desc->attach_fn)
+ desc->attach_fn(desc->attach_fn_arg);
+ }
+ else
+ {
+ /*
+ * This is the first time. Initialize it like
+ * ShmemInitRegistered() would
+ */
+ size_t allocated_size;
+ void *structPtr;
+
+ structPtr = ShmemAllocRaw(desc->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, desc->name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ desc->name, desc->size)));
+ }
+ index_entry->size = desc->size;
+ index_entry->allocated_size = allocated_size;
+ index_entry->location = structPtr;
+ if (desc->ptr)
+ *desc->ptr = index_entry->location;
+
+ /*
+ * XXX: if this errors out, the area is left in a half-initialized
+ * state
+ */
+ if (desc->init_fn)
+ desc->init_fn(desc->init_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+ }
+ else
+ found = false;
+
+ return found;
+}
+
+/*
+ * ShmemRegisteredSize() --- estimate the total size of all registered shared
+ * memory structures.
+ *
+ * This is called once at postmaster startup, before the shared memory segment
+ * has been created.
+ */
+size_t
+ShmemRegisteredSize(void)
+{
+ size_t size;
+
+ /* memory needed for the ShmemIndex */
+ size = hash_estimate_size(num_registered_shmem_areas + SHMEM_INDEX_ADDITIONAL_SIZE,
+ sizeof(ShmemIndexEnt));
+
+ /* memory needed for all the registered areas */
+ for (int i = 0; i < num_registered_shmem_areas; i++)
+ {
+ ShmemStructDesc *desc = registered_shmem_areas[i];
+
+ size = add_size(size, desc->size);
+ size = add_size(size, desc->extra_size);
+ }
+
+ return size;
+}
+
+/*
+ * ShmemInitRegistered() --- allocate and initialize pre-registered shared
+ * memory structures.
+ *
+ * This is called once at postmaster startup, after the shared memory segment
+ * has been created.
+ */
+void
+ShmemInitRegistered(void)
+{
+ /* Should be called only by the postmaster or a standalone backend. */
+ Assert(!IsUnderPostmaster);
+ Assert(!shmem_initialized);
+
+ /*
+ * Initialize all the registered memory areas. There are no concurrent
+ * processes yet, so no need for locking.
+ */
+ for (int i = 0; i < num_registered_shmem_areas; i++)
+ {
+ ShmemStructDesc *desc = registered_shmem_areas[i];
+ size_t allocated_size;
+ void *structPtr;
+ bool found;
+ ShmemIndexEnt *index_entry;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, desc->name, HASH_ENTER_NULL, &found);
+ if (!index_entry)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+ if (found)
+ elog(ERROR, "shmem struct \"%s\" is already initialized", desc->name);
+
+ /* allocate and initialize it */
+ structPtr = ShmemAllocRaw(desc->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, desc->name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ desc->name, desc->size)));
+ }
+ index_entry->size = desc->size;
+ index_entry->allocated_size = allocated_size;
+ index_entry->location = structPtr;
+
+ *(desc->ptr) = structPtr;
+ if (desc->init_fn)
+ desc->init_fn(desc->init_fn_arg);
+ }
+
+ shmem_initialized = true;
+}
+
+/*
+ * Call the attach_fn callbacks of all registered shmem areas
+ *
+ * This is called at backend startup, in EXEC_BACKEND mode.
+ */
+#ifdef EXEC_BACKEND
+void
+ShmemAttachRegistered(void)
+{
+ /* Must be initializing a (non-standalone) backend */
+ Assert(IsUnderPostmaster);
+ Assert(ShmemAllocator->index != NULL);
+
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+
+ for (int i = 0; i < num_registered_shmem_areas; i++)
+ {
+ ShmemStructDesc *desc = registered_shmem_areas[i];
+ bool found;
+ ShmemIndexEnt *result;
+
+ /* look it up in the shmem index */
+ result = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, desc->name, HASH_FIND, &found);
+ if (!found)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not find ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+
+ if (desc->ptr)
+ *desc->ptr = result->location;
+ if (desc->attach_fn)
+ desc->attach_fn(desc->attach_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+
+ shmem_initialized = true;
+}
+#endif
+
+/*
+ * Reset the shmem struct registry on postmaster crash restart.
+ */
+void
+ResetShmemAllocator(void)
+{
+ shmem_initialized = false;
+ num_registered_shmem_areas = 0;
+
+ /* FIXME: this leaks the allocations in TopMemoryContext */
+}
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
* Called at postmaster or stand-alone backend startup, to initialize the
* allocator's data structure in the shared memory segment. In EXEC_BACKEND,
- * this is also called at backend startup, to set up pointers to the shared
- * memory areas.
+ * also called at backend startup, to set up pointers to the
+ * already-initialized data structure.
*/
void
InitShmemAllocator(PGShmemHeader *seghdr)
{
+ Size offset;
+ int hash_size;
+ HASHCTL info;
+ int hash_flags;
+ size_t size;
+
+ Assert(!shmem_initialized);
Assert(seghdr != NULL);
/*
@@ -134,41 +523,47 @@ InitShmemAllocator(PGShmemHeader *seghdr)
ShmemBase = seghdr;
ShmemEnd = (char *) ShmemBase + seghdr->totalsize;
-#ifndef EXEC_BACKEND
- Assert(!IsUnderPostmaster);
-#endif
- if (IsUnderPostmaster)
- {
- PGShmemHeader *shmhdr = ShmemSegHdr;
-
- ShmemAllocator = (ShmemAllocatorData *) ((char *) shmhdr + shmhdr->content_offset);
- ShmemLock = &ShmemAllocator->shmem_lock;
- }
- else
- {
- Size offset;
-
- /*
- * Allocations after this point should go through ShmemAlloc, which
- * expects to allocate everything on cache line boundaries. Make sure
- * the first allocation begins on a cache line boundary.
- */
- offset = CACHELINEALIGN(seghdr->content_offset + sizeof(ShmemAllocatorData));
- if (offset > seghdr->totalsize)
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("out of shared memory (%zu bytes requested)",
- offset)));
+ /*
+ * Allocations after this point should go through ShmemAlloc, which
+ * expects to allocate everything on cache line boundaries. Make sure the
+ * first allocation begins on a cache line boundary.
+ */
+ offset = CACHELINEALIGN(seghdr->content_offset + sizeof(ShmemAllocatorData));
+ if (offset > seghdr->totalsize)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of shared memory (%zu bytes requested)",
+ offset)));
- ShmemAllocator = (ShmemAllocatorData *) ((char *) seghdr + seghdr->content_offset);
+ ShmemAllocator = (ShmemAllocatorData *) ((char *) seghdr + seghdr->content_offset);
+ ShmemLock = &ShmemAllocator->shmem_lock;
+ if (!IsUnderPostmaster)
+ {
SpinLockInit(&ShmemAllocator->shmem_lock);
- ShmemLock = &ShmemAllocator->shmem_lock;
ShmemAllocator->free_offset = offset;
- /* ShmemIndex can't be set up yet (need LWLocks first) */
- ShmemAllocator->index = NULL;
- ShmemIndex = (HTAB *) NULL;
}
+
+ /*
+ * Create (or attach to) the shared memory index of shmem areas.
+ */
+ hash_size = num_registered_shmem_areas + SHMEM_INDEX_ADDITIONAL_SIZE;
+
+ info.keysize = SHMEM_INDEX_KEYSIZE;
+ info.entrysize = sizeof(ShmemIndexEnt);
+ info.dsize = info.max_dsize = hash_select_dirsize(hash_size);
+ info.alloc = ShmemAllocNoError;
+ hash_flags = HASH_ELEM | HASH_STRINGS | HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+ if (!IsUnderPostmaster)
+ {
+ size = hash_get_shared_size(&info, hash_flags);
+ ShmemAllocator->index = (HASHHDR *) ShmemAlloc(size);
+ }
+ else
+ hash_flags |= HASH_ATTACH;
+ info.hctl = ShmemAllocator->index;
+ ShmemIndex = hash_create("ShmemIndex", hash_size, &info, hash_flags);
+ Assert(ShmemIndex != NULL);
}
/*
@@ -268,67 +663,14 @@ ShmemAddrIsValid(const void *addr)
}
/*
- * InitShmemIndex() --- set up or attach to shmem index table.
- */
-void
-InitShmemIndex(void)
-{
- HASHCTL info;
-
- /*
- * Create the shared memory shmem index.
- *
- * Since ShmemInitHash calls ShmemInitStruct, which expects the ShmemIndex
- * hashtable to exist already, we have a bit of a circularity problem in
- * initializing the ShmemIndex itself. The special "ShmemIndex" hash
- * table name will tell ShmemInitStruct to fake it.
- */
- info.keysize = SHMEM_INDEX_KEYSIZE;
- info.entrysize = sizeof(ShmemIndexEnt);
-
- ShmemIndex = ShmemInitHash("ShmemIndex",
- SHMEM_INDEX_SIZE, SHMEM_INDEX_SIZE,
- &info,
- HASH_ELEM | HASH_STRINGS);
-}
-
-/*
- * ShmemInitHash -- Create and initialize, or attach to, a
- * shared memory hash table.
- *
- * We assume caller is doing some kind of synchronization
- * so that two processes don't try to create/initialize the same
- * table at once. (In practice, all creations are done in the postmaster
- * process; child processes should always be attaching to existing tables.)
- *
- * max_size is the estimated maximum number of hashtable entries. This is
- * not a hard limit, but the access efficiency will degrade if it is
- * exceeded substantially (since it's used to compute directory size and
- * the hash table buckets will get overfull).
+ * ShmemRegisterHash -- Register a shared memory hash table.
*
- * init_size is the number of hashtable entries to preallocate. For a table
- * whose maximum size is certain, this should be equal to max_size; that
- * ensures that no run-time out-of-shared-memory failures can occur.
- *
- * *infoP and hash_flags must specify at least the entry sizes and key
- * comparison semantics (see hash_create()). Flag bits and values specific
- * to shared-memory hash tables are added here, except that callers may
- * choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
- *
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Similar to ShmemRegisterStruct(), but registers a hash table instead of an
+ * opaque area.
*/
-HTAB *
-ShmemInitHash(const char *name, /* table string name for shmem index */
- int64 init_size, /* initial table size */
- int64 max_size, /* max size of the table */
- HASHCTL *infoP, /* info about key and bucket size */
- int hash_flags) /* info about infoP */
+bool
+ShmemRegisterHash(ShmemHashDesc *desc)
{
- bool found;
- void *location;
-
/*
* Hash tables allocated in shared memory have a fixed directory; it can't
* grow or other backends wouldn't be able to find it. So, make sure we
@@ -336,145 +678,58 @@ ShmemInitHash(const char *name, /* table string name for shmem index */
*
* The shared memory allocator must be specified too.
*/
- infoP->dsize = infoP->max_dsize = hash_select_dirsize(max_size);
- infoP->alloc = ShmemAllocNoError;
- hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+ desc->hash_info.dsize = desc->hash_info.max_dsize = hash_select_dirsize(desc->max_size);
+ desc->hash_info.alloc = ShmemAllocNoError;
+ desc->hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
/* look it up in the shmem index */
- location = ShmemInitStruct(name,
- hash_get_shared_size(infoP, hash_flags),
- &found);
+ memset(&desc->base_desc, 0, sizeof(desc->base_desc));
+ desc->base_desc.name = desc->name;
+ desc->base_desc.size = hash_get_shared_size(&desc->hash_info, desc->hash_flags);
+ desc->base_desc.init_fn = shmem_hash_init;
+ desc->base_desc.init_fn_arg = desc;
+ desc->base_desc.attach_fn = shmem_hash_attach;
+ desc->base_desc.attach_fn_arg = desc;
/*
- * if it already exists, attach to it rather than allocate and initialize
- * new space
+ * We need a stable pointer to hold the pointer to the shared memory. Use
+ * the one passed in the descriptor now. It will be replaced with the
+ * hash table header by init or attach function.
*/
- if (found)
- hash_flags |= HASH_ATTACH;
+ desc->base_desc.ptr = (void **) desc->ptr;
- /* Pass location of hashtable header to hash_create */
- infoP->hctl = (HASHHDR *) location;
+ desc->base_desc.extra_size = hash_estimate_size(desc->max_size, desc->hash_info.entrysize) - desc->base_desc.size;
- return hash_create(name, init_size, infoP, hash_flags);
+ return ShmemRegisterStruct(&desc->base_desc);
}
-/*
- * ShmemInitStruct -- Create/attach to a structure in shared memory.
- *
- * This is called during initialization to find or allocate
- * a data structure in shared memory. If no other process
- * has created the structure, this routine allocates space
- * for it. If it exists already, a pointer to the existing
- * structure is returned.
- *
- * Returns: pointer to the object. *foundPtr is set true if the object was
- * already in the shmem index (hence, already initialized).
- *
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
- */
-void *
-ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+static void
+shmem_hash_init(void *arg)
{
- ShmemIndexEnt *result;
- void *structPtr;
-
- LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
-
- if (!ShmemIndex)
- {
- /* Must be trying to create/attach to ShmemIndex itself */
- Assert(strcmp(name, "ShmemIndex") == 0);
-
- if (IsUnderPostmaster)
- {
- /* Must be initializing a (non-standalone) backend */
- Assert(ShmemAllocator->index != NULL);
- structPtr = ShmemAllocator->index;
- *foundPtr = true;
- }
- else
- {
- /*
- * If the shmem index doesn't exist, we are bootstrapping: we must
- * be trying to init the shmem index itself.
- *
- * Notice that the ShmemIndexLock is released before the shmem
- * index has been initialized. This should be OK because no other
- * process can be accessing shared memory yet.
- */
- Assert(ShmemAllocator->index == NULL);
- structPtr = ShmemAlloc(size);
- ShmemAllocator->index = structPtr;
- *foundPtr = false;
- }
- LWLockRelease(ShmemIndexLock);
- return structPtr;
- }
-
- /* look it up in the shmem index */
- result = (ShmemIndexEnt *)
- hash_search(ShmemIndex, name, HASH_ENTER_NULL, foundPtr);
-
- if (!result)
- {
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("could not create ShmemIndex entry for data structure \"%s\"",
- name)));
- }
+ ShmemHashDesc *desc = (ShmemHashDesc *) arg;
+ int hash_flags = desc->hash_flags;
- if (*foundPtr)
- {
- /*
- * Structure is in the shmem index so someone else has allocated it
- * already. The size better be the same as the size we are trying to
- * initialize to, or there is a name conflict (or worse).
- */
- if (result->size != size)
- {
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errmsg("ShmemIndex entry size is wrong for data structure"
- " \"%s\": expected %zu, actual %zu",
- name, size, result->size)));
- }
- structPtr = result->location;
- }
- else
- {
- Size allocated_size;
+ /* Pass location of hashtable header to hash_create */
+ desc->hash_info.hctl = (HASHHDR *) *desc->base_desc.ptr;
- /* It isn't in the table yet. allocate and initialize it */
- structPtr = ShmemAllocRaw(size, &allocated_size);
- if (structPtr == NULL)
- {
- /* out of memory; remove the failed ShmemIndex entry */
- hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
- " \"%s\" (%zu bytes requested)",
- name, size)));
- }
- result->size = size;
- result->allocated_size = allocated_size;
- result->location = structPtr;
- }
+ *desc->ptr = hash_create(desc->name, desc->init_size, &desc->hash_info, hash_flags);
+}
- LWLockRelease(ShmemIndexLock);
+static void
+shmem_hash_attach(void *arg)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) arg;
+ int hash_flags = desc->hash_flags;
- Assert(ShmemAddrIsValid(structPtr));
+ /* attach to it rather than allocate and initialize new space */
+ hash_flags |= HASH_ATTACH;
- Assert(structPtr == (void *) CACHELINEALIGN(structPtr));
+ /* Pass location of hashtable header to hash_create */
+ desc->hash_info.hctl = (HASHHDR *) *desc->base_desc.ptr;
- return structPtr;
+ *desc->ptr = hash_create(desc->name, desc->init_size, &desc->hash_info, hash_flags);
}
-
/*
* Add two Size values, checking for overflow
*/
@@ -761,3 +1016,85 @@ pg_numa_available(PG_FUNCTION_ARGS)
{
PG_RETURN_BOOL(pg_numa_init() != -1);
}
+
+/*
+ * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ *
+ * This is called during initialization to find or allocate
+ * a data structure in shared memory. If no other process
+ * has created the structure, this routine allocates space
+ * for it. If it exists already, a pointer to the existing
+ * structure is returned.
+ *
+ * Returns: pointer to the object. *foundPtr is set true if the object was
+ * already in the shmem index (hence, already initialized).
+ *
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRegisterStruct() in new code!
+ */
+void *
+ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+{
+ ShmemStructDesc *desc;
+
+ Assert(shmem_initialized);
+
+ desc = MemoryContextAllocZero(TopMemoryContext, sizeof(ShmemStructDesc) + sizeof(void *));
+ desc->name = name;
+ desc->size = size;
+ desc->ptr = (void *) (((char *) desc) + sizeof(ShmemStructDesc));
+
+ *foundPtr = ShmemRegisterStruct(desc);
+ Assert(*desc->ptr != NULL);
+ return *desc->ptr;
+}
+
+/*
+ * ShmemInitHash -- Create and initialize, or attach to, a
+ * shared memory hash table.
+ *
+ * We assume caller is doing some kind of synchronization
+ * so that two processes don't try to create/initialize the same
+ * table at once. (In practice, all creations are done in the postmaster
+ * process; child processes should always be attaching to existing tables.)
+ *
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ *
+ * init_size is the number of hashtable entries to preallocate. For a table
+ * whose maximum size is certain, this should be equal to max_size; that
+ * ensures that no run-time out-of-shared-memory failures can occur.
+ *
+ * *infoP and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values specific
+ * to shared-memory hash tables are added here, except that callers may
+ * choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
+ *
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRegisterHash() in new code!
+ */
+HTAB *
+ShmemInitHash(const char *name, /* table string name for shmem index */
+ int64 init_size, /* initial table size */
+ int64 max_size, /* max size of the table */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ ShmemHashDesc *desc;
+
+ Assert(shmem_initialized);
+
+ desc = MemoryContextAllocZero(TopMemoryContext, sizeof(ShmemHashDesc) + sizeof(HTAB *));
+ desc->name = name;
+ desc->init_size = init_size;
+ desc->max_size = max_size;
+ memcpy(&desc->hash_info, infoP, sizeof(HASHCTL));
+ desc->hash_flags = hash_flags;
+
+ desc->ptr = (HTAB **) (((char *) desc) + sizeof(ShmemHashDesc));
+
+ ShmemRegisterHash(desc);
+ return *desc->ptr;
+}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d01a09dd0c4..fa074c419a8 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4162,6 +4162,9 @@ PostgresSingleUserMain(int argc, char *argv[],
/* Initialize size of fast-path lock cache. */
InitializeFastPathLocks();
+ /* Register the shared memory needs of all core subsystems. */
+ RegisterShmemStructs();
+
/*
* Give preloaded libraries a chance to request additional shared memory.
*/
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index da32787ab51..8a3b71ad5d3 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
/* ipci.c */
extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
+extern void RegisterShmemStructs(void);
extern Size CalculateShmemSize(void);
extern void CreateSharedMemoryAndSemaphores(void);
#ifdef EXEC_BACKEND
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 89d45287c17..4d6d67deca8 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -24,19 +24,146 @@
#include "storage/spin.h"
#include "utils/hsearch.h"
+typedef void (*ShmemInitCallback) (void *arg);
+typedef void (*ShmemAttachCallback) (void *arg);
+
+/*
+ * ShmemStructDesc describes a named area or struct in shared memory.
+ *
+ * Shared memory is reserved and allocated in a few phases at postmaster
+ * startup, and in EXEC_BACKEND mode, there's some extra work done to "attach"
+ * to them at backend startup. ShmemStructDesc contains all the information
+ * needed to manage the lifecycle.
+ *
+ * 'name', 'size' and the callback functions are filled in by the
+ * ShmemRegisterStruct() caller. After registration, the shmem machinery
+ * reserves memory for the area, sets *ptr to point to the allocation, and
+ * calls the callbacks at the right moments.
+ */
+typedef struct ShmemStructDesc
+{
+ /* Name of the shared memory area. Must be unique across the system */
+ const char *name;
+
+ /* Size of the shared memory area */
+ size_t size;
+
+ /*
+ * Initialization callback function. This is called when the shared
+ * memory area is allocated, usually at postmaster startup. 'init_fn_arg'
+ * is an opaque argument passed to the callback.
+ */
+ ShmemInitCallback init_fn;
+ void *init_fn_arg;
+
+ /*
+ * Attachment callback function. In EXEC_BACKEND mode, this is called at
+ * startup of each backend. In !EXEC_BACKEND mode, this is only called if
+ * the shared memory area is registered after postmaster startup. We
+ * never do that in core code, but extensions might.
+ */
+ ShmemAttachCallback attach_fn;
+ void *attach_fn_arg;
+
+ /*
+ * Extra space to reserve in the shared memory segment, but it's not part
+ * of the struct itself. This is used for shared memory hash tables that
+ * can grow beyond the initial size when more buckets are allocated.
+ */
+ size_t extra_size;
+
+ /*
+ * When the shmem area is initialized or attached to, pointer to it is
+ * stored in *ptr. It usually points to a global variable, used to access
+ * the shared memory area later. *ptr is set before the init_fn or
+ * attach_fn callback is called.
+ */
+ void **ptr;
+} ShmemStructDesc;
+
+/*
+ * Descriptor for a named shared memory hash table.
+ *
+ * Similar to ShmemStructDesc, but describes a shared memory hash table. Each
+ * hash table is backed by an allocated area, described by 'base_desc', but if
+ * 'max_size' is greater than 'init_size', it can also grow beyond the initial
+ * allocated area by allocating more hash entries from the global unreserved
+ * space.
+ */
+typedef struct ShmemHashDesc
+{
+ /* Name of the shared memory area. Must be unique across the system */
+ const char *name;
+
+ /*
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ */
+ size_t max_size;
+
+ /*
+ * init_size is the number of hashtable entries to preallocate. For a
+ * table whose maximum size is certain, this should be equal to max_size;
+ * that ensures that no run-time out-of-shared-memory failures can occur.
+ */
+ size_t init_size;
+
+ /*
+ * Hash table options passed to hash_create()
+ *
+ * hash_info and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values
+ * specific to shared-memory hash tables are added implicitly in
+ * ShmemRegisterHash(), except that callers may choose to specify
+ * HASH_PARTITION and/or HASH_FIXED_SIZE.
+ */
+ HASHCTL hash_info;
+ int hash_flags;
+
+ /*
+ * When the hash table is initialized or attached to, pointer to its
+ * backend-private handle is stored in *ptr. It usually points to a
+ * global variable, used to access the hash table later.
+ */
+ HTAB **ptr;
+
+ /*
+ * Descriptor for the underlying "area". Callers of ShmemRegisterHash()
+ * do not need to touch this, it is filled in by ShmemRegisterHash() based
+ * on the hash table parameters.
+ */
+ ShmemStructDesc base_desc;
+} ShmemHashDesc;
/* shmem.c */
extern PGDLLIMPORT slock_t *ShmemLock;
typedef struct PGShmemHeader PGShmemHeader; /* avoid including
* storage/pg_shmem.h here */
+extern void ResetShmemAllocator(void);
extern void InitShmemAllocator(PGShmemHeader *seghdr);
+#ifdef EXEC_BACKEND
+extern void AttachShmemAllocator(PGShmemHeader *seghdr);
+#endif
extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
extern bool ShmemAddrIsValid(const void *addr);
-extern void InitShmemIndex(void);
+
+extern bool ShmemRegisterStruct(ShmemStructDesc *desc);
+extern bool ShmemRegisterHash(ShmemHashDesc *desc);
+
+/* legacy shmem allocation functions */
extern HTAB *ShmemInitHash(const char *name, int64 init_size, int64 max_size,
HASHCTL *infoP, int hash_flags);
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
+
+extern size_t ShmemRegisteredSize(void);
+extern void ShmemInitRegistered(void);
+#ifdef EXEC_BACKEND
+extern void ShmemAttachRegistered(void);
+#endif
+
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
@@ -45,19 +172,4 @@ extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
-/* size constants for the shmem index table */
- /* max size of data structure string name */
-#define SHMEM_INDEX_KEYSIZE (48)
- /* estimated size of the shmem index table (not a hard limit) */
-#define SHMEM_INDEX_SIZE (64)
-
-/* this is a hash bucket in the shmem index table */
-typedef struct
-{
- char key[SHMEM_INDEX_KEYSIZE]; /* string name */
- void *location; /* location in shared mem */
- Size size; /* # bytes requested for the structure */
- Size allocated_size; /* # bytes actually allocated */
-} ShmemIndexEnt;
-
#endif /* SHMEM_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 49ad84a62d4..990d506a696 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2815,9 +2815,11 @@ SharedTypmodTableEntry
Sharedsort
ShellTypeInfo
ShippableCacheEntry
-ShmemAllocatorData
ShippableCacheKey
+ShmemAllocatorData
ShmemIndexEnt
+ShmemHashDesc
+ShmemStructDesc
ShutdownForeignScan_function
ShutdownInformation
ShutdownMode
--
2.47.3
[text/x-patch] v4-0002-Convert-pg_stat_statements-to-use-the-new-interfa.patch (10.1K, 3-v4-0002-Convert-pg_stat_statements-to-use-the-new-interfa.patch)
From ffa94e71e7933444dd7fbdadd35ebf6c415d7d5d Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 13 Mar 2026 23:00:00 +0200
Subject: [PATCH v4 2/3] Convert pg_stat_statements to use the new interface
As part of this, embed the LWLock it needs in the shared memory struct
itself, so that we don't need to use RequestNamedLWLockTranche()
anymore. LWLockNewTrancheId+LWLockInitialize is more convenient to use
in extensions.
---
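For readers skimming the conversion below, here is a minimal standalone
sketch of the register-then-initialize pattern it relies on. It is not
PostgreSQL code: DemoStructDesc, demo_register() and
demo_init_registered() are hypothetical stand-ins for ShmemStructDesc,
ShmemRegisterStruct() and ShmemInitRegistered(), calloc() stands in for
the shared memory allocator, and a plain int stands in for the embedded
LWLock. The point it illustrates is that *ptr is published before the
init callback runs, so the callback can use the global directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the PostgreSQL types. */
typedef void (*DemoInitCallback) (void *arg);

typedef struct DemoStructDesc
{
	const char *name;			/* unique name of the area */
	size_t		size;			/* bytes to reserve */
	DemoInitCallback init_fn;	/* called once, after allocation */
	void	   *init_fn_arg;
	void	  **ptr;			/* where to publish the allocation */
} DemoStructDesc;

#define DEMO_MAX_AREAS 8
static DemoStructDesc *demo_areas[DEMO_MAX_AREAS];
static int	demo_num_areas = 0;

/* Phase 1: remember the descriptor; nothing is allocated yet. */
static int
demo_register(DemoStructDesc *desc)
{
	if (demo_num_areas >= DEMO_MAX_AREAS)
		return 0;
	demo_areas[demo_num_areas++] = desc;
	return 1;
}

/* Phase 2: allocate each area, publish *ptr, then run its init callback. */
static void
demo_init_registered(void)
{
	for (int i = 0; i < demo_num_areas; i++)
	{
		DemoStructDesc *desc = demo_areas[i];
		void	   *area = calloc(1, desc->size);	/* stand-in for shmem */

		assert(area != NULL);
		*desc->ptr = area;		/* set before the callback is called */
		if (desc->init_fn)
			desc->init_fn(desc->init_fn_arg);
	}
}

/* Example "extension" state with its lock embedded in the struct. */
typedef struct DemoSharedState
{
	int			lock_tranche;	/* stand-in for an embedded lock */
	double		cur_median_usage;
} DemoSharedState;

static DemoSharedState *demo_state;

static void
demo_state_init(void *arg)
{
	(void) arg;
	/* demo_state is already valid here, because *ptr was set first */
	demo_state->lock_tranche = 42;
	demo_state->cur_median_usage = 10.0;
}

static DemoStructDesc demo_state_desc = {
	.name = "demo state",
	.size = sizeof(DemoSharedState),
	.init_fn = demo_state_init,
	.ptr = (void **) &demo_state,
};
```

In this sketch the caller registers demo_state_desc once at load time
and the "machinery" later calls demo_init_registered(), which mirrors
how the patch moves the first-time initialization out of the
shmem_startup hook and into an init callback.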
.../pg_stat_statements/pg_stat_statements.c | 161 ++++++++----------
1 file changed, 67 insertions(+), 94 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 4a427533bd8..cc18df34e70 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -248,7 +248,7 @@ typedef struct pgssEntry
*/
typedef struct pgssSharedState
{
- LWLock *lock; /* protects hashtable search/modification */
+ LWLockPadded lock; /* protects hashtable search/modification */
double cur_median_usage; /* current median usage in hashtable */
Size mean_query_len; /* current mean entry text length */
slock_t mutex; /* protects following fields only: */
@@ -258,13 +258,39 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+/* Links to shared memory state */
+static pgssSharedState *pgss;
+static HTAB *pgss_hash;
+
+static void pgss_shmem_init(void *arg);
+
+static ShmemStructDesc pgssSharedStateShmemDesc =
+{
+ .name = "pg_stat_statements",
+ .size = sizeof(pgssSharedState),
+ .init_fn = pgss_shmem_init,
+ .ptr = (void **) &pgss,
+};
+
+static ShmemHashDesc pgssSharedHashDesc =
+{
+ .name = "pg_stat_statements hash",
+ .ptr = &pgss_hash,
+
+ .init_size = 0, /* set from 'pgss_max' */
+ .max_size = 0, /* set from 'pgss_max' */
+ .hash_info.keysize = sizeof(pgssHashKey),
+ .hash_info.entrysize = sizeof(pgssEntry),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+};
+
+
/*---- Local variables ----*/
/* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
static int nesting_level = 0;
/* Saved hook values */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
static planner_hook_type prev_planner_hook = NULL;
@@ -274,10 +300,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static ProcessUtility_hook_type prev_ProcessUtility = NULL;
-/* Links to shared memory state */
-static pgssSharedState *pgss = NULL;
-static HTAB *pgss_hash = NULL;
-
/*---- GUC variables ----*/
typedef enum
@@ -330,7 +352,6 @@ PG_FUNCTION_INFO_V1(pg_stat_statements_1_13);
PG_FUNCTION_INFO_V1(pg_stat_statements);
PG_FUNCTION_INFO_V1(pg_stat_statements_info);
-static void pgss_shmem_request(void);
static void pgss_shmem_startup(void);
static void pgss_shmem_shutdown(int code, Datum arg);
static void pgss_post_parse_analyze(ParseState *pstate, Query *query,
@@ -365,7 +386,6 @@ static void pgss_store(const char *query, int64 queryId,
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
bool showtext);
-static Size pgss_memsize(void);
static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
int encoding, bool sticky);
static void entry_dealloc(void);
@@ -470,11 +490,18 @@ _PG_init(void)
MarkGUCPrefixReserved("pg_stat_statements");
+ /*
+ * Register our shared memory needs, including hash table
+ */
+ ShmemRegisterStruct(&pgssSharedStateShmemDesc);
+
+ pgssSharedHashDesc.init_size = pgss_max;
+ pgssSharedHashDesc.max_size = pgss_max;
+ ShmemRegisterHash(&pgssSharedHashDesc);
+
/*
* Install hooks.
*/
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = pgss_shmem_request;
prev_shmem_startup_hook = shmem_startup_hook;
shmem_startup_hook = pgss_shmem_startup;
prev_post_parse_analyze_hook = post_parse_analyze_hook;
@@ -493,31 +520,31 @@ _PG_init(void)
ProcessUtility_hook = pgss_ProcessUtility;
}
-/*
- * shmem_request hook: request additional shared resources. We'll allocate or
- * attach to the shared resources in pgss_shmem_startup().
- */
static void
-pgss_shmem_request(void)
+pgss_shmem_init(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
+ int tranche_id;
- RequestAddinShmemSpace(pgss_memsize());
- RequestNamedLWLockTranche("pg_stat_statements", 1);
+ tranche_id = LWLockNewTrancheId("pg_stat_statements");
+ LWLockInitialize(&pgss->lock.lock, tranche_id);
+ pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
+ pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ SpinLockInit(&pgss->mutex);
+ pgss->extent = 0;
+ pgss->n_writers = 0;
+ pgss->gc_count = 0;
+ pgss->stats.dealloc = 0;
+ pgss->stats.stats_reset = GetCurrentTimestamp();
}
/*
- * shmem_startup hook: allocate or attach to shared memory,
- * then load any pre-existing statistics from file.
- * Also create and load the query-texts file, which is expected to exist
- * (even if empty) while the module is enabled.
+ * shmem_startup hook: Load any pre-existing statistics from file at
+ * postmaster startup. Also create and load the query-texts file, which is
+ * expected to exist (even if empty) while the module is enabled.
*/
static void
pgss_shmem_startup(void)
{
- bool found;
- HASHCTL info;
FILE *file = NULL;
FILE *qfile = NULL;
uint32 header;
@@ -530,54 +557,14 @@ pgss_shmem_startup(void)
if (prev_shmem_startup_hook)
prev_shmem_startup_hook();
- /* reset in case this is a restart within the postmaster */
- pgss = NULL;
- pgss_hash = NULL;
+ if (IsUnderPostmaster)
+ return; /* nothing to do in backends */
/*
- * Create or attach to the shared memory state, including hash table
+ * Set up a shmem exit hook to dump the statistics to disk on postmaster
+ * (or standalone backend) exit.
*/
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- pgss = ShmemInitStruct("pg_stat_statements",
- sizeof(pgssSharedState),
- &found);
-
- if (!found)
- {
- /* First time through ... */
- pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
- pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
- SpinLockInit(&pgss->mutex);
- pgss->extent = 0;
- pgss->n_writers = 0;
- pgss->gc_count = 0;
- pgss->stats.dealloc = 0;
- pgss->stats.stats_reset = GetCurrentTimestamp();
- }
-
- info.keysize = sizeof(pgssHashKey);
- info.entrysize = sizeof(pgssEntry);
- pgss_hash = ShmemInitHash("pg_stat_statements hash",
- pgss_max, pgss_max,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- LWLockRelease(AddinShmemInitLock);
-
- /*
- * If we're in the postmaster (or a standalone backend...), set up a shmem
- * exit hook to dump the statistics to disk.
- */
- if (!IsUnderPostmaster)
- on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
-
- /*
- * Done if some other process already completed our initialization.
- */
- if (found)
- return;
+ on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
/*
* Note: we don't bother with locks here, because there should be no other
@@ -1337,7 +1324,7 @@ pgss_store(const char *query, int64 queryId,
key.toplevel = (nesting_level == 0);
/* Lookup the hash table entry with shared lock. */
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
@@ -1358,11 +1345,11 @@ pgss_store(const char *query, int64 queryId,
*/
if (jstate)
{
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
norm_query = generate_normalized_query(jstate, query,
query_location,
&query_len);
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
}
/* Append new query text to file with only shared lock held */
@@ -1377,8 +1364,8 @@ pgss_store(const char *query, int64 queryId,
do_gc = need_gc_qtexts();
/* Need exclusive lock to make a new hashtable entry - promote */
- LWLockRelease(pgss->lock);
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockRelease(&pgss->lock.lock);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
/*
* A garbage collection may have occurred while we weren't holding the
@@ -1517,7 +1504,7 @@ pgss_store(const char *query, int64 queryId,
}
done:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
/* We postpone this clean-up until we're out of the lock */
if (norm_query)
@@ -1806,7 +1793,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
* we need to partition the hash table to limit the time spent holding any
* one lock.
*/
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
if (showtext)
{
@@ -2043,7 +2030,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
}
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
free(qbuffer);
}
@@ -2082,20 +2069,6 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
-/*
- * Estimate shared memory space needed.
- */
-static Size
-pgss_memsize(void)
-{
- Size size;
-
- size = MAXALIGN(sizeof(pgssSharedState));
- size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
-
- return size;
-}
-
/*
* Allocate a new hashtable entry.
* caller must hold an exclusive lock on pgss->lock
@@ -2725,7 +2698,7 @@ entry_reset(Oid userid, Oid dbid, int64 queryid, bool minmax_only)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("pg_stat_statements must be loaded via \"shared_preload_libraries\"")));
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
num_entries = hash_get_num_entries(pgss_hash);
stats_reset = GetCurrentTimestamp();
@@ -2819,7 +2792,7 @@ done:
record_gc_qtexts();
release_lock:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
return stats_reset;
}
--
2.47.3
[text/x-patch] v4-0003-Use-the-new-mechanism-in-a-few-core-subsystems.patch (39.5K, 4-v4-0003-Use-the-new-mechanism-in-a-few-core-subsystems.patch)
download | inline diff:
From 1f0d7535e5c1e07dddaa5f0c093ece59716a3fd4 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 13 Mar 2026 23:01:44 +0200
Subject: [PATCH v4 3/3] Use the new mechanism in a few core subsystems
I chose these subsystems specifically because they have some
complicating properties, making them slightly harder to convert than
most:
- The initialization callbacks of some of these subsystems have
dependencies, i.e. they need to be initialized in the right order.
- The ProcGlobal pointer still needs to be inherited by the
BackendParameters mechanism on EXEC_BACKEND builds, because
ProcGlobal is required by InitProcess() to get a PGPROC entry, and
the PGPROC entry is required to use LWLocks, and usually attaching
to shared memory areas requires the use of LWLocks.
- Similarly, the ProcSignal pointer still needs to be handled by
BackendParameters, because query cancellation connections access it
without calling InitProcess().
I believe converting the rest of the subsystems after this will be
pretty mechanical.
---
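Editorial sketch, not part of the patch: the commit message notes that some
init callbacks depend on each other, so registration order matters. The
standalone mock below illustrates that point only. It assumes callbacks run in
registration order after all pointers have been assigned, which the excerpt
does not confirm. DemoStructDesc, demo_register(), demo_allocate_and_init()
and demo_startup() are hypothetical names; plain calloc() stands in for shared
memory. The two structs loosely mirror TransamVariables and procArray, where
ProcArrayShmemInit() writes TransamVariables->xactCompletionCount.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ShmemStructDesc; not the real definition. */
typedef struct DemoStructDesc
{
    const char *name;
    size_t      size;
    void      (*init_fn) (void *arg);
    void      **ptr;
} DemoStructDesc;

#define DEMO_MAX_DESCS 8
#define DEMO_ALIGN(sz) (((sz) + 15) & ~(size_t) 15)

static DemoStructDesc *demo_descs[DEMO_MAX_DESCS];
static int  demo_ndescs = 0;

/* Remember a descriptor; nothing is allocated yet. */
static void
demo_register(DemoStructDesc *desc)
{
    assert(demo_ndescs < DEMO_MAX_DESCS);
    demo_descs[demo_ndescs++] = desc;
}

/* Allocate one region, set every .ptr, then run callbacks in order. */
static void *
demo_allocate_and_init(void)
{
    size_t      total = 0;
    char       *base;
    char       *next;

    for (int i = 0; i < demo_ndescs; i++)
        total += DEMO_ALIGN(demo_descs[i]->size);

    base = calloc(1, total > 0 ? total : 1);
    assert(base != NULL);

    next = base;
    for (int i = 0; i < demo_ndescs; i++)
    {
        *demo_descs[i]->ptr = next;
        next += DEMO_ALIGN(demo_descs[i]->size);
    }
    for (int i = 0; i < demo_ndescs; i++)
    {
        if (demo_descs[i]->init_fn != NULL)
            demo_descs[i]->init_fn(NULL);
    }
    return base;
}

typedef struct DemoCounter { int completion_count; } DemoCounter;
typedef struct DemoArray { int num_procs; } DemoArray;

static void *demo_counter_mem;
static void *demo_array_mem;

static void
demo_counter_init(void *arg)
{
    (void) arg;
    memset(demo_counter_mem, 0, sizeof(DemoCounter));
}

/* Depends on the counter struct having been initialized first. */
static void
demo_array_init(void *arg)
{
    DemoArray  *array = demo_array_mem;
    DemoCounter *counter = demo_counter_mem;

    (void) arg;
    array->num_procs = 0;
    counter->completion_count = 1;
}

static DemoStructDesc demo_counter_desc = {
    .name = "DemoCounter",
    .size = sizeof(DemoCounter),
    .init_fn = demo_counter_init,
    .ptr = &demo_counter_mem,
};

static DemoStructDesc demo_array_desc = {
    .name = "DemoArray",
    .size = sizeof(DemoArray),
    .init_fn = demo_array_init,
    .ptr = &demo_array_mem,
};

/* Returns completion_count as seen after all init callbacks have run. */
int
demo_startup(int counter_first)
{
    void       *base;
    int         result;

    demo_ndescs = 0;
    demo_register(counter_first ? &demo_counter_desc : &demo_array_desc);
    demo_register(counter_first ? &demo_array_desc : &demo_counter_desc);

    base = demo_allocate_and_init();
    result = ((DemoCounter *) demo_counter_mem)->completion_count;
    free(base);
    return result;
}
```

In this mock, demo_startup(1) registers the counter first and the dependent
write survives, while demo_startup(0) reverses the order and the later
memset() wipes it. That is the kind of ordering hazard the first bullet in the
commit message refers to.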
src/backend/access/transam/varsup.c | 33 ++---
src/backend/storage/ipc/dsm.c | 45 +++---
src/backend/storage/ipc/dsm_registry.c | 34 ++---
src/backend/storage/ipc/ipci.c | 45 +++---
src/backend/storage/ipc/pmsignal.c | 49 ++++---
src/backend/storage/ipc/procarray.c | 122 ++++++++--------
src/backend/storage/ipc/procsignal.c | 63 ++++-----
src/backend/storage/ipc/sinvaladt.c | 36 ++---
src/backend/storage/lmgr/proc.c | 178 ++++++++++++------------
src/backend/utils/activity/wait_event.c | 94 ++++++-------
src/include/access/transam.h | 3 +-
src/include/storage/dsm_registry.h | 3 +-
src/include/storage/pmsignal.h | 3 +-
src/include/storage/proc.h | 2 +-
src/include/storage/procarray.h | 3 +-
src/include/storage/procsignal.h | 3 +-
src/include/storage/sinvaladt.h | 3 +-
src/include/utils/wait_event.h | 3 +-
18 files changed, 360 insertions(+), 362 deletions(-)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 3e95d4cfd16..84e6c90f4fa 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,35 +30,32 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+static void VarsupShmemInit(void *arg);
+
/* pointer to variables struct in shared memory */
TransamVariablesData *TransamVariables = NULL;
+static ShmemStructDesc TransamVariablesShmemDesc = {
+ .name = "TransamVariables",
+ .size = sizeof(TransamVariablesData),
+ .init_fn = VarsupShmemInit,
+ .ptr = (void **) &TransamVariables,
+};
/*
* Initialization of shared memory for TransamVariables.
*/
-Size
-VarsupShmemSize(void)
+void
+VarsupShmemRegister(void)
{
- return sizeof(TransamVariablesData);
+ ShmemRegisterStruct(&TransamVariablesShmemDesc);
}
-void
-VarsupShmemInit(void)
-{
- bool found;
+static void
+VarsupShmemInit(void *arg)
- /* Initialize our shared state struct */
- TransamVariables = ShmemInitStruct("TransamVariables",
- sizeof(TransamVariablesData),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- memset(TransamVariables, 0, sizeof(TransamVariablesData));
- }
- else
- Assert(found);
+{
+ memset(TransamVariables, 0, sizeof(TransamVariablesData));
}
/*
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 6a5b16392f7..f7c18eaf385 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -110,6 +110,15 @@ static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
static void *dsm_main_space_begin = NULL;
+static void dsm_main_space_init(void *);
+
+static ShmemStructDesc dsm_main_space_shmem_desc = {
+ .name = "Preallocated DSM",
+ .size = 0, /* calculated later */
+ .init_fn = dsm_main_space_init,
+ .ptr = &dsm_main_space_begin,
+};
+
/*
* List of dynamic shared memory segments used by this backend.
*
@@ -479,27 +488,29 @@ void
dsm_shmem_init(void)
{
size_t size = dsm_estimate_size();
- bool found;
if (size == 0)
return;
- dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, &found);
- if (!found)
- {
- FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
- size_t first_page = 0;
- size_t pages;
-
- /* Reserve space for the FreePageManager. */
- while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
- ++first_page;
-
- /* Initialize it and give it all the rest of the space. */
- FreePageManagerInitialize(fpm, dsm_main_space_begin);
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ ShmemRegisterStruct(&dsm_main_space_shmem_desc);
+}
+
+static void
+dsm_main_space_init(void *arg)
+{
+ size_t size = dsm_main_space_shmem_desc.size;
+ FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
+ size_t first_page = 0;
+ size_t pages;
+
+ /* Reserve space for the FreePageManager. */
+ while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
+ ++first_page;
+
+ /* Initialize it and give it all the rest of the space. */
+ FreePageManagerInitialize(fpm, dsm_main_space_begin);
+ pages = (size / FPM_PAGE_SIZE) - first_page;
+ FreePageManagerPut(fpm, first_page, pages);
}
/*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 068c1577b12..60c84471221 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -56,6 +56,16 @@ typedef struct DSMRegistryCtxStruct
static DSMRegistryCtxStruct *DSMRegistryCtx;
+static void DSMRegistryCtxShmemInit(void *arg);
+
+static ShmemStructDesc DSMRegistryCtxShmemDesc = {
+ .name = "DSM Registry Data",
+ .size = sizeof(DSMRegistryCtxStruct),
+ .init_fn = DSMRegistryCtxShmemInit,
+ .ptr = (void **) &DSMRegistryCtx,
+};
+
+
typedef struct NamedDSMState
{
dsm_handle handle;
@@ -113,27 +123,17 @@ static const dshash_parameters dsh_params = {
static dsa_area *dsm_registry_dsa;
static dshash_table *dsm_registry_table;
-Size
-DSMRegistryShmemSize(void)
+void
+DSMRegistryShmemRegister(void)
{
- return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+ ShmemRegisterStruct(&DSMRegistryCtxShmemDesc);
}
-void
-DSMRegistryShmemInit(void)
+static void
+DSMRegistryCtxShmemInit(void *arg)
{
- bool found;
-
- DSMRegistryCtx = (DSMRegistryCtxStruct *)
- ShmemInitStruct("DSM Registry Data",
- DSMRegistryShmemSize(),
- &found);
-
- if (!found)
- {
- DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
- DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
- }
+ DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+ DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 405c69655f0..cc32a292b17 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -102,15 +102,14 @@ CalculateShmemSize(void)
size = add_size(size, ShmemRegisteredSize());
size = add_size(size, dsm_estimate_size());
- size = add_size(size, DSMRegistryShmemSize());
+
+ size = add_size(size, ShmemRegisteredSize());
/* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
- size = add_size(size, ProcGlobalShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, VarsupShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
size = add_size(size, CLOGShmemSize());
@@ -120,11 +119,7 @@ CalculateShmemSize(void)
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
size = add_size(size, LWLockShmemSize());
- size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, SharedInvalShmemSize());
- size = add_size(size, PMSignalShmemSize());
- size = add_size(size, ProcSignalShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
size = add_size(size, ReplicationSlotsShmemSize());
@@ -138,7 +133,6 @@ CalculateShmemSize(void)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, StatsShmemSize());
- size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, InjectionPointShmemSize());
size = add_size(size, SlotSyncShmemSize());
size = add_size(size, AioShmemSize());
@@ -246,11 +240,25 @@ void
RegisterShmemStructs(void)
{
/*
- * TODO: Not used in any built-in subsystems yet. In the future, most of
- * the calls *ShmemInit() calls in CreateOrAttachShmemStructs(), and
- * *ShmemSize() calls in CalculateShmemSize() will be replaced by calls
- * into the subsystems from here.
+ * TODO: In the future, all the *ShmemInit() calls in
+ * CreateOrAttachShmemStructs(), and all the *ShmemSize() calls in
+ * CalculateShmemSize() will be replaced by calls into the subsystems from
+ * here.
+ */
+
+ /*
+ * Note: there are some inter-dependencies between these, so the order of
+ * some of these matters.
*/
+ DSMRegistryShmemRegister();
+ ProcGlobalShmemRegister();
+ VarsupShmemRegister();
+ ProcArrayShmemRegister();
+ SharedInvalShmemRegister();
+ PMSignalShmemRegister();
+ ProcSignalShmemRegister();
+
+ WaitEventCustomShmemRegister();
}
/*
@@ -293,12 +301,10 @@ CreateOrAttachShmemStructs(void)
}
dsm_shmem_init();
- DSMRegistryShmemInit();
/*
* Set up xlog, clog, and buffers
*/
- VarsupShmemInit();
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
@@ -321,23 +327,13 @@ CreateOrAttachShmemStructs(void)
/*
* Set up process table
*/
- if (!IsUnderPostmaster)
- InitProcGlobal();
- ProcArrayShmemInit();
BackendStatusShmemInit();
TwoPhaseShmemInit();
BackgroundWorkerShmemInit();
- /*
- * Set up shared-inval messaging
- */
- SharedInvalShmemInit();
-
/*
* Set up interprocess signaling mechanisms
*/
- PMSignalShmemInit();
- ProcSignalShmemInit();
CheckpointerShmemInit();
AutoVacuumShmemInit();
ReplicationSlotsShmemInit();
@@ -356,7 +352,6 @@ CreateOrAttachShmemStructs(void)
SyncScanShmemInit();
AsyncShmemInit();
StatsShmemInit();
- WaitEventCustomShmemInit();
InjectionPointShmemInit();
AioShmemInit();
WaitLSNShmemInit();
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 4618820b337..d7463df73ab 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -83,6 +83,15 @@ struct PMSignalData
/* PMSignalState pointer is valid in both postmaster and child processes */
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+static void PMSignalShmemInit(void *);
+
+static ShmemStructDesc PMSignalShmemDesc = {
+ .name = "PMSignalState",
+ .size = 0, /* calculated later */
+ .init_fn = PMSignalShmemInit,
+ .ptr = (void **) &PMSignalState,
+};
+
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
* postmaster. Postmaster keeps a local copy so that it doesn't need to
@@ -123,39 +132,29 @@ postmaster_death_handler(SIGNAL_ARGS)
static void MarkPostmasterChildInactive(int code, Datum arg);
/*
- * PMSignalShmemSize
- * Compute space needed for pmsignal.c's shared memory
+ * PMSignalShmemRegister - Register pmsignal.c's shared memory needs
*/
-Size
-PMSignalShmemSize(void)
+void
+PMSignalShmemRegister(void)
{
Size size;
- size = offsetof(PMSignalData, PMChildFlags);
- size = add_size(size, mul_size(MaxLivePostmasterChildren(),
- sizeof(sig_atomic_t)));
+ num_child_flags = MaxLivePostmasterChildren();
- return size;
+ size = offsetof(PMSignalData, PMChildFlags);
+ size = add_size(size, mul_size(num_child_flags, sizeof(sig_atomic_t)));
+ PMSignalShmemDesc.size = size;
+ ShmemRegisterStruct(&PMSignalShmemDesc);
}
-/*
- * PMSignalShmemInit - initialize during shared-memory creation
- */
-void
-PMSignalShmemInit(void)
+static void
+PMSignalShmemInit(void *arg)
{
- bool found;
-
- PMSignalState = (PMSignalData *)
- ShmemInitStruct("PMSignalState", PMSignalShmemSize(), &found);
-
- if (!found)
- {
- /* initialize all flags to zeroes */
- MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemSize());
- num_child_flags = MaxLivePostmasterChildren();
- PMSignalState->num_child_flags = num_child_flags;
- }
+ /* initialize all flags to zeroes */
+ Assert(PMSignalState);
+ MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemDesc.size);
+ Assert(num_child_flags > 0);
+ PMSignalState->num_child_flags = num_child_flags;
}
/*
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 0f913897acc..5daf33d5323 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -103,6 +103,19 @@ typedef struct ProcArrayStruct
int pgprocnos[FLEXIBLE_ARRAY_MEMBER];
} ProcArrayStruct;
+static void ProcArrayShmemInit(void *arg);
+static void ProcArrayShmemAttach(void *arg);
+
+static ProcArrayStruct *procArray;
+
+static ShmemStructDesc ProcArrayShmemDesc = {
+ .name = "Proc Array",
+ .size = 0, /* calculated later */
+ .init_fn = ProcArrayShmemInit,
+ .attach_fn = ProcArrayShmemAttach,
+ .ptr = (void **) &procArray,
+};
+
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -269,9 +282,6 @@ typedef enum KAXCompressReason
KAX_STARTUP_PROCESS_IDLE, /* startup process is about to sleep */
} KAXCompressReason;
-
-static ProcArrayStruct *procArray;
-
static PGPROC *allProcs;
/*
@@ -282,8 +292,25 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
+
static TransactionId *KnownAssignedXids;
+
+static ShmemStructDesc KnownAssignedXidsShmemDesc = {
+ .name = "KnownAssignedXids",
+ .size = 0, /* calculated later */
+ .init_fn = NULL, /* no initialization needed */
+ .ptr = (void **) &KnownAssignedXids,
+};
+
static bool *KnownAssignedXidsValid;
+
+static ShmemStructDesc KnownAssignedXidsValidShmemDesc = {
+ .name = "KnownAssignedXidsValid",
+ .size = 0, /* calculated later */
+ .init_fn = NULL, /* no initialization needed */
+ .ptr = (void **) &KnownAssignedXidsValid,
+};
+
static TransactionId latestObservedXid = InvalidTransactionId;
/*
@@ -374,18 +401,18 @@ static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
static void GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons);
/*
- * Report shared-memory space needed by ProcArrayShmemInit
+ * Register the shared PGPROC array during postmaster startup.
*/
-Size
-ProcArrayShmemSize(void)
+void
+ProcArrayShmemRegister(void)
{
- Size size;
-
- /* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, pgprocnos);
- size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
+ /* Register the ProcArray shared structure */
+ ProcArrayShmemDesc.size =
+ add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int), PROCARRAY_MAXPROCS));
+ ShmemRegisterStruct(&ProcArrayShmemDesc);
/*
* During Hot Standby processing we have a data structure called
@@ -405,64 +432,41 @@ ProcArrayShmemSize(void)
if (EnableHotStandby)
{
- size = add_size(size,
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS));
- size = add_size(size,
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS));
+ KnownAssignedXidsShmemDesc.size =
+ mul_size(sizeof(TransactionId),
+ TOTAL_MAX_CACHED_SUBXIDS);
+ ShmemRegisterStruct(&KnownAssignedXidsShmemDesc);
+
+ KnownAssignedXidsValidShmemDesc.size =
+ mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS);
+ ShmemRegisterStruct(&KnownAssignedXidsValidShmemDesc);
}
-
- return size;
}
/*
* Initialize the shared PGPROC array during postmaster startup.
*/
-void
-ProcArrayShmemInit(void)
+static void
+ProcArrayShmemInit(void *arg)
{
- bool found;
-
- /* Create or attach to the ProcArray shared structure */
- procArray = (ProcArrayStruct *)
- ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, pgprocnos),
- mul_size(sizeof(int),
- PROCARRAY_MAXPROCS)),
- &found);
-
- if (!found)
- {
- /*
- * We're the first - initialize.
- */
- procArray->numProcs = 0;
- procArray->maxProcs = PROCARRAY_MAXPROCS;
- procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
- procArray->numKnownAssignedXids = 0;
- procArray->tailKnownAssignedXids = 0;
- procArray->headKnownAssignedXids = 0;
- procArray->lastOverflowedXid = InvalidTransactionId;
- procArray->replication_slot_xmin = InvalidTransactionId;
- procArray->replication_slot_catalog_xmin = InvalidTransactionId;
- TransamVariables->xactCompletionCount = 1;
- }
+ procArray->numProcs = 0;
+ procArray->maxProcs = PROCARRAY_MAXPROCS;
+ procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
+ procArray->numKnownAssignedXids = 0;
+ procArray->tailKnownAssignedXids = 0;
+ procArray->headKnownAssignedXids = 0;
+ procArray->lastOverflowedXid = InvalidTransactionId;
+ procArray->replication_slot_xmin = InvalidTransactionId;
+ procArray->replication_slot_catalog_xmin = InvalidTransactionId;
+ TransamVariables->xactCompletionCount = 1;
allProcs = ProcGlobal->allProcs;
+}
- /* Create or attach to the KnownAssignedXids arrays too, if needed */
- if (EnableHotStandby)
- {
- KnownAssignedXids = (TransactionId *)
- ShmemInitStruct("KnownAssignedXids",
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- KnownAssignedXidsValid = (bool *)
- ShmemInitStruct("KnownAssignedXidsValid",
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- }
+static void
+ProcArrayShmemAttach(void *arg)
+{
+ allProcs = ProcGlobal->allProcs;
}
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7e017c8d53b..a0c94ddd77c 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -105,7 +105,18 @@ struct ProcSignalHeader
#define BARRIER_CLEAR_BIT(flags, type) \
((flags) &= ~(((uint32) 1) << (uint32) (type)))
+static void ProcSignalShmemInit(void *arg);
+
NON_EXEC_STATIC ProcSignalHeader *ProcSignal = NULL;
+
+static ShmemStructDesc ProcSignalShmemDesc = {
+ .name = "ProcSignal",
+ .size = 0, /* calculated later */
+ .init_fn = ProcSignalShmemInit,
+ .ptr = (void **) &ProcSignal,
+};
+
+
static ProcSignalSlot *MyProcSignalSlot = NULL;
static bool CheckProcSignal(ProcSignalReason reason);
@@ -113,51 +124,37 @@ static void CleanupProcSignalState(int status, Datum arg);
static void ResetProcSignalBarrierBits(uint32 flags);
/*
- * ProcSignalShmemSize
- * Compute space needed for ProcSignal's shared memory
+ * ProcSignalShmemRegister
+ * Register ProcSignal's shared memory needs at postmaster startup
*/
-Size
-ProcSignalShmemSize(void)
+void
+ProcSignalShmemRegister(void)
{
Size size;
size = mul_size(NumProcSignalSlots, sizeof(ProcSignalSlot));
size = add_size(size, offsetof(ProcSignalHeader, psh_slot));
- return size;
+
+ ProcSignalShmemDesc.size = size;
+ ShmemRegisterStruct(&ProcSignalShmemDesc);
}
-/*
- * ProcSignalShmemInit
- * Allocate and initialize ProcSignal's shared memory
- */
-void
-ProcSignalShmemInit(void)
+static void
+ProcSignalShmemInit(void *arg)
{
- Size size = ProcSignalShmemSize();
- bool found;
+ pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
- ProcSignal = (ProcSignalHeader *)
- ShmemInitStruct("ProcSignal", size, &found);
-
- /* If we're first, initialize. */
- if (!found)
+ for (int i = 0; i < NumProcSignalSlots; ++i)
{
- int i;
-
- pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
+ ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
- for (i = 0; i < NumProcSignalSlots; ++i)
- {
- ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
-
- SpinLockInit(&slot->pss_mutex);
- pg_atomic_init_u32(&slot->pss_pid, 0);
- slot->pss_cancel_key_len = 0;
- MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
- pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
- pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
- ConditionVariableInit(&slot->pss_barrierCV);
- }
+ SpinLockInit(&slot->pss_mutex);
+ pg_atomic_init_u32(&slot->pss_pid, 0);
+ slot->pss_cancel_key_len = 0;
+ MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
+ pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
+ pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
+ ConditionVariableInit(&slot->pss_barrierCV);
}
}
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index a7a7cc4f0a9..b9df3b84de9 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -203,8 +203,17 @@ typedef struct SISeg
*/
#define NumProcStateSlots (MaxBackends + NUM_AUXILIARY_PROCS)
+static void SharedInvalShmemInit(void *arg);
+
static SISeg *shmInvalBuffer; /* pointer to the shared inval buffer */
+static ShmemStructDesc SharedInvalShmemDesc = {
+ .name = "shmInvalBuffer",
+ .size = 0, /* calculated later */
+ .init_fn = SharedInvalShmemInit,
+ .ptr = (void **) &shmInvalBuffer,
+};
+
static LocalTransactionId nextLocalTransactionId;
@@ -212,10 +221,11 @@ static void CleanupInvalidationState(int status, Datum arg);
/*
- * SharedInvalShmemSize --- return shared-memory space needed
+ * SharedInvalShmemRegister
+ * Register shared memory needs for the SI message buffer
*/
-Size
-SharedInvalShmemSize(void)
+void
+SharedInvalShmemRegister(void)
{
Size size;
@@ -223,26 +233,16 @@ SharedInvalShmemSize(void)
size = add_size(size, mul_size(sizeof(ProcState), NumProcStateSlots)); /* procState */
size = add_size(size, mul_size(sizeof(int), NumProcStateSlots)); /* pgprocnos */
- return size;
+ SharedInvalShmemDesc.size = size;
+ ShmemRegisterStruct(&SharedInvalShmemDesc);
}
-/*
- * SharedInvalShmemInit
- * Create and initialize the SI message buffer
- */
-void
-SharedInvalShmemInit(void)
+static void
+SharedInvalShmemInit(void *arg)
{
int i;
- bool found;
-
- /* Allocate space in shared memory */
- shmInvalBuffer = (SISeg *)
- ShmemInitStruct("shmInvalBuffer", SharedInvalShmemSize(), &found);
- if (found)
- return;
- /* Clear message counters, save size of procState array, init spinlock */
+ /* Clear message counters, init spinlock */
shmInvalBuffer->minMsgNum = 0;
shmInvalBuffer->maxMsgNum = 0;
shmInvalBuffer->nextThreshold = CLEANUP_MIN;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index d407725e602..961fcb2bc67 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -69,9 +69,41 @@ PGPROC *MyProc = NULL;
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
+static void *tmpAllProcs;
+static void *tmpFastPathLockArray;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
PGPROC *PreparedXactProcs = NULL;
+static void ProcGlobalShmemInit(void *arg);
+
+static ShmemStructDesc ProcGlobalShmemDesc = {
+ .name = "Proc Header",
+ .size = sizeof(PROC_HDR),
+ .init_fn = ProcGlobalShmemInit,
+
+ /*
+ * ProcGlobal is registered here in .ptr as usual, but it needs to be
+ * propagated specially in EXEC_BACKEND mode, because ProcGlobal needs to
+ * be accessed early at backend startup, before ShmemAttachRegistered()
+ * has been called.
+ */
+ .ptr = (void **) &ProcGlobal,
+};
+
+static ShmemStructDesc ProcGlobalAllProcsShmemDesc = {
+ .name = "PGPROC structures",
+ .size = 0, /* calculated later */
+ .ptr = (void **) &tmpAllProcs,
+};
+
+static ShmemStructDesc FastPathLockArrayShmemDesc = {
+ .name = "Fast-Path Lock Array",
+ .size = 0, /* calculated later */
+ .ptr = (void **) &tmpFastPathLockArray,
+};
+
+static uint32 TotalProcs;
+
/* Is a deadlock check pending? */
static volatile sig_atomic_t got_deadlock_timeout;
@@ -81,24 +113,6 @@ static void AuxiliaryProcKill(int code, Datum arg);
static DeadLockState CheckDeadLock(void);
-/*
- * Report shared-memory space needed by PGPROC.
- */
-static Size
-PGProcShmemSize(void)
-{
- Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
-
- size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
-
- return size;
-}
-
/*
* Report shared-memory space needed by Fast-Path locks.
*/
@@ -106,8 +120,6 @@ static Size
FastPathLockShmemSize(void)
{
Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
Size fpLockBitsSize,
fpRelIdSize;
@@ -123,25 +135,6 @@ FastPathLockShmemSize(void)
return size;
}
-/*
- * Report shared-memory space needed by InitProcGlobal.
- */
-Size
-ProcGlobalShmemSize(void)
-{
- Size size = 0;
-
- /* ProcGlobal */
- size = add_size(size, sizeof(PROC_HDR));
- size = add_size(size, sizeof(slock_t));
-
- size = add_size(size, PGSemaphoreShmemSize(ProcGlobalSemas()));
- size = add_size(size, PGProcShmemSize());
- size = add_size(size, FastPathLockShmemSize());
-
- return size;
-}
-
/*
* Report number of semaphores needed by InitProcGlobal.
*/
@@ -156,7 +149,50 @@ ProcGlobalSemas(void)
}
/*
- * InitProcGlobal -
+ * ProcGlobalShmemRegister -
+ * Register shared memory needs.
+ *
+ * This is called during postmaster or standalone backend startup, and also
+ * during backend startup in EXEC_BACKEND mode.
+ */
+void
+ProcGlobalShmemRegister(void)
+{
+ Size size = 0;
+
+ /*
+ * Reserve all the PGPROC structures we'll need. There are six separate
+ * consumers: (1) normal backends, (2) autovacuum workers and special
+ * workers, (3) background workers, (4) walsenders, (5) auxiliary
+ * processes, and (6) prepared transactions. (For largely-historical
+ * reasons, we combine autovacuum and special workers into one category
+ * with a single freelist.) Each PGPROC structure is dedicated to exactly
+ * one of these purposes, and they do not move between groups.
+ */
+ TotalProcs =
+ add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
+
+ size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
+
+ ProcGlobalAllProcsShmemDesc.size = size;
+ ShmemRegisterStruct(&ProcGlobalAllProcsShmemDesc);
+
+ FastPathLockArrayShmemDesc.size = FastPathLockShmemSize();
+ ShmemRegisterStruct(&FastPathLockArrayShmemDesc);
+
+ /*
+ * Register the ProcGlobal shared structure last. Its init callback
+ * initializes the others too.
+ */
+ ShmemRegisterStruct(&ProcGlobalShmemDesc);
+}
+
+
+/*
+ * ProcGlobalShmemInit -
* Initialize the global process table during postmaster or standalone
* backend startup.
*
@@ -175,36 +211,23 @@ ProcGlobalSemas(void)
* Another reason for creating semaphores here is that the semaphore
* implementation typically requires us to create semaphores in the
* postmaster, not in backends.
- *
- * Note: this is NOT called by individual backends under a postmaster,
- * not even in the EXEC_BACKEND case. The ProcGlobal and AuxiliaryProcs
- * pointers must be propagated specially for EXEC_BACKEND operation.
*/
-void
-InitProcGlobal(void)
+static void
+ProcGlobalShmemInit(void *arg)
{
+ char *ptr;
+ size_t requestSize;
PGPROC *procs;
int i,
j;
- bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
/* Used for setup of per-backend fast-path slots. */
char *fpPtr,
*fpEndPtr PG_USED_FOR_ASSERTS_ONLY;
Size fpLockBitsSize,
fpRelIdSize;
- Size requestSize;
- char *ptr;
-
- /* Create the ProcGlobal shared structure */
- ProcGlobal = (PROC_HDR *)
- ShmemInitStruct("Proc Header", sizeof(PROC_HDR), &found);
- Assert(!found);
- /*
- * Initialize the data structures.
- */
+ Assert(ProcGlobal);
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
SpinLockInit(&ProcGlobal->freeProcsLock);
dlist_init(&ProcGlobal->freeProcs);
@@ -217,23 +240,12 @@ InitProcGlobal(void)
pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PROC_NUMBER);
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
- /*
- * Create and initialize all the PGPROC structures we'll need. There are
- * six separate consumers: (1) normal backends, (2) autovacuum workers and
- * special workers, (3) background workers, (4) walsenders, (5) auxiliary
- * processes, and (6) prepared transactions. (For largely-historical
- * reasons, we combine autovacuum and special workers into one category
- * with a single freelist.) Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
- */
- requestSize = PGProcShmemSize();
-
- ptr = ShmemInitStruct("PGPROC structures",
- requestSize,
- &found);
-
+ Assert(tmpAllProcs);
+ ptr = tmpAllProcs;
+ requestSize = ProcGlobalAllProcsShmemDesc.size;
MemSet(ptr, 0, requestSize);
+ /* Carve out the allProcs array from the shared memory area */
procs = (PGPROC *) ptr;
ptr = ptr + TotalProcs * sizeof(PGPROC);
@@ -242,7 +254,7 @@ InitProcGlobal(void)
ProcGlobal->allProcCount = MaxBackends + NUM_AUXILIARY_PROCS;
/*
- * Allocate arrays mirroring PGPROC fields in a dense manner. See
+ * Carve out arrays mirroring PGPROC fields in a dense manner. See
* PROC_HDR.
*
* XXX: It might make sense to increase padding for these arrays, given
@@ -257,31 +269,25 @@ InitProcGlobal(void)
ProcGlobal->statusFlags = (uint8 *) ptr;
ptr = ptr + (TotalProcs * sizeof(*ProcGlobal->statusFlags));
- /* make sure wer didn't overflow */
+ /* make sure we didn't overflow */
Assert((ptr > (char *) procs) && (ptr <= (char *) procs + requestSize));
/*
- * Allocate arrays for fast-path locks. Those are variable-length, so
+ * Initialize arrays for fast-path locks. Those are variable-length, so
* can't be included in PGPROC directly. We allocate a separate piece of
* shared memory and then divide that between backends.
*/
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- requestSize = FastPathLockShmemSize();
-
- fpPtr = ShmemInitStruct("Fast-Path Lock Array",
- requestSize,
- &found);
-
- MemSet(fpPtr, 0, requestSize);
+ Assert(tmpFastPathLockArray);
+ fpPtr = tmpFastPathLockArray;
+ requestSize = FastPathLockArrayShmemDesc.size;
+ memset(fpPtr, 0, requestSize);
/* For asserts checking we did not overflow. */
fpEndPtr = fpPtr + requestSize;
- /* Reserve space for semaphores. */
- PGReserveSemaphores(ProcGlobalSemas());
-
for (i = 0; i < TotalProcs; i++)
{
PGPROC *proc = &procs[i];
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index aca2c8fc742..d90293fe697 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -79,6 +79,30 @@ typedef struct WaitEventCustomEntryByName
uint32 wait_event_info;
} WaitEventCustomEntryByName;
+static ShmemHashDesc WaitEventCustomHashByInfoDesc =
+{
+ .name = "WaitEventCustom hash by wait event information",
+ .ptr = &WaitEventCustomHashByInfo,
+
+ .init_size = WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
+ .max_size = WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
+ .hash_info.keysize = sizeof(uint32),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByInfo),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+};
+
+static ShmemHashDesc WaitEventCustomHashByNameDesc =
+{
+ .name = "WaitEventCustom hash by name",
+ .ptr = &WaitEventCustomHashByName,
+
+ .init_size = WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
+ .max_size = WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
+ /* key is a NULL-terminated string */
+ .hash_info.keysize = sizeof(char[NAMEDATALEN]),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByName),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+};
/* dynamic allocation counter for custom wait events */
typedef struct WaitEventCustomCounterData
@@ -90,6 +114,16 @@ typedef struct WaitEventCustomCounterData
/* pointer to the shared memory */
static WaitEventCustomCounterData *WaitEventCustomCounter;
+static void WaitEventCustomCounterDataShmemInit(void *arg);
+
+static ShmemStructDesc WaitEventCustomCounterShmemDesc =
+{
+ .name = "WaitEventCustomCounterData",
+ .size = sizeof(WaitEventCustomCounterData),
+ .init_fn = WaitEventCustomCounterDataShmemInit,
+ .ptr = (void **) &WaitEventCustomCounter,
+};
+
/* first event ID of custom wait events */
#define WAIT_EVENT_CUSTOM_INITIAL_ID 1
@@ -97,60 +131,22 @@ static uint32 WaitEventCustomNew(uint32 classId, const char *wait_event_name);
static const char *GetWaitEventCustomIdentifier(uint32 wait_event_info);
/*
- * Return the space for dynamic shared hash tables and dynamic allocation counter.
+ * Register shmem space for dynamic shared hash and dynamic allocation counter.
*/
-Size
-WaitEventCustomShmemSize(void)
+void
+WaitEventCustomShmemRegister(void)
{
- Size sz;
-
- sz = MAXALIGN(sizeof(WaitEventCustomCounterData));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- sizeof(WaitEventCustomEntryByInfo)));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- sizeof(WaitEventCustomEntryByName)));
- return sz;
+ ShmemRegisterStruct(&WaitEventCustomCounterShmemDesc);
+ ShmemRegisterHash(&WaitEventCustomHashByInfoDesc);
+ ShmemRegisterHash(&WaitEventCustomHashByNameDesc);
}
-/*
- * Allocate shmem space for dynamic shared hash and dynamic allocation counter.
- */
-void
-WaitEventCustomShmemInit(void)
+static void
+WaitEventCustomCounterDataShmemInit(void *arg)
{
- bool found;
- HASHCTL info;
-
- WaitEventCustomCounter = (WaitEventCustomCounterData *)
- ShmemInitStruct("WaitEventCustomCounterData",
- sizeof(WaitEventCustomCounterData), &found);
-
- if (!found)
- {
- /* initialize the allocation counter and its spinlock. */
- WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
- SpinLockInit(&WaitEventCustomCounter->mutex);
- }
-
- /* initialize or attach the hash tables to store custom wait events */
- info.keysize = sizeof(uint32);
- info.entrysize = sizeof(WaitEventCustomEntryByInfo);
- WaitEventCustomHashByInfo =
- ShmemInitHash("WaitEventCustom hash by wait event information",
- WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
- WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- /* key is a NULL-terminated string */
- info.keysize = sizeof(char[NAMEDATALEN]);
- info.entrysize = sizeof(WaitEventCustomEntryByName);
- WaitEventCustomHashByName =
- ShmemInitHash("WaitEventCustom hash by name",
- WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
- WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- &info,
- HASH_ELEM | HASH_STRINGS);
+ /* initialize the allocation counter and its spinlock. */
+ WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
+ SpinLockInit(&WaitEventCustomCounter->mutex);
}
/*
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6fa91bfcdc0..2dfc8b0f85f 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -345,8 +345,7 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
-extern Size VarsupShmemSize(void);
-extern void VarsupShmemInit(void);
+extern void VarsupShmemRegister(void);
extern FullTransactionId GetNewTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
diff --git a/src/include/storage/dsm_registry.h b/src/include/storage/dsm_registry.h
index 506fae2c9ca..9a1b4d982af 100644
--- a/src/include/storage/dsm_registry.h
+++ b/src/include/storage/dsm_registry.h
@@ -22,7 +22,6 @@ extern dsa_area *GetNamedDSA(const char *name, bool *found);
extern dshash_table *GetNamedDSHash(const char *name,
const dshash_parameters *params,
bool *found);
-extern Size DSMRegistryShmemSize(void);
-extern void DSMRegistryShmemInit(void);
+extern void DSMRegistryShmemRegister(void);
#endif /* DSM_REGISTRY_H */
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index 206fb78f8a5..7cdc4852334 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -66,8 +66,7 @@ extern PGDLLIMPORT volatile PMSignalData *PMSignalState;
/*
* prototypes for functions in pmsignal.c
*/
-extern Size PMSignalShmemSize(void);
-extern void PMSignalShmemInit(void);
+extern void PMSignalShmemRegister(void);
extern void SendPostmasterSignal(PMSignalReason reason);
extern bool CheckPostmasterSignal(PMSignalReason reason);
extern void SetQuitSignalReason(QuitSignalReason reason);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 3f89450c216..1d1e0881af2 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -552,7 +552,7 @@ extern PGDLLIMPORT PGPROC *AuxiliaryProcs;
* Function Prototypes
*/
extern int ProcGlobalSemas(void);
-extern Size ProcGlobalShmemSize(void);
+extern void ProcGlobalShmemRegister(void);
extern void InitProcGlobal(void);
extern void InitProcess(void);
extern void InitProcessPhase2(void);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index c5ab1574fe3..572516c4e21 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -20,8 +20,7 @@
#include "utils/snapshot.h"
-extern Size ProcArrayShmemSize(void);
-extern void ProcArrayShmemInit(void);
+extern void ProcArrayShmemRegister(void);
extern void ProcArrayAdd(PGPROC *proc);
extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 348fba53a93..d2344b1cbb3 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -63,8 +63,7 @@ typedef enum
/*
* prototypes for functions in procsignal.c
*/
-extern Size ProcSignalShmemSize(void);
-extern void ProcSignalShmemInit(void);
+extern void ProcSignalShmemRegister(void);
extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
diff --git a/src/include/storage/sinvaladt.h b/src/include/storage/sinvaladt.h
index a1694500a85..4edba2936e6 100644
--- a/src/include/storage/sinvaladt.h
+++ b/src/include/storage/sinvaladt.h
@@ -28,8 +28,7 @@
/*
* prototypes for functions in sinvaladt.c
*/
-extern Size SharedInvalShmemSize(void);
-extern void SharedInvalShmemInit(void);
+extern void SharedInvalShmemRegister(void);
extern void SharedInvalBackendInit(bool sendOnly);
extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 34c27cc3dc3..86fc8637f5e 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -42,8 +42,7 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
extern uint32 WaitEventExtensionNew(const char *wait_event_name);
extern uint32 WaitEventInjectionPointNew(const char *wait_event_name);
-extern void WaitEventCustomShmemInit(void);
-extern Size WaitEventCustomShmemSize(void);
+extern void WaitEventCustomShmemRegister(void);
extern char **GetWaitEventCustomNames(uint32 classId, int *nwaitevents);
/* ----------
--
2.47.3
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-13 23:02 Zsolt Parragi <[email protected]>
parent: Heikki Linnakangas <[email protected]>
1 sibling, 0 replies; 75+ messages in thread
From: Zsolt Parragi @ 2026-03-13 23:02 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
Hello
dsm_shmem_init(void)
{
size_t size = dsm_estimate_size();
- bool found;
if (size == 0)
return;
Isn't there an assignment missing from this function now? Size is
calculated but never used. With the current code if
min_dynamic_shared_memory > 0 the server can crash.
+static ShmemHashDesc WaitEventCustomHashByNameDesc =
+{
+ .name = "WaitEventCustom hash by name",
+ .ptr = &WaitEventCustomHashByName,
+
+ .init_size = WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
+ .max_size = WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
+ /* key is a NULL-terminated string */
+ .hash_info.keysize = sizeof(char[NAMEDATALEN]),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByName),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+};
This was HASH_STRINGS originally, and it is used with plain const
char* parameters. Shouldn't it use HASH_STRINGS as before?
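To illustrate why the flag matters, here is a minimal standalone sketch. It does not use dynahash at all; KEY_LEN is a stand-in for NAMEDATALEN and the helper names are made up for this example. It assumes, as far as I know is the case in dynahash, that HASH_BLOBS keys are compared over all keysize bytes while HASH_STRINGS keys are compared only up to the terminating NUL. Two buffers holding the same name but different trailing bytes are equal as strings and different as blobs.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define KEY_LEN 64              /* stand-in for NAMEDATALEN */

/* Comparison in the style of a HASH_BLOBS table: all KEY_LEN bytes count. */
static bool
blob_keys_equal(const char *a, const char *b)
{
    return memcmp(a, b, KEY_LEN) == 0;
}

/* Comparison in the style of a HASH_STRINGS table: stops at the NUL. */
static bool
string_keys_equal(const char *a, const char *b)
{
    return strncmp(a, b, KEY_LEN) == 0;
}

/*
 * Fill a KEY_LEN buffer with 'fill' bytes, then copy in a NUL-terminated
 * name. The bytes after the terminator keep the fill value, which mimics
 * whatever happens to follow a short string in memory.
 */
static void
make_key(char *buf, const char *name, char fill)
{
    assert(strlen(name) < KEY_LEN);
    memset(buf, fill, KEY_LEN);
    strcpy(buf, name);
}
```

With a plain const char * key there is the additional problem that a blob-style lookup would read KEY_LEN bytes even when the string is shorter than that.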
size = add_size(size, ShmemRegisteredSize());
size = add_size(size, dsm_estimate_size());
- size = add_size(size, DSMRegistryShmemSize());
+
+ size = add_size(size, ShmemRegisteredSize());
ShmemRegisteredSize is now called twice.
+ /* Initialize the lock */
+ tranche_id = LWLockNewTrancheId("my tranche name");
+ LWLockInitialize(&MyShmem->lock);
Second parameter is missing for LWLockInitialize
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-16 09:37 Ashutosh Bapat <[email protected]>
parent: Heikki Linnakangas <[email protected]>
1 sibling, 0 replies; 75+ messages in thread
From: Ashutosh Bapat @ 2026-03-16 09:37 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Fri, Mar 13, 2026 at 5:11 PM Heikki Linnakangas <[email protected]> wrote:
>
> On 12/03/2026 22:05, Robert Haas wrote:
> > On Thu, Mar 12, 2026 at 3:21 PM Heikki Linnakangas <[email protected]> wrote:
> >>>> I'm currently leaning towards _PG_init(), except for allocations that
> >>>> depend on MaxBackends. For those, you can install a shmem_request_hook
> >>>> that sets the size in the descriptor. In other words, you can leave the
> >>>> 'size' as empty in _PG_init(), but set it later in the shmem_request_hook.
> >>>
> >>> Why can't you just do the whole thing later?
> >>
> >> shmem_request_hook won't work in EXEC_BACKEND mode, because in
> >> EXEC_BACKEND mode, ShmemRegisterStruct() also needs to be called at
> >> backend startup.
> >>
> >> One of my design goals is to avoid EXEC_BACKEND specific steps so that
> >> if you write your extension oblivious to EXEC_BACKEND mode, it will
> >> still usually work with EXEC_BACKEND. For example, if it was necessary
> >> to call a separate AttachShmem() function for every shmem struct in
> >> EXEC_BACKEND mode, but which was not needed on Unix, that would be bad.
> >
> > That's *definitely* a good goal. A less important but still valuable
> > goal is to maximize the notational simplicity of the mechanism. Your
> > callback idea is elegant in theory but in practice it seems like it
> > might make it harder for people to get started quickly on a new
> > module, and having to create the object in one place and then fill in
> > the size in another sort of has the same problem. I don't really know
> > what to do about that, but it's something to think about. The
> > complexity of getting the details right is annoyingly high in this
> > area.
>
> Yeah. IMHO the existing shmem_request/startup_hook mechanism is pretty
> awkward too, and in most cases, the new mechanism is more convenient. It
> might be slightly less convenient for some things, but mostly it's
> better. Would you agree with that, or do you actually like the old hooks
> and ShmemInitStruct() better?
FWIW, I like the new way.
If your goal is to get rid of ShmemInitStruct, ShmemInitHash,
shmem_request/startup_hook, we could replace the hooks by another hook
which gets called after MaxBackends is initialized but before
CalculateShmemSize() gets called. All ShmemRegisterStruct() calls can be
made in that hook. _PG_init() is used to register the hook as it's
done today. If we are going to deprecate the old hooks in a couple of
releases, we will need to maintain three hooks but given that we don't
have to wait for several releases for the extensions to adapt to the
new hooks, it should be only a temporary measure.
>
> One such wrinkle with ShmemRegisterStruct() in the patch now is that
> it's harder to do initialization that touches multiple structs or hash
> tables. Currently the callbacks are called in the same order that the
> structs are registered, so you can do all the initialization in the last
> struct's callback. The single pair of shmem_request/startup_hooks per
> module was more clear in that aspect. Fortunately, that kind of
> cross-struct dependencies are pretty rare. So I think it's fine. (The
> order that the callbacks are called needs be documented explicitly though).
>
> If we want to improve on that, one idea would be to introduce a
> ShmemRegisterCallbacks() function to register callbacks that are not
> tied to any particular struct and are called after all the per-struct
> callbacks.
>
I think as long as we stick to calling the init/attach functions in
the order of their registration (or any well defined suitable order),
the module can perform initialization/setup of the individual structure
to which the callback is attached and also modify/use previously
registered structures, or do the setup in the last registered
structure's init. I think this is more flexible than required; maybe
overwhelmingly flexible.
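A rough standalone sketch of the ordering guarantee discussed above. This is a mock, not the patch's code: DemoStructDesc, demo_register() and demo_init_registered() are hypothetical names that only loosely mirror ShmemStructDesc, ShmemRegisterStruct() and ShmemInitRegistered(). It shows callbacks running in registration order, a descriptor without a callback being skipped, and a later callback relying on state set up by an earlier one.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DESCS 16

typedef void (*demo_init_callback) (void *arg);

/* Hypothetical descriptor, loosely modeled on the patch's ShmemStructDesc. */
typedef struct DemoStructDesc
{
    const char *name;
    demo_init_callback init_fn; /* optional, may be NULL */
    void       *init_arg;
} DemoStructDesc;

static DemoStructDesc *demo_registry[MAX_DESCS];
static int  demo_nregistered = 0;

/* Remember the descriptor; order of calls defines callback order. */
static void
demo_register(DemoStructDesc *desc)
{
    assert(demo_nregistered < MAX_DESCS);
    demo_registry[demo_nregistered++] = desc;
}

/*
 * Call the init_fn, if any, of each descriptor in registration order.
 * Returns the number of callbacks that were invoked.
 */
static int
demo_init_registered(void)
{
    int         ncalled = 0;

    for (int i = 0; i < demo_nregistered; i++)
    {
        if (demo_registry[i]->init_fn != NULL)
        {
            demo_registry[i]->init_fn(demo_registry[i]->init_arg);
            ncalled++;
        }
    }
    return ncalled;
}

/* Example module state: the summary depends on the counter. */
static int  demo_counter;
static int  demo_summary;

static void
init_counter(void *arg)
{
    (void) arg;
    demo_counter = 41;
}

static void
init_summary(void *arg)
{
    (void) arg;
    /* Safe only because init_counter was registered earlier. */
    demo_summary = demo_counter + 1;
}
```

If the two registrations were swapped, init_summary would read an uninitialized counter, which is why the ordering needs to be a documented contract rather than an implementation detail.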
> >>> Yeah, I think RequestNamedLWLockTranche() might be fine if you just
> >>> need LWLocks, but if you need a bunch of resources, putting them all
> >>> into the same chunk of memory seems cleaner.
> >>
> >> Agreed. Then again, how often do you need just a LWLock (or multiple
> >> LWLocks)? Surely you have a struct you want to protect with the lock. I
> >> guess having shmem hash table but no struct would be pretty common, though.
> >
> > Yeah, we've developed an annoying number of different ways to do this
> > stuff. I don't entirely know how to fix that.
>
> Here's a new version that doubles down on the
> LWLockNewTrancheId+LWLockInitialize method, by changing the example in
> the docs, and contrib/pg_stat_statements, to use that method.
> RequestNamedLWLockTranche() still works, there are no changes to it,
> it's just not as convenient to use with ShmemRegisterStruct(). This has
> the advantage that we don't introduce yet another way of allocating LWLocks.
The new hook could be used to request LWLockTranche as well.
--
Best Wishes,
Ashutosh Bapat
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-16 10:28 Ashutosh Bapat <[email protected]>
parent: Heikki Linnakangas <[email protected]>
1 sibling, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-03-16 10:28 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sat, Mar 14, 2026 at 2:39 AM Heikki Linnakangas <[email protected]> wrote:
>
> On 06/03/2026 16:12, Heikki Linnakangas wrote:
> > Firstly, I'm not sure what to do with ShmemRegisterHash() and the
> > 'HASHCTL *infoP' argument to it. I feel it'd be nicer if the HASHCTL was
> > just part of the ShmemHashDesc struct, but I'm not sure if that fits all
> > the callers. I'll have to try that out I guess.
> I took a stab at that, and it turned out to be straightforward. I'm not
> sure why I hesitated on that earlier.
>
Yeah. I wondered about that too.
> Here's a new version with that change, and a ton of little comment
> cleanups and such.
Here are initial comments on these patches.
0001
@@ -3236,6 +3239,8 @@ PostmasterStateMachine(void)
LocalProcessControlFile(true);
/* re-create shared memory and semaphores */
+ ResetShmemAllocator();
This name is misleading. The function does not touch ShmemAllocator at
all. Instead it resets the ShmemStruct registry. I suggest
ResetShmemRegistry() instead.
+ *
+ * There are two kinds of shared memory data structures: fixed-size structures
+ * and hash tables.
In future we will have resizable "fixed" structures and we may also
have resizable hash tables i.e. hash tables whose directory would be
resizable. The latter would help support a resizable shared buffers
lookup table. It will be good to write the above sentence so that we
can just add more types of data structures without needing to rewrite
everything. If we could find a good term for "fixed-size structures"
which are really "structures that require contiguous memory", we will
be able to write the above sentence as "There are two kinds of shared
memory structures: contiguous structures and hash tables.". When we
add resizable structures, we can just add a sentence "A contiguous
structure may be fixed size or resizable". When we add resizable hash
tables, we can just replace that with "Both of these kinds can be
fixed-size or resizable". I am not sure whether "contiguous
structures" is a good term though (for one, the word contiguous can be
confused with continuous). Whatever term we use should be something
that we can carry further in the remaining paragraphs.
+ Fixed-size structures contain things like global
+ * variables for a module and should never be allocated after the shared
+ * memory initialization phase.
I think the existing comment is not accurate. The term "global
variables" in the sentence can be confused with process global
variables. We should be using the term "shared variables" or better
"shared state". If we adopt "contiguous structures" as the term for
the first kind of data structure, we can write "Contiguous structures
contain shared state, maintained in a contiguous chunk of memory, for
a module. It should never be allocated after the shared memory
initialization phase.".
+ * postmaster calls ShmemInitRegistered(), which calls the init_fn callbacks
+ * of each registered area, in the order that they were registered.
... calls the init_fn, if any, of each registered area ....
- infoP->dsize = infoP->max_dsize = hash_select_dirsize(max_size);
- infoP->alloc = ShmemAllocNoError;
- hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+ desc->hash_info.dsize = desc->hash_info.max_dsize =
hash_select_dirsize(desc->max_size);
+ desc->hash_info.alloc = ShmemAllocNoError;
+ desc->hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
/* look it up in the shmem index */
The next several lines of code look up shmem index. Should we remove
this comment or modify it to say "Register and initialize the hash
table"?
+HTAB *
+ShmemInitHash(const char *name, /* table string name for shmem index */
+ int64 init_size, /* initial table size */
+ int64 max_size, /* max size of the table */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ ShmemHashDesc *desc;
+
... snip ...
+
+ ShmemRegisterHash(desc);
+ return *desc->ptr;
+}
I like the way these functions are written using the new API. I think
we should keep these legacy interfaces at the end of the section of shmem
APIs, rather than keeping those at the end of the file where we have
monitoring and arithmetic functions. If you want to get rid of the
legacy APIs in this release itself, I think it's ok to keep them at
the end of the file.
ShmemInitStruct() now calls ShmemRegisterStruct(). Earlier it could be
called from any backend, in any state to fetch a pointer to a shared
memory structure. It didn't add a new structure. Now it can add a new
structure. I am wondering whether that can cause registry in different
backends to get out of sync. Should we limit the window when it can be
called, just like how the shmem_request_hook call is limited? In that sense
ShmemRegisterStruct() looks like something to be called from a
shmem_register_hook which is also called from EXEC_BACKEND. Sorry to
expand on it here rather than in my previous reply. In case we replace all the
current calls to ShmemInitStruct() with ShmemRegisterStruct(), we may
be able to get rid of the Shmem Index altogether; after all it's used
only for fetching the pointers to the shared memory areas in
EXEC_BACKEND mode. I thought that we could save the registry in the
shared memory. In EXEC_BACKEND mode, we go over the registry calling
attach_fn for each entry. But since the binary is overwritten in
EXEC_BACKEND case, attach/init fns are not guaranteed to have the same
address in all the backends. Maybe we have to resort to
launch_backend() to transfer the registry to the backend through the
file (to keep it in sync in all the backends): a solution you may not
like.
+ void **ptr;
+} ShmemStructDesc;
I think the comments for each member should highlight which of these
fields are required (non-zero) and which can be optional (zero'ed
out).
+ */
+ ShmemStructDesc base_desc;
Once we have calculated the base_desc in ShmemRegisterHash() and
called ShmemRegisterStruct(), we don't need base_desc anymore. Even
the pointer to the allocated hash table memory is available through
*ptr. Probably we could just remove this member from here.
ShmemRegisterHash() can declare a variable of type ShmemStructDesc,
populate it based on the members in this structure and pass it to
ShmemRegisterStruct(). I am not comfortable with the specification
structure being modified by the registration function.
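Roughly what I have in mind, as a standalone mock rather than the actual shmem.c code. All names here (DemoHashDesc, DemoStructDesc, demo_register_hash, demo_register_struct) are hypothetical, and the size computation is a deliberately crude placeholder for what hash_estimate_size() would do. The point is only that the hash descriptor is taken as const and the derived struct descriptor lives in a local variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the patch's ShmemStructDesc. */
typedef struct DemoStructDesc
{
    const char *name;
    size_t      size;
} DemoStructDesc;

/* Hypothetical stand-in for ShmemHashDesc, without an embedded base_desc. */
typedef struct DemoHashDesc
{
    const char *name;
    size_t      max_entries;
    size_t      entry_size;
} DemoHashDesc;

/* What the mock registry remembers about the last registration. */
static const char *demo_last_name;
static size_t demo_last_size;

/*
 * Mock registration: copies the fields it needs, so the caller's
 * descriptor does not have to outlive the call.
 */
static void
demo_register_struct(const DemoStructDesc *desc)
{
    demo_last_name = desc->name;
    demo_last_size = desc->size;
}

/*
 * Derive a struct descriptor from the hash specification in a local
 * variable; the specification itself is never written to.
 */
static void
demo_register_hash(const DemoHashDesc *desc)
{
    DemoStructDesc base;

    base.name = desc->name;
    /* placeholder estimate, not the real dynahash sizing */
    base.size = desc->max_entries * desc->entry_size;

    demo_register_struct(&base);
}
```

One caveat: this only works if the registry copies what it needs, as the mock does. If the real registry keeps a pointer to the descriptor, a stack-local base descriptor would dangle, and the base descriptor would have to be allocated somewhere longer-lived instead.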
0003
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ ShmemRegisterStruct(&dsm_main_space_shmem_desc);
Shouldn't we be setting dsm_main_space_shmem_desc.size here to size
before calling ShmemRegisterStruct()?
@@ -102,15 +102,14 @@ CalculateShmemSize(void)
size = add_size(size, ShmemRegisteredSize());
size = add_size(size, dsm_estimate_size());
We have defined dsm_main_space_shmem_desc, but we still use
dsm_estimate_size() here and initialize the memory in
dsm_shmem_init(), which is explicitly called from
CreateOrAttachShmemStructs(). Why is that? Shouldn't we be registering
the structure in RegisterShmemStructs(), and let ShmemInitRegistered()
initialize it? Am I missing something here?
I will continue to review the patches further.
--
Best Wishes,
Ashutosh Bapat
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-17 11:58 Ashutosh Bapat <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 0 replies; 75+ messages in thread
From: Ashutosh Bapat @ 2026-03-17 11:58 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Zsolt Parragi <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Tue, Mar 17, 2026 at 3:26 AM Heikki Linnakangas <[email protected]> wrote:
>
>
> I don't plan to get rid of the legacy API any time soon, I expect
> existing extensions to continue using it for years to come. So I moved
> them per your suggestion.
Do you plan to get rid of the shmem_request_hook and
shmem_startup_hook? What's the plan there? Wouldn't the old APIs and
new APIs overwhelm extension writers, with two different ways and
three different hooks to allocate a structure in shared memory?
>
> > ShmemInitStruct() now calls ShmemRegisterStruct(). Earlier it could be
> > called from any backend, in any state to fetch a pointer to a shared
> > memory structure. It didn't add a new structure. Now it can add a new
> > structure. I am wondering whether that can cause registry in different
> > backends to get out of sync. Should we limit the window when it can be
> > called, just like how the shmem_request_hook call is limited? In that sense
> > ShmemRegisterStruct() looks like something to be called from a
> > shmem_register_hook which is also called from EXEC_BACKEND. Sorry to
> > expand on it here rather than in my previous reply. In case we replace all the
> > current calls to ShmemInitStruct() with ShmemRegisterStruct(), we may
> > be able to get rid of the Shmem Index altogether; after all it's used
> > only for fetching the pointers to the shared memory areas in
> > EXEC_BACKEND mode.
>
> I think it's still useful to allow ShmemRegisterStruct() after
> postmaster startup, so that you can use it with extensions that are not
> listed in shared_preload_libraries, but need a little bit of shared
> memory. Hmm, perhaps we should add an explicit flag for that case,
> though. So that by default ShmemRegisterStruct() fails if you call it
> after postmaster startup, but you could allow it by setting a flag in
> the descriptor.
>
The hash tables do not allocate all their entries upfront, but they
request shared memory for their maximum size. If, before they grow
to their maximum size, somebody calls ShmemInitStruct() with a
huge memory size request that takes away all the reserved address
space/memory, the hash table won't get its fair share of shared
memory. I agree that it could still happen if a hash table grows
beyond the contracted request ... but it's at least registered. I think
we should prevent such cases. We could keep track of the extra shared
memory that we have allocated and refuse an unregistered request if
that memory is exhausted. But I think we already have some unaccounted-for
structures, e.g. ShmemAllocator and ShmemHeader. So it
might turn out to be more complex than required. I have a feeling that
we are providing flexibility that the current infrastructure cannot
support. I am not opposed to supporting ShmemRegisterStruct; I like
the idea, but it seems premature.
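To make the accounting idea concrete, here is a small standalone sketch. It is not based on any existing shmem.c code; DemoShmemBudget and its functions are hypothetical, and it ignores alignment and the already unaccounted-for structures mentioned above. Registered structures have their space set aside up front, and late unregistered requests may only draw from the slop beyond that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one shared memory segment. */
typedef struct DemoShmemBudget
{
    size_t      total;              /* size of the whole segment */
    size_t      registered;         /* promised to registered structures */
    size_t      unregistered_used;  /* consumed by late, unregistered requests */
} DemoShmemBudget;

/* Segment is sized as the registered total plus some extra slop. */
static void
demo_budget_init(DemoShmemBudget *b, size_t registered, size_t slop)
{
    b->total = registered + slop;
    b->registered = registered;
    b->unregistered_used = 0;
}

/* Space that late requests may still take without eating registered space. */
static size_t
demo_budget_slop_left(const DemoShmemBudget *b)
{
    return b->total - b->registered - b->unregistered_used;
}

/*
 * Account for an unregistered request. Refuse it if it would cut into
 * space that registered structures (e.g. a hash table that has not yet
 * grown to its maximum size) are still entitled to.
 */
static bool
demo_budget_alloc_unregistered(DemoShmemBudget *b, size_t size)
{
    if (size > demo_budget_slop_left(b))
        return false;
    b->unregistered_used += size;
    return true;
}
```

For example, with 1000 bytes registered and 100 bytes of slop, a 60 byte late request succeeds, a second 60 byte request is refused, and a 40 byte request then uses up the remainder.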
--
Best Wishes,
Ashutosh Bapat
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-18 19:30 Robert Haas <[email protected]>
parent: Heikki Linnakangas <[email protected]>
1 sibling, 1 reply; 75+ messages in thread
From: Robert Haas @ 2026-03-18 19:30 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Fri, Mar 13, 2026 at 7:41 AM Heikki Linnakangas <[email protected]> wrote:
> Yeah. IMHO the existing shmem_request/startup_hook mechanism is pretty
> awkward too, and in most cases, the new mechanism is more convenient. It
> might be slightly less convenient for some things, but mostly it's
> better. Would you agree with that, or do you actually like the old hooks
> and ShmemInitStruct() better?
I'd say it's not massively different one way or the other. Looking at
the pg_stat_statements changes in particular, I feel like in v2, it
was actually worse, because you didn't manage to get rid of
pgss_shmem_request(), but it was no longer the thing requesting shmem.
v3 is better in that regard: you get rid of some complexity in
exchange for what you add. It's not amazing, though:
pgss_shmem_startup() now has nothing to do with the name of the hook,
which is not this patch's fault but also isn't solved by it. I wonder
why in the world somebody decided to jam all of this non-shmem-related
logic into this function, and why they didn't add a comment explaining
why it was here and not someplace else. One of the worst things about
this area is that I often end up having to trace through a bunch of
postmaster startup logic to figure out whether any given bit of code
is actually in the right place, and that makes me feel like the hooks
are badly designed. pgss_shmem_startup() is a good example of that,
and the fact that it needs an IsUnderPostmaster check is another.
We should somehow try to make it clear what happens with this new
mechanism if a module is loaded via shared_preload_libraries vs. if
it's loaded via LOAD or session_preload_libraries or whatever. Writing
modules that don't require shared_preload_libraries is a boon to
users, because they can be added to a production system without a
server restart. I wonder whether this new facility handles that case
better, worse, or the same as existing facilities.
> One such wrinkle with ShmemRegisterStruct() in the patch now is that
> it's harder to do initialization that touches multiple structs or hash
> tables. Currently the callbacks are called in the same order that the
> structs are registered, so you can do all the initialization in the last
> struct's callback. The single pair of shmem_request/startup_hooks per
> module was more clear in that aspect. Fortunately, that kind of
> cross-struct dependencies are pretty rare. So I think it's fine. (The
> order that the callbacks are called needs be documented explicitly though).
>
> If we want to improve on that, one idea would be to introduce a
> ShmemRegisterCallbacks() function to register callbacks that are not
> tied to any particular struct and are called after all the per-struct
> callbacks.
I think the whole idea of ShmemInitHash() and ShmemInitStruct() is
relatively poorly designed in this regard. Ideally, I want to
initialize all the shared memory that belongs to me, and that might
include arbitrary data structures, but the current design kind of
thinks that I want one struct or one hash table and nothing else. If
we're redesigning this mechanism, it would sure be nice to improve on
that.
Looking just at the pg_stat_statements changes, I think my overall
view on this right now is that it's not terrible, but I'm also not
that happy about introducing yet another way to do it for this amount
of gain. To me, it doesn't yet rise to the level of a clear win.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-19 10:31 Heikki Linnakangas <[email protected]>
parent: Robert Haas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Heikki Linnakangas @ 2026-03-19 10:31 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 18/03/2026 21:30, Robert Haas wrote:
> On Fri, Mar 13, 2026 at 7:41 AM Heikki Linnakangas <[email protected]> wrote:
>> Yeah. IMHO the existing shmem_request/startup_hook mechanism is pretty
>> awkward too, and in most cases, the new mechanism is more convenient. It
>> might be slightly less convenient for some things, but mostly it's
>> better. Would you agree with that, or do you actually like the old hooks
>> and ShmemInitStruct() better?
>
> I'd say it's not massively different one way or the other. Looking at
> the pg_stat_statements changes in particular, I feel like in v2, it
> was actually worse, because you didn't manage to get rid of
> pgss_shmem_request(), but it was no longer the thing requesting shmem.
> v3 is better in that regard: you get rid of some complexity in
> exchange for what you add. It's not amazing, though:
> pgss_shmem_startup() now has nothing to do with the name of the hook,
> which is not this patch's fault but also isn't solved by it. I wonder
> why in the world somebody decided to jam all of this non-shmem-related
> logic into this function, and why they didn't add a comment explaining
> why it was here and not someplace else. One of the worst things about
> this area is that I often end up having to trace through a bunch of
> postmaster startup logic to figure out whether any given bit of code
> is actually in the right place, and that makes me feel like the hooks
> are badly designed. pgss_shmem_startup() is a good example of that,
> and the fact that it needs an IsUnderPostmaster check is another.
Hmm, I assumed it was important that the pg_stat_statements file is
loaded later in the startup sequence, and that's why it was in
pgss_shmem_startup(). But now that I look at it, I don't think there was
any grand plan; shmem_startup_hook was just the only easy way to get
control during postmaster startup, after shmem initialization.
So I think we can move that code to the init_fn callback with the new
API, and that gets rid of shmem_startup_hook in pg_stat_statements. See
attached (v6-0005-move-pgss-shmem_startup-hook-code-into-the-new-in.patch).
> We should somehow try to make it clear what happens with this new
> mechanism if a module is loaded via shared_preload_libraries vs. if
> it's loaded via LOAD or session_preload_libraries or whatever. Writing
> modules that don't require shared_preload_libraries is a boon to
> users, because they can be added to a production system without a
> server restart. I wonder whether this new facility handles that case
> better, worse, or the same as existing facilities.
Pretty much the same I think. Point taken that it could be documented
better.
The old documentation for ShmemInitStruct() assumed that it would be
used from shared_preload_libraries; it didn't cover other usage. With the
new API, I tried to document how it behaves when used outside
shared_preload_libraries, but still focused on using it from
shared_preload_libraries. I'm not
sure it helped. Perhaps the after-startup behavior should be put in a
separate section.
>> One such wrinkle with ShmemRegisterStruct() in the patch now is that
>> it's harder to do initialization that touches multiple structs or hash
>> tables. Currently the callbacks are called in the same order that the
>> structs are registered, so you can do all the initialization in the last
>> struct's callback. The single pair of shmem_request/startup_hooks per
>> module was more clear in that aspect. Fortunately, that kind of
>> cross-struct dependency is pretty rare. So I think it's fine. (The
>> order in which the callbacks are called needs to be documented explicitly, though).
>>
>> If we want to improve on that, one idea would be to introduce a
>> ShmemRegisterCallbacks() function to register callbacks that are not
>> tied to any particular struct and are called after all the per-struct
>> callbacks.
>
> I think the whole idea of ShmemInitHash() and ShmemInitStruct() is
> relatively poorly designed in this regard. Ideally, I want to
> initialize all the shared memory that belongs to me, and that might
> include arbitrary data structures, but the current design kind of
> thinks that I want one struct or one hash table and nothing else. If
> we're redesigning this mechanism, it would sure be nice to improve on
> that.
>
> Looking just at the pg_stat_statements changes, I think my overall
> view on this right now is that it's not terrible, but I'm also not
> that happy about introducing yet another way to do it for this amount
> of gain. To me, it doesn't yet rise to the level of a clear win.
So here's another idea (not yet implemented in the attached patch):
instead of thinking in terms of individual shmem structs and hashes,
let's introduce a concept of a "subsystem" that ties them together:
static pgss_subsystem_desc = {
    .name = "pg_stat_statements",

    .shmem_structs = {
        {
            .ptr = (void **) &pgss,
            .size = sizeof(pgssSharedState),
        },
    },
    .shmem_hashes = {
        {
            .ptr = &pgss_hash,
            .init_size = 0,    /* set from 'pgss_max' */
            .max_size = 0,     /* set from 'pgss_max' */
            .hash_info.keysize = sizeof(pgssHashKey),
            .hash_info.entrysize = sizeof(pgssEntry),
            .hash_flags = HASH_ELEM | HASH_BLOBS,
        },
    },

    /* called after the shmem structs and hashes have been allocated */
    .init_fn = pgss_shmem_init,
};

void
_PG_init(void)
{
    ...
    RegisterSubsystem(&pgss_subsystem_desc);
}
We could add more callbacks that get called at different times. For
example, a callback that gets called before shared memory is allocated
could adjust the size according to MaxBackends. That would fully
replace shmem_request_hook. Or we could add a callback that gets called
later in the startup sequence, if we wanted to e.g. load the
pg_stat_statements file later in startup.
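To make the phased-callback idea concrete, here is a minimal standalone sketch in plain C. It is not PostgreSQL code: every name in it (SubsystemDesc, StructDesc, register_subsystem, request_all, init_all, startup, request_fn, max_backends) is hypothetical and only mimics the shape of the proposal. The assumption is that request_fn runs before allocation and may adjust sizes, and that init_fn runs after all areas have been carved out, in registration order. A malloc'd buffer stands in for the shared memory segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Round sizes up to 16 bytes so every area starts suitably aligned. */
#define ALIGN_UP(n) (((n) + (size_t) 15) & ~(size_t) 15)

/* Hypothetical stand-in for the server's MaxBackends setting. */
static int max_backends = 0;

/* Hypothetical descriptor for one fixed-size shared area. */
typedef struct StructDesc
{
    void  **ptr;   /* where to store the address of the area */
    size_t  size;  /* bytes to reserve; request_fn may set this */
} StructDesc;

#define MAX_STRUCTS 4

/* Hypothetical subsystem descriptor tying areas and callbacks together. */
typedef struct SubsystemDesc
{
    const char *name;
    StructDesc  structs[MAX_STRUCTS];
    int         nstructs;
    void      (*request_fn)(struct SubsystemDesc *desc); /* before allocation */
    void      (*init_fn)(void);                          /* after allocation */
} SubsystemDesc;

#define MAX_SUBSYSTEMS 8
static SubsystemDesc *registry[MAX_SUBSYSTEMS];
static int nregistered = 0;

static void
register_subsystem(SubsystemDesc *desc)
{
    assert(nregistered < MAX_SUBSYSTEMS);
    registry[nregistered++] = desc;
}

/* Phase 1: let each subsystem finalize its sizes, then add them up. */
static size_t
request_all(void)
{
    size_t total = 0;

    for (int i = 0; i < nregistered; i++)
    {
        if (registry[i]->request_fn)
            registry[i]->request_fn(registry[i]);
        for (int j = 0; j < registry[i]->nstructs; j++)
            total += ALIGN_UP(registry[i]->structs[j].size);
    }
    return total;
}

/* Phase 2: hand out all areas, then run init_fn in registration order. */
static void
init_all(char *segment)
{
    size_t offset = 0;

    for (int i = 0; i < nregistered; i++)
        for (int j = 0; j < registry[i]->nstructs; j++)
        {
            *registry[i]->structs[j].ptr = segment + offset;
            offset += ALIGN_UP(registry[i]->structs[j].size);
        }
    for (int i = 0; i < nregistered; i++)
        if (registry[i]->init_fn)
            registry[i]->init_fn();
}

/* Run both phases; the caller owns the returned buffer. */
static char *
startup(void)
{
    size_t total = request_all();
    char  *segment = malloc(total > 0 ? total : 1);

    if (segment != NULL)
        init_all(segment);
    return segment;
}

/* A demo subsystem with one fixed struct and one MaxBackends-sized array. */
typedef struct DemoState { int nslots; } DemoState;
static DemoState *demo_state;
static int       *demo_slots;

static void
demo_request(SubsystemDesc *desc)
{
    desc->structs[1].size = sizeof(int) * (size_t) max_backends;
}

static void
demo_init(void)
{
    demo_state->nslots = max_backends;
    memset(demo_slots, 0, sizeof(int) * (size_t) max_backends);
}

static SubsystemDesc demo_desc = {
    .name = "demo",
    .structs = {
        { .ptr = (void **) &demo_state, .size = sizeof(DemoState) },
        { .ptr = (void **) &demo_slots, .size = 0 },  /* set in demo_request */
    },
    .nstructs = 2,
    .request_fn = demo_request,
    .init_fn = demo_init,
};
```

A driver would set max_backends, call register_subsystem(&demo_desc) and then startup(); afterwards demo_state and demo_slots point into the returned buffer. Because init_fn only runs once every area has an address, a single callback can safely touch several areas of the same subsystem.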
This would be a natural place for other resources in future too. We
could support declaring "named lwlock tranches" here to replace
RequestNamedLWLockTranche() for example, although I think it's still
better to encourage embedding the LWLock in the struct instead.
_PG_init in pg_stat_statements still does a lot more than register that
struct. It declares the GUCs and installs other hooks for example. We
could perhaps move those to the subsystem descriptor too, although I'm
not sure if that's worth the code churn.
- Heikki
Attachments:
[text/x-patch] v6-0001-Test-pg_stat_statements-across-crash-restart.patch (1.4K, 2-v6-0001-Test-pg_stat_statements-across-crash-restart.patch)
From 54e16964f453b9e572e1ed6496ab05f2fceb75a3 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Mon, 16 Mar 2026 16:41:17 +0200
Subject: [PATCH v6 1/6] Test pg_stat_statements across crash restart
Add 'pg_stat_statements' to the crash restart test, to test that
shared memory and LWLock initialization works across crash restart in
a library listed in shared_preload_libraries.
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/test/recovery/t/013_crash_restart.pl | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/test/recovery/t/013_crash_restart.pl b/src/test/recovery/t/013_crash_restart.pl
index 20d648ad6af..960e5462b49 100644
--- a/src/test/recovery/t/013_crash_restart.pl
+++ b/src/test/recovery/t/013_crash_restart.pl
@@ -21,6 +21,15 @@ my $psql_timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
my $node = PostgreSQL::Test::Cluster->new('primary');
$node->init(allows_streaming => 1);
+
+# Enable pg_stat_statements to test restart of shared_preload_libraries.
+$node->append_conf(
+ 'postgresql.conf',
+ qq{shared_preload_libraries = 'pg_stat_statements'
+pg_stat_statements.max = 50000
+compute_query_id = 'regress'
+});
+
$node->start();
# by default PostgreSQL::Test::Cluster doesn't restart after a crash
--
2.47.3
[text/x-patch] v6-0002-Refactor-ShmemIndex-initialization.patch (7.0K, 3-v6-0002-Refactor-ShmemIndex-initialization.patch)
From a5dd2a2c09b28c25c3679b27cc0ba5af31f00c45 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Mon, 16 Mar 2026 20:35:36 +0200
Subject: [PATCH v6 2/6] Refactor ShmemIndex initialization
Initialize the ShmemIndex hash table in InitShmemAllocator() already,
removing the need for the separate InitShmemIndex() step.
---
src/backend/storage/ipc/ipci.c | 8 +-
src/backend/storage/ipc/shmem.c | 131 ++++++++++++--------------------
src/include/storage/shmem.h | 1 -
3 files changed, 49 insertions(+), 91 deletions(-)
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index a4785daf1e5..3d3f153809b 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -250,16 +250,10 @@ static void
CreateOrAttachShmemStructs(void)
{
/*
- * Now initialize LWLocks, which do shared memory allocation and are
- * needed for InitShmemIndex.
+ * Now initialize LWLocks, which do shared memory allocation.
*/
CreateLWLocks();
- /*
- * Set up shmem.c index hashtable
- */
- InitShmemIndex();
-
dsm_shmem_init();
DSMRegistryShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 0424c445723..dce355e6683 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -124,6 +124,12 @@ Datum pg_numa_available(PG_FUNCTION_ARGS);
void
InitShmemAllocator(PGShmemHeader *seghdr)
{
+ Size offset;
+ int64 hash_size;
+ HASHCTL info;
+ int hash_flags;
+ size_t size;
+
Assert(seghdr != NULL);
/*
@@ -137,41 +143,54 @@ InitShmemAllocator(PGShmemHeader *seghdr)
ShmemBase = seghdr;
ShmemEnd = (char *) ShmemBase + seghdr->totalsize;
+ /*
+ * Allocations after this point should go through ShmemAlloc, which
+ * expects to allocate everything on cache line boundaries. Make sure the
+ * first allocation begins on a cache line boundary.
+ */
+ offset = CACHELINEALIGN(seghdr->content_offset + sizeof(ShmemAllocatorData));
+ if (offset > seghdr->totalsize)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of shared memory (%zu bytes requested)",
+ offset)));
+
+ ShmemAllocator = (ShmemAllocatorData *) ((char *) seghdr + seghdr->content_offset);
+ ShmemLock = &ShmemAllocator->shmem_lock;
+
#ifndef EXEC_BACKEND
Assert(!IsUnderPostmaster);
#endif
- if (IsUnderPostmaster)
+ if (!IsUnderPostmaster)
{
- PGShmemHeader *shmhdr = ShmemSegHdr;
-
- ShmemAllocator = (ShmemAllocatorData *) ((char *) shmhdr + shmhdr->content_offset);
- ShmemLock = &ShmemAllocator->shmem_lock;
+ SpinLockInit(&ShmemAllocator->shmem_lock);
+ ShmemAllocator->free_offset = offset;
}
- else
- {
- Size offset;
- /*
- * Allocations after this point should go through ShmemAlloc, which
- * expects to allocate everything on cache line boundaries. Make sure
- * the first allocation begins on a cache line boundary.
- */
- offset = CACHELINEALIGN(seghdr->content_offset + sizeof(ShmemAllocatorData));
- if (offset > seghdr->totalsize)
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("out of shared memory (%zu bytes requested)",
- offset)));
-
- ShmemAllocator = (ShmemAllocatorData *) ((char *) seghdr + seghdr->content_offset);
+ /*
+ * Create (or attach to) the shared memory index of shmem areas.
+ *
+ * This is the same initialization as ShmemInitHash() does, but we cannot
+ * use ShmemInitHash() here because it relies on ShmemIndex being already
+ * initialized.
+ */
+ hash_size = SHMEM_INDEX_SIZE;
- SpinLockInit(&ShmemAllocator->shmem_lock);
- ShmemLock = &ShmemAllocator->shmem_lock;
- ShmemAllocator->free_offset = offset;
- /* ShmemIndex can't be set up yet (need LWLocks first) */
- ShmemAllocator->index = NULL;
- ShmemIndex = (HTAB *) NULL;
+ info.keysize = SHMEM_INDEX_KEYSIZE;
+ info.entrysize = sizeof(ShmemIndexEnt);
+ info.dsize = info.max_dsize = hash_select_dirsize(hash_size);
+ info.alloc = ShmemAllocNoError;
+ hash_flags = HASH_ELEM | HASH_STRINGS | HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+ if (!IsUnderPostmaster)
+ {
+ size = hash_get_shared_size(&info, hash_flags);
+ ShmemAllocator->index = (HASHHDR *) ShmemAlloc(size);
}
+ else
+ hash_flags |= HASH_ATTACH;
+ info.hctl = ShmemAllocator->index;
+ ShmemIndex = hash_create("ShmemIndex", hash_size, &info, hash_flags);
+ Assert(ShmemIndex != NULL);
}
/*
@@ -270,31 +289,6 @@ ShmemAddrIsValid(const void *addr)
return (addr >= ShmemBase) && (addr < ShmemEnd);
}
-/*
- * InitShmemIndex() --- set up or attach to shmem index table.
- */
-void
-InitShmemIndex(void)
-{
- HASHCTL info;
-
- /*
- * Create the shared memory shmem index.
- *
- * Since ShmemInitHash calls ShmemInitStruct, which expects the ShmemIndex
- * hashtable to exist already, we have a bit of a circularity problem in
- * initializing the ShmemIndex itself. The special "ShmemIndex" hash
- * table name will tell ShmemInitStruct to fake it.
- */
- info.keysize = SHMEM_INDEX_KEYSIZE;
- info.entrysize = sizeof(ShmemIndexEnt);
-
- ShmemIndex = ShmemInitHash("ShmemIndex",
- SHMEM_INDEX_SIZE, SHMEM_INDEX_SIZE,
- &info,
- HASH_ELEM | HASH_STRINGS);
-}
-
/*
* ShmemInitHash -- Create and initialize, or attach to, a
* shared memory hash table.
@@ -383,38 +377,9 @@ ShmemInitStruct(const char *name, Size size, bool *foundPtr)
ShmemIndexEnt *result;
void *structPtr;
- LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ Assert(ShmemIndex != NULL);
- if (!ShmemIndex)
- {
- /* Must be trying to create/attach to ShmemIndex itself */
- Assert(strcmp(name, "ShmemIndex") == 0);
-
- if (IsUnderPostmaster)
- {
- /* Must be initializing a (non-standalone) backend */
- Assert(ShmemAllocator->index != NULL);
- structPtr = ShmemAllocator->index;
- *foundPtr = true;
- }
- else
- {
- /*
- * If the shmem index doesn't exist, we are bootstrapping: we must
- * be trying to init the shmem index itself.
- *
- * Notice that the ShmemIndexLock is released before the shmem
- * index has been initialized. This should be OK because no other
- * process can be accessing shared memory yet.
- */
- Assert(ShmemAllocator->index == NULL);
- structPtr = ShmemAlloc(size);
- ShmemAllocator->index = structPtr;
- *foundPtr = false;
- }
- LWLockRelease(ShmemIndexLock);
- return structPtr;
- }
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
/* look it up in the shmem index */
result = (ShmemIndexEnt *)
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 89d45287c17..0de8a36429b 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -33,7 +33,6 @@ extern void InitShmemAllocator(PGShmemHeader *seghdr);
extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
extern bool ShmemAddrIsValid(const void *addr);
-extern void InitShmemIndex(void);
extern HTAB *ShmemInitHash(const char *name, int64 init_size, int64 max_size,
HASHCTL *infoP, int hash_flags);
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
--
2.47.3
[text/x-patch] v6-0003-Introduce-a-new-mechanism-for-registering-shared-.patch (52.1K, 4-v6-0003-Introduce-a-new-mechanism-for-registering-shared-.patch)
From 0238a32ff259b9e76235995e27ce6c2d08f8feee Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Mon, 16 Mar 2026 20:08:25 +0200
Subject: [PATCH v6 3/6] Introduce a new mechanism for registering shared
memory areas
Each shared memory area is registered with a "descriptor struct" that
contains parameters like name and size of the area. The descriptor
struct makes it easier to add optional fields in the future; the
additional fields can just be left as zeros.
This merges the separate [Subsystem]ShmemSize() and
[Subsystem]ShmemInit() phases at postmaster startup. Each subsystem is
now called into just once, before the shared memory segment has been
allocated, to register the subsystem's shared memory needs. The
registration includes the size, which replaces the
[Subsystem]ShmemSize() calls, and a pointer to an initialization
callback function, which replaces the [Subsystem]ShmemInit()
calls. This is more ergonomic, as you only need to calculate the size
once, when you register the struct.
This replaces ShmemInitStruct() and ShmemInitHash(), which become just
backwards-compatibility wrappers around the new functions. In future
commits, I plan to replace all ShmemInitStruct() and ShmemInitHash()
calls with the new functions, although we'll still need to keep them
around for extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 4 +-
doc/src/sgml/xfunc.sgml | 126 ++--
src/backend/bootstrap/bootstrap.c | 2 +
src/backend/postmaster/launch_backend.c | 4 +
src/backend/postmaster/postmaster.c | 11 +-
src/backend/storage/ipc/ipci.c | 50 +-
src/backend/storage/ipc/shmem.c | 735 +++++++++++++++++++-----
src/backend/tcop/postgres.c | 3 +
src/include/storage/ipc.h | 1 +
src/include/storage/shmem.h | 153 ++++-
src/tools/pgindent/typedefs.list | 4 +-
11 files changed, 863 insertions(+), 230 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9ee1a2bfc6a..ecdd5fa544a 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
Anonymous allocations are allocations that have been made
with <literal>ShmemAlloc()</literal> directly, rather than via
- <literal>ShmemInitStruct()</literal> or
- <literal>ShmemInitHash()</literal>.
+ <literal>ShmemRegisterStruct()</literal> or
+ <literal>ShmemRegisterHash()</literal>.
</para>
<para>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 70e815b8a2c..05d9ec7e33a 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3628,59 +3628,87 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
Add-ins can reserve shared memory on server startup. To do so, the
add-in's shared library must be preloaded by specifying it in
<xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
- The shared library should also register a
- <literal>shmem_request_hook</literal> in its
- <function>_PG_init</function> function. This
- <literal>shmem_request_hook</literal> can reserve shared memory by
- calling:
-<programlisting>
-void RequestAddinShmemSpace(Size size)
-</programlisting>
- Each backend should obtain a pointer to the reserved shared memory by
- calling:
+ The shared library should register the shared memory allocation in
+ its <function>_PG_init</function> function. Here is an example:
<programlisting>
-void *ShmemInitStruct(const char *name, Size size, bool *foundPtr)
-</programlisting>
- If this function sets <literal>foundPtr</literal> to
- <literal>false</literal>, the caller should proceed to initialize the
- contents of the reserved shared memory. If <literal>foundPtr</literal>
- is set to <literal>true</literal>, the shared memory was already
- initialized by another backend, and the caller need not initialize
- further.
- </para>
+typedef struct MyShmemData {
+ LWLock lock; /* protects the fields below */
- <para>
- To avoid race conditions, each backend should use the LWLock
- <function>AddinShmemInitLock</function> when initializing its allocation
- of shared memory, as shown here:
-<programlisting>
-static mystruct *ptr = NULL;
-bool found;
+ ... shared memory contents ...
+} MyShmemData;
+
+static MyShmemData *MyShmem; /* pointer to the struct in shared memory */
+
+static void my_shmem_init(void *arg);
+
+static ShmemStructDesc MyShmemDesc = {
+ .name = "My shmem area",
+ .size = sizeof(MyShmemData),
+ .init_fn = my_shmem_init,
+ .ptr = (void **) &MyShmem,
+};
-LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-ptr = ShmemInitStruct("my struct name", size, &found);
-if (!found)
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
{
- ... initialize contents of shared memory ...
- ptr->locks = GetNamedLWLockTranche("my tranche name");
+ /*
+ * In order to create our shared memory area, we have to be loaded via
+ * shared_preload_libraries.
+ */
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ /* Register our shared memory needs */
+ ShmemRegisterStruct(&MyShmemDesc);
}
-LWLockRelease(AddinShmemInitLock);
+
+/* callback to initialize the contents of the MyShmem area at startup */
+static void
+my_shmem_init(void *arg)
+{
+ int tranche_id;
+
+ /* Initialize the lock */
+ tranche_id = LWLockNewTrancheId("my tranche name");
+ LWLockInitialize(&MyShmem->lock, tranche_id);
+
+ ... initialize the rest of MyShmem fields ...
+}
+
</programlisting>
- <literal>shmem_startup_hook</literal> provides a convenient place for the
- initialization code, but it is not strictly required that all such code
- be placed in this hook. On Windows (and anywhere else where
- <literal>EXEC_BACKEND</literal> is defined), each backend executes the
- registered <literal>shmem_startup_hook</literal> shortly after it
- attaches to shared memory, so add-ins should still acquire
- <function>AddinShmemInitLock</function> within this hook, as shown in the
- example above. On other platforms, only the postmaster process executes
- the <literal>shmem_startup_hook</literal>, and each backend automatically
- inherits the pointers to shared memory.
+ The <function>ShmemRegisterStruct()</function> call doesn't immediately
+ allocate or initialize the memory; it merely registers the space to be
+ allocated later in the startup sequence. If the size of the allocation
+ depends on <varname>MaxBackends</varname> or other variables that are
+ not yet initialized when <function>_PG_init()</function> is called, the
+ size can still be adjusted later by registering a
+ <literal>shmem_request_hook</literal> and changing the descriptor there.
+ When the memory is allocated, the registered
+ <function>init_fn</function> callback is called to initialize it.
+ </para>
+ <para>
+ The <function>init_fn()</function> callback is normally called at
+ postmaster startup, when no other processes are running yet and no
+ locking is required. However, if a shared memory area is registered
+ after system start, e.g. in an extension that is not in
+ <xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>,
+ <function>ShmemRegisterStruct()</function> will immediately call
+ the <function>init_fn</function> callback. In that case, it holds a
+ lock internally that prevents concurrent shmem allocations.
+ </para>
+ <para>
+ On Windows, the <function>attach_fn</function> callback is additionally
+ called at every backend startup. It can be used for initializing
+ additional per-backend state related to the shared memory area that is
+ inherited via <function>fork()</function> on other systems. On other
+ platforms, the <function>attach_fn</function> callback is only called
+ for structs that are registered after system startup.
</para>
-
<para>
- An example of a <literal>shmem_request_hook</literal> and
- <literal>shmem_startup_hook</literal> can be found in
+ An example of allocating shared memory can be found in
<filename>contrib/pg_stat_statements/pg_stat_statements.c</filename> in
the <productname>PostgreSQL</productname> source tree.
</para>
@@ -3691,8 +3719,7 @@ LWLockRelease(AddinShmemInitLock);
<para>
There is another, more flexible method of reserving shared memory that
- can be done after server startup and outside a
- <literal>shmem_request_hook</literal>. To do so, each backend that will
+ can be done after server startup. To do so, each backend that will
use the shared memory should obtain a pointer to it by calling:
<programlisting>
void *GetNamedDSMSegment(const char *name, size_t size,
@@ -3711,10 +3738,7 @@ void *GetNamedDSMSegment(const char *name, size_t size,
</para>
<para>
- Unlike shared memory reserved at server startup, there is no need to
- acquire <function>AddinShmemInitLock</function> or otherwise take action
- to avoid race conditions when reserving shared memory with
- <function>GetNamedDSMSegment</function>. This function ensures that only
+ <function>GetNamedDSMSegment</function> ensures that only
one backend allocates and initializes the segment and that all other
backends receive a pointer to the fully allocated and initialized
segment.
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 68a42de0889..f5ae0cfa648 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -370,6 +370,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
InitializeFastPathLocks();
+ RegisterShmemStructs();
+
CreateSharedMemoryAndSemaphores();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 30357845729..fecae827e5b 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -49,6 +49,7 @@
#include "replication/walreceiver.h"
#include "storage/dsm.h"
#include "storage/io_worker.h"
+#include "storage/ipc.h"
#include "storage/pg_shmem.h"
#include "tcop/backend_startup.h"
#include "utils/memutils.h"
@@ -677,7 +678,10 @@ SubPostmasterMain(int argc, char *argv[])
/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ RegisterShmemStructs();
+ }
/*
* Run the appropriate Main function
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 3fac46c402b..a3be3bbe3b6 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -958,6 +958,9 @@ PostmasterMain(int argc, char *argv[])
*/
InitializeFastPathLocks();
+ /* Register the shared memory needs of all core subsystems. */
+ RegisterShmemStructs();
+
/*
* Give preloaded libraries a chance to request additional shared memory.
*/
@@ -3235,7 +3238,13 @@ PostmasterStateMachine(void)
/* re-read control file into local memory */
LocalProcessControlFile(true);
- /* re-create shared memory and semaphores */
+ /*
+ * Re-initialize shared memory and semaphores. Note: We don't call
+ * RegisterShmemStructs() here, we keep the old registrations. In
+ * order to re-register structs in extensions, we'd need to reload
+ * shared preload libraries, and we don't want to do that.
+ */
+ ResetShmemAllocator();
CreateSharedMemoryAndSemaphores();
UpdatePMState(PM_STARTUP);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 3d3f153809b..405c69655f0 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -99,10 +99,12 @@ CalculateShmemSize(void)
* during the actual allocation phase.
*/
size = 100000;
- size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
- sizeof(ShmemIndexEnt)));
+ size = add_size(size, ShmemRegisteredSize());
+
size = add_size(size, dsm_estimate_size());
size = add_size(size, DSMRegistryShmemSize());
+
+ /* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
@@ -218,6 +220,10 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
+ /* Reserve space for semaphores. */
+ if (!IsUnderPostmaster)
+ PGReserveSemaphores(ProcGlobalSemas());
+
/* Initialize subsystems */
CreateOrAttachShmemStructs();
@@ -231,6 +237,22 @@ CreateSharedMemoryAndSemaphores(void)
shmem_startup_hook();
}
+/*
+ * Early initialization of various subsystems, giving them a chance to
+ * register their shared memory needs before the shared memory segment is
+ * allocated.
+ */
+void
+RegisterShmemStructs(void)
+{
+ /*
+ * TODO: Not used in any built-in subsystems yet. In the future, most of
+ * the *ShmemInit() calls in CreateOrAttachShmemStructs(), and
+ * *ShmemSize() calls in CalculateShmemSize() will be replaced by calls
+ * into the subsystems from here.
+ */
+}
+
/*
* Initialize various subsystems, setting up their data structures in
* shared memory.
@@ -249,10 +271,26 @@ CreateSharedMemoryAndSemaphores(void)
static void
CreateOrAttachShmemStructs(void)
{
- /*
- * Now initialize LWLocks, which do shared memory allocation.
- */
- CreateLWLocks();
+#ifdef EXEC_BACKEND
+ if (IsUnderPostmaster)
+ {
+ /*
+ * ShmemAttachRegistered() uses LWLocks. Fortunately, LWLocks don't
+ * need any special attaching.
+ */
+ ShmemAttachRegistered();
+ }
+ else
+#endif
+ {
+ /*
+ * Initialize LWLocks first, in case any of the shmem init functions
+ * use LWLocks. (Nothing else can be running during startup, so they
+ * don't need to do any locking yet, but we nevertheless allow it.)
+ */
+ CreateLWLocks();
+ ShmemInitRegistered();
+ }
dsm_shmem_init();
DSMRegistryShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index dce355e6683..d702db70f0c 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,48 +19,101 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * NOTES:
- * (a) There are three kinds of shared memory data structures
- * available to POSTGRES: fixed-size structures, queues and hash
- * tables. Fixed-size structures contain things like global variables
- * for a module and should never be allocated after the shared memory
- * initialization phase. Hash tables have a fixed maximum size, but
- * their actual size can vary dynamically. When entries are added
- * to the table, more space is allocated. Queues link data structures
- * that have been allocated either within fixed-size structures or as hash
- * buckets. Each shared data structure has a string name to identify
- * it (assigned in the module that declares it).
- *
- * (b) During initialization, each module looks for its
- * shared data structures in a hash table called the "Shmem Index".
- * If the data structure is not present, the caller can allocate
- * a new one and initialize it. If the data structure is present,
- * the caller "attaches" to the structure by initializing a pointer
- * in the local address space.
- * The shmem index has two purposes: first, it gives us
- * a simple model of how the world looks when a backend process
- * initializes. If something is present in the shmem index,
- * it is initialized. If it is not, it is uninitialized. Second,
- * the shmem index allows us to allocate shared memory on demand
- * instead of trying to preallocate structures and hard-wire the
- * sizes and locations in header files. If you are using a lot
- * of shared memory in a lot of different places (and changing
- * things during development), this is important.
- *
- * (c) In standard Unix-ish environments, individual backends do not
- * need to re-establish their local pointers into shared memory, because
- * they inherit correct values of those variables via fork() from the
- * postmaster. However, this does not work in the EXEC_BACKEND case.
- * In ports using EXEC_BACKEND, new backends have to set up their local
- * pointers using the method described in (b) above.
- *
- * (d) memory allocation model: shared memory can never be
- * freed, once allocated. Each hash table has its own free list,
- * so hash buckets can be reused when an item is deleted. However,
- * if one hash table grows very large and then shrinks, its space
- * cannot be redistributed to other tables. We could build a simple
- * hash bucket garbage collector if need be. Right now, it seems
- * unnecessary.
+ * Two kinds of shared memory data structures are handled by this module:
+ * fixed-size structures and hash tables. Fixed-size structures contain
+ * things like variables shared between all backend processes. Hash tables
+ * have a fixed maximum size, but their actual size can vary dynamically.
+ * When entries are added to the table, more space is allocated. Each shared
+ * data structure and hash has a string name to identify it, specified in the
+ * descriptor when it's registered.
+ *
+ * Shared memory structs and hash tables should not be allocated after
+ * postmaster startup, although we do allow small allocations later for the
+ * benefit of extension modules that are loaded after startup. Despite that
+ * allowance, extensions that need shared memory should be added in
+ * shared_preload_libraries, because the allowance is quite small and there is
+ * no guarantee that any memory is available after startup.
+ *
+ * Nowadays, there is also a third way to allocate shared memory called Dynamic
+ * Shared Memory. See dsm.c for that facility. One big difference between
+ * traditional shared memory handled by shmem.c and dynamic shared memory is
+ * that traditional shared memory areas are mapped to the same address in all
+ * processes, so you can use normal pointers in shared memory structs. With
+ * Dynamic Shared Memory, you must use offsets or DSA pointers instead.
+ *
+ * Shared memory managed by shmem.c can never be freed, once allocated. Each
+ * hash table has its own free list, so hash buckets can be reused when an
+ * item is deleted. However, if one hash table grows very large and then
+ * shrinks, its space cannot be redistributed to other tables. We could build
+ * a simple hash bucket garbage collector if need be. Right now, it seems
+ * unnecessary.
+ *
+ * Usage
+ * -----
+ *
+ * To allocate a shared memory area, fill in the name, size, and any other
+ * options in ShmemStructDesc, and call ShmemRegisterStruct(). Leave any
+ * unused fields as zeros.
+ *
+ * typedef struct MyShmemData {
+ * ...
+ * } MyShmemData;
+ *
+ * static MyShmemData *MyShmem;
+ *
+ * static void my_shmem_init(void *arg);
+ *
+ * static ShmemStructDesc MyShmemDesc = {
+ * .name = "My shmem area",
+ * .size = sizeof(MyShmemData),
+ * .init_fn = my_shmem_init,
+ * .ptr = (void **) &MyShmem,
+ * };
+ *
+ * In the subsystem's initialization code (or in _PG_init() in extensions),
+ * call ShmemRegisterStruct(&MyShmemDesc).
+ *
+ * Lifecycle
+ * ---------
+ *
+ * RegisterShmemStructs() is called at postmaster startup before calculating
+ * the size of the global shared memory segment. Once all the registrations
+ * have been done, postmaster calls ShmemRegisteredSize() to add up the sizes
+ * of all the registered areas. After allocating the shared memory segment,
+ * postmaster calls ShmemInitRegistered(), which calls the init_fn callback,
+ * if any, of each registered area, in the order that they were registered.
+ *
+ * In standard Unix-ish environments, individual backends do not need to
+ * re-establish their local pointers into shared memory, because they inherit
+ * correct values of those variables via fork() from the postmaster. However,
+ * this does not work in the EXEC_BACKEND case. In ports using EXEC_BACKEND,
+ * backend startup also calls RegisterShmemStructs(), followed by
+ * ShmemAttachRegistered(), which re-establishes the pointer variables
+ * (*ShmemStructDesc->ptr), and calls the attach_fn callback, if any, for
+ * additional per-backend setup.
+ *
+ * Legacy ShmemInitStruct()/ShmemInitHash() functions
+ * --------------------------------------------------
+ *
+ * ShmemInitStruct()/ShmemInitHash() are another way of registering shmem
+ * areas. They pre-date the ShmemRegisterStruct()/ShmemRegisterHash()
+ * functions and should not be used in new code, but as of this writing they
+ * are still widely used in extensions.
+ *
+ * To allocate a shmem area with ShmemInitStruct(), you need to separately
+ * register the size needed for the area by calling RequestAddinShmemSpace()
+ * from the extension's shmem_request_hook, and allocate the area by calling
+ * ShmemInitStruct() from the extension's shmem_startup_hook. There are no
+ * init/attach callbacks. Instead, the caller of ShmemInitStruct() must check
+ * the return status of ShmemInitStruct() and initialize the struct if it was
+ * not previously initialized.
+ *
+ * Calling ShmemAlloc() directly
+ * -----------------------------
+ *
+ * There's a more low-level way of allocating shared memory too: you can call
+ * ShmemAlloc() directly. It's used to implement the higher level mechanisms,
+ * and should generally not be called directly.
*/
#include "postgres.h"
@@ -79,6 +132,28 @@
#include "utils/builtins.h"
#include "utils/tuplestore.h"
+/*
+ * Array of registered shared memory areas.
+ *
+ * This is in process private memory, although on Unix-like systems, we expect
+ * all the registrations to happen at postmaster startup time and be inherited
+ * by all the child processes via fork(). Extensions may register additional
+ * areas after startup, but only areas registered at postmaster startup are
+ * included in the estimate for the total memory needed for shared memory. If
+ * any non-trivial allocations are made after startup, there might not be
+ * enough shared memory available.
+ */
+static struct
+{
+ ShmemStructDesc *desc; /* registered descriptor */
+ bool legacy; /* legacy ShmemInitStruct/Hash entry? */
+} *registered_shmem_areas;
+static int num_registered_shmem_areas = 0;
+static int max_registered_shmem_areas = 0; /* allocated size of the array */
+
+/* initial size of registered_shmem_areas; grows as needed (not a hard limit) */
+#define INITIAL_REGISTRY_SIZE (64)
+
/*
* This is the first data structure stored in the shared memory segment, at
* the offset that PGShmemHeader->content_offset points to. Allocations by
@@ -96,8 +171,13 @@ typedef struct ShmemAllocatorData
slock_t shmem_lock;
} ShmemAllocatorData;
+static bool ShmemRegisterStructInternal(ShmemStructDesc *desc, bool legacy);
+static bool ShmemRegisterHashInternal(ShmemHashDesc *desc, bool legacy);
static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void shmem_hash_init(void *arg);
+static void shmem_hash_attach(void *arg);
+
/* shared memory global variables */
static PGShmemHeader *ShmemSegHdr; /* shared mem segment header */
@@ -106,20 +186,332 @@ static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
slock_t *ShmemLock; /* points to ShmemAllocator->shmem_lock */
-static HTAB *ShmemIndex = NULL; /* primary index hashtable for shmem */
+
+static bool shmem_initialized = false;
+
+/*
+ * ShmemIndex is a global directory of shmem areas, itself also stored in the
+ * shared memory.
+ */
+static HTAB *ShmemIndex;
+
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE (48)
+
+/*
+ * # of additional entries to reserve in the shmem index table, for allocations
+ * after postmaster startup (not a hard limit)
+ */
+#define SHMEM_INDEX_ADDITIONAL_SIZE (64)
+
+/* this is a hash bucket in the shmem index table */
+typedef struct
+{
+ char key[SHMEM_INDEX_KEYSIZE]; /* string name */
+ void *location; /* location in shared mem */
+ Size size; /* # bytes requested for the structure */
+ Size allocated_size; /* # bytes actually allocated */
+} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
Datum pg_numa_available(PG_FUNCTION_ARGS);
+/*
+ * ShmemRegisterStruct() --- register a shared memory struct
+ *
+ * Subsystems call this to register their shared memory needs. That should be
+ * done early in postmaster startup, before the shared memory segment has been
+ * created, so that the size can be included in the estimate for total amount
+ * of shared memory needed. We set aside a small amount of memory for
+ * allocations that happen later, for the benefit of non-preloaded extensions,
+ * but that should not be relied upon.
+ *
+ * In core subsystems, each subsystem's registration function is called from
+ * RegisterShmemStructs(). In extensions, this should be called from the
+ * _PG_init() function. In EXEC_BACKEND mode, this also needs to be called in
+ * each child process, to reattach and set the pointer to the shared memory
+ * area, usually in a global variable. Calling this from the _PG_init()
+ * initializer takes care of that too.
+ *
+ * When called during postmaster startup, before the shared memory has been
+ * allocated, the function merely remembers the registered descriptor, but the
+ * descriptor may still be changed later, until the shared memory segment has
+ * been allocated. That means that an extension may still modify the
+ * already-registered descriptor in the shmem_request_hook. A common example
+ * of when that's useful is when the size depends on MaxBackends: you can
+ * leave the size empty in the ShmemRegisterStruct() call and fill it later in
+ * the shmem_request_hook.
+ *
+ * Returns true if the struct was already initialized in shared memory and we
+ * merely attached to it.
+ */
+bool
+ShmemRegisterStruct(ShmemStructDesc *desc)
+{
+ return ShmemRegisterStructInternal(desc, false);
+}
+
+static bool
+ShmemRegisterStructInternal(ShmemStructDesc *desc, bool legacy)
+{
+ bool found;
+
+ /* Check that it's not already registered in this process */
+ for (int i = 0; i < num_registered_shmem_areas; i++)
+ {
+ ShmemStructDesc *existing = registered_shmem_areas[i].desc;
+
+ if (strcmp(existing->name, desc->name) == 0)
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" is already registered",
+ desc->name),
+ errbacktrace()));
+ }
+
+ /* desc->ptr can be non-NULL when re-initializing after crash */
+ if (!IsUnderPostmaster && desc->ptr)
+ *desc->ptr = NULL;
+
+ /* Add the descriptor to the array, growing the array if needed */
+ if (num_registered_shmem_areas == max_registered_shmem_areas)
+ {
+ int new_size;
+
+ if (registered_shmem_areas)
+ {
+ new_size = max_registered_shmem_areas * 2;
+ registered_shmem_areas = repalloc(registered_shmem_areas,
+ new_size * sizeof(*registered_shmem_areas));
+ }
+ else
+ {
+ new_size = INITIAL_REGISTRY_SIZE;
+ registered_shmem_areas = MemoryContextAlloc(TopMemoryContext,
+ new_size * sizeof(*registered_shmem_areas));
+ }
+ max_registered_shmem_areas = new_size;
+ }
+ registered_shmem_areas[num_registered_shmem_areas].desc = desc;
+ registered_shmem_areas[num_registered_shmem_areas].legacy = legacy;
+ num_registered_shmem_areas++;
+
+ /*
+ * If called after postmaster startup, we need to immediately also
+ * initialize or attach to the area.
+ */
+ if (shmem_initialized)
+ {
+ ShmemIndexEnt *index_entry;
+
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, desc->name, HASH_ENTER_NULL, &found);
+ if (!index_entry)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+ if (found)
+ {
+ /* Already present, just attach to it */
+ if (index_entry->size != desc->size)
+ elog(ERROR, "shared memory struct \"%s\" is already registered with different size",
+ desc->name);
+ if (desc->ptr)
+ *desc->ptr = index_entry->location;
+ if (desc->attach_fn)
+ desc->attach_fn(desc->attach_fn_arg);
+ }
+ else
+ {
+ /*
+ * This is the first time. Initialize it like
+ * ShmemInitRegistered() would
+ */
+ size_t allocated_size;
+ void *structPtr;
+
+ structPtr = ShmemAllocRaw(desc->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, desc->name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ desc->name, desc->size)));
+ }
+ index_entry->size = desc->size;
+ index_entry->allocated_size = allocated_size;
+ index_entry->location = structPtr;
+ if (desc->ptr)
+ *desc->ptr = index_entry->location;
+
+ /*
+ * XXX: if this errors out, the area is left in a half-initialized
+ * state
+ */
+ if (desc->init_fn)
+ desc->init_fn(desc->init_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+ }
+ else
+ found = false;
+
+ return found;
+}
+
+/*
+ * ShmemRegisteredSize() --- estimate the total size of all registered shared
+ * memory structures.
+ *
+ * This is called once at postmaster startup, before the shared memory segment
+ * has been created.
+ */
+size_t
+ShmemRegisteredSize(void)
+{
+ size_t size;
+
+ /* memory needed for the ShmemIndex */
+ size = hash_estimate_size(num_registered_shmem_areas + SHMEM_INDEX_ADDITIONAL_SIZE,
+ sizeof(ShmemIndexEnt));
+
+ /* memory needed for all the registered areas */
+ for (int i = 0; i < num_registered_shmem_areas; i++)
+ {
+ ShmemStructDesc *desc = registered_shmem_areas[i].desc;
+
+ size = add_size(size, desc->size);
+ size = add_size(size, desc->extra_size);
+ }
+
+ return size;
+}
+
+/*
+ * ShmemInitRegistered() --- allocate and initialize pre-registered shared
+ * memory structures.
+ *
+ * This is called once at postmaster startup, after the shared memory segment
+ * has been created.
+ */
+void
+ShmemInitRegistered(void)
+{
+ /* Should be called only by the postmaster or a standalone backend. */
+ Assert(!IsUnderPostmaster);
+ Assert(!shmem_initialized);
+
+ /*
+ * Initialize all the registered memory areas. There are no concurrent
+ * processes yet, so no need for locking.
+ */
+ for (int i = 0; i < num_registered_shmem_areas; i++)
+ {
+ ShmemStructDesc *desc = registered_shmem_areas[i].desc;
+ size_t allocated_size;
+ void *structPtr;
+ bool found;
+ ShmemIndexEnt *index_entry;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, desc->name, HASH_ENTER_NULL, &found);
+ if (!index_entry)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+ if (found)
+ elog(ERROR, "shared memory struct \"%s\" is already initialized", desc->name);
+
+ /* allocate and initialize it */
+ structPtr = ShmemAllocRaw(desc->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, desc->name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ desc->name, desc->size)));
+ }
+ index_entry->size = desc->size;
+ index_entry->allocated_size = allocated_size;
+ index_entry->location = structPtr;
+
+ *(desc->ptr) = structPtr;
+ if (desc->init_fn)
+ desc->init_fn(desc->init_fn_arg);
+ }
+
+ shmem_initialized = true;
+}
+
+/*
+ * Call the attach_fn callbacks of all registered shmem areas
+ *
+ * This is called at backend startup, in EXEC_BACKEND mode.
+ */
+#ifdef EXEC_BACKEND
+void
+ShmemAttachRegistered(void)
+{
+ /* Must be initializing a (non-standalone) backend */
+ Assert(IsUnderPostmaster);
+ Assert(ShmemAllocator->index != NULL);
+
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+
+ for (int i = 0; i < num_registered_shmem_areas; i++)
+ {
+ ShmemStructDesc *desc = registered_shmem_areas[i].desc;
+ bool found;
+ ShmemIndexEnt *result;
+
+ /* look it up in the shmem index */
+ result = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, desc->name, HASH_FIND, &found);
+ if (!found)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not find ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+
+ if (desc->ptr)
+ *desc->ptr = result->location;
+ if (desc->attach_fn)
+ desc->attach_fn(desc->attach_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+
+ shmem_initialized = true;
+}
+#endif
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
* Called at postmaster or stand-alone backend startup, to initialize the
* allocator's data structure in the shared memory segment. In EXEC_BACKEND,
- * this is also called at backend startup, to set up pointers to the shared
- * memory areas.
+ * this is also called at backend startup, to set up pointers to the
+ * already-initialized data structure.
*/
void
InitShmemAllocator(PGShmemHeader *seghdr)
@@ -130,6 +522,7 @@ InitShmemAllocator(PGShmemHeader *seghdr)
int hash_flags;
size_t size;
+ Assert(!shmem_initialized);
Assert(seghdr != NULL);
/*
@@ -174,7 +567,7 @@ InitShmemAllocator(PGShmemHeader *seghdr)
* use ShmemInitHash() here because it relies on ShmemIndex being already
* initialized.
*/
- hash_size = SHMEM_INDEX_SIZE;
+ hash_size = num_registered_shmem_areas + SHMEM_INDEX_ADDITIONAL_SIZE;
info.keysize = SHMEM_INDEX_KEYSIZE;
info.entrysize = sizeof(ShmemIndexEnt);
@@ -193,6 +586,38 @@ InitShmemAllocator(PGShmemHeader *seghdr)
Assert(ShmemIndex != NULL);
}
+/*
+ * Reset the shmem struct registry on postmaster crash restart.
+ */
+void
+ResetShmemAllocator(void)
+{
+ int num_retained;
+
+ shmem_initialized = false;
+
+ /*
+ * Shared memory areas will not be registered again after a crash restart.
+ * We don't call RegisterShmemStructs() on crash restart, which would
+ * re-register core subsystems, and we don't reload
+ * shared_preload_libraries either.
+ *
+ * However, we do expect the legacy ShmemInitStruct() function will be
+ * called again for each area, so remove those from the registry.
+ */
+ num_retained = 0;
+ for (int i = 0; i < num_registered_shmem_areas; i++)
+ {
+ if (!registered_shmem_areas[i].legacy)
+ {
+ if (num_retained != i)
+ registered_shmem_areas[num_retained] = registered_shmem_areas[i];
+ num_retained++;
+ }
+ }
+ num_registered_shmem_areas = num_retained;
+}
+
/*
* ShmemAlloc -- allocate max-aligned chunk from shared memory
*
@@ -290,42 +715,20 @@ ShmemAddrIsValid(const void *addr)
}
/*
- * ShmemInitHash -- Create and initialize, or attach to, a
- * shared memory hash table.
+ * ShmemRegisterHash -- Register a shared memory hash table.
*
- * We assume caller is doing some kind of synchronization
- * so that two processes don't try to create/initialize the same
- * table at once. (In practice, all creations are done in the postmaster
- * process; child processes should always be attaching to existing tables.)
- *
- * max_size is the estimated maximum number of hashtable entries. This is
- * not a hard limit, but the access efficiency will degrade if it is
- * exceeded substantially (since it's used to compute directory size and
- * the hash table buckets will get overfull).
- *
- * init_size is the number of hashtable entries to preallocate. For a table
- * whose maximum size is certain, this should be equal to max_size; that
- * ensures that no run-time out-of-shared-memory failures can occur.
- *
- * *infoP and hash_flags must specify at least the entry sizes and key
- * comparison semantics (see hash_create()). Flag bits and values specific
- * to shared-memory hash tables are added here, except that callers may
- * choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
- *
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Similar to ShmemRegisterStruct(), but registers a hash table instead of an
+ * opaque area.
*/
-HTAB *
-ShmemInitHash(const char *name, /* table string name for shmem index */
- int64 init_size, /* initial table size */
- int64 max_size, /* max size of the table */
- HASHCTL *infoP, /* info about key and bucket size */
- int hash_flags) /* info about infoP */
+bool
+ShmemRegisterHash(ShmemHashDesc *desc)
{
- bool found;
- void *location;
+ return ShmemRegisterHashInternal(desc, false);
+}
+static bool
+ShmemRegisterHashInternal(ShmemHashDesc *desc, bool legacy)
+{
/*
* Hash tables allocated in shared memory have a fixed directory; it can't
* grow or other backends wouldn't be able to find it. So, make sure we
@@ -333,26 +736,56 @@ ShmemInitHash(const char *name, /* table string name for shmem index */
*
* The shared memory allocator must be specified too.
*/
- infoP->dsize = infoP->max_dsize = hash_select_dirsize(max_size);
- infoP->alloc = ShmemAllocNoError;
- hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
-
- /* look it up in the shmem index */
- location = ShmemInitStruct(name,
- hash_get_shared_size(infoP, hash_flags),
- &found);
+ desc->hash_info.dsize = desc->hash_info.max_dsize = hash_select_dirsize(desc->max_size);
+ desc->hash_info.alloc = ShmemAllocNoError;
+ desc->hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+
+ /* Set up the base struct descriptor */
+ memset(&desc->base_desc, 0, sizeof(desc->base_desc));
+ desc->base_desc.name = desc->name;
+ desc->base_desc.size = hash_get_shared_size(&desc->hash_info, desc->hash_flags);
+ desc->base_desc.init_fn = shmem_hash_init;
+ desc->base_desc.init_fn_arg = desc;
+ desc->base_desc.attach_fn = shmem_hash_attach;
+ desc->base_desc.attach_fn_arg = desc;
/*
- * if it already exists, attach to it rather than allocate and initialize
- * new space
+ * We need a stable location to hold the pointer to the shared memory. Use
+ * the one passed in the descriptor for now. The init or attach function
+ * later overwrites it with the backend-private HTAB handle.
*/
- if (found)
- hash_flags |= HASH_ATTACH;
+ desc->base_desc.ptr = (void **) desc->ptr;
+
+ desc->base_desc.extra_size = hash_estimate_size(desc->max_size, desc->hash_info.entrysize) - desc->base_desc.size;
+
+ return ShmemRegisterStructInternal(&desc->base_desc, legacy);
+}
+
+static void
+shmem_hash_init(void *arg)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) arg;
+ int hash_flags = desc->hash_flags;
/* Pass location of hashtable header to hash_create */
- infoP->hctl = (HASHHDR *) location;
+ desc->hash_info.hctl = (HASHHDR *) *desc->base_desc.ptr;
- return hash_create(name, init_size, infoP, hash_flags);
+ *desc->ptr = hash_create(desc->name, desc->init_size, &desc->hash_info, hash_flags);
+}
+
+static void
+shmem_hash_attach(void *arg)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) arg;
+ int hash_flags = desc->hash_flags;
+
+ /* attach to it rather than allocate and initialize new space */
+ hash_flags |= HASH_ATTACH;
+
+ /* Pass location of hashtable header to hash_create */
+ desc->hash_info.hctl = (HASHHDR *) *desc->base_desc.ptr;
+
+ *desc->ptr = hash_create(desc->name, desc->init_size, &desc->hash_info, hash_flags);
}
/*
@@ -367,82 +800,76 @@ ShmemInitHash(const char *name, /* table string name for shmem index */
* Returns: pointer to the object. *foundPtr is set true if the object was
* already in the shmem index (hence, already initialized).
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRegisterStruct() in new code!
*/
void *
ShmemInitStruct(const char *name, Size size, bool *foundPtr)
{
- ShmemIndexEnt *result;
- void *structPtr;
-
- Assert(ShmemIndex != NULL);
+ ShmemStructDesc *desc;
- LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ Assert(shmem_initialized);
- /* look it up in the shmem index */
- result = (ShmemIndexEnt *)
- hash_search(ShmemIndex, name, HASH_ENTER_NULL, foundPtr);
+ desc = MemoryContextAllocZero(TopMemoryContext, sizeof(ShmemStructDesc) + sizeof(void *));
+ desc->name = name;
+ desc->size = size;
+ desc->ptr = (void *) (((char *) desc) + sizeof(ShmemStructDesc));
- if (!result)
- {
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("could not create ShmemIndex entry for data structure \"%s\"",
- name)));
- }
-
- if (*foundPtr)
- {
- /*
- * Structure is in the shmem index so someone else has allocated it
- * already. The size better be the same as the size we are trying to
- * initialize to, or there is a name conflict (or worse).
- */
- if (result->size != size)
- {
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errmsg("ShmemIndex entry size is wrong for data structure"
- " \"%s\": expected %zu, actual %zu",
- name, size, result->size)));
- }
- structPtr = result->location;
- }
- else
- {
- Size allocated_size;
+ *foundPtr = ShmemRegisterStructInternal(desc, true);
+ Assert(*desc->ptr != NULL);
+ return *desc->ptr;
+}
- /* It isn't in the table yet. allocate and initialize it */
- structPtr = ShmemAllocRaw(size, &allocated_size);
- if (structPtr == NULL)
- {
- /* out of memory; remove the failed ShmemIndex entry */
- hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
- " \"%s\" (%zu bytes requested)",
- name, size)));
- }
- result->size = size;
- result->allocated_size = allocated_size;
- result->location = structPtr;
- }
+/*
+ * ShmemInitHash -- Create and initialize, or attach to, a
+ * shared memory hash table.
+ *
+ * We assume caller is doing some kind of synchronization
+ * so that two processes don't try to create/initialize the same
+ * table at once. (In practice, all creations are done in the postmaster
+ * process; child processes should always be attaching to existing tables.)
+ *
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ *
+ * init_size is the number of hashtable entries to preallocate. For a table
+ * whose maximum size is certain, this should be equal to max_size; that
+ * ensures that no run-time out-of-shared-memory failures can occur.
+ *
+ * *infoP and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values specific
+ * to shared-memory hash tables are added here, except that callers may
+ * choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
+ *
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRegisterHash() in new code!
+ */
+HTAB *
+ShmemInitHash(const char *name, /* table string name for shmem index */
+ int64 init_size, /* initial table size */
+ int64 max_size, /* max size of the table */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ ShmemHashDesc *desc;
- LWLockRelease(ShmemIndexLock);
+ Assert(shmem_initialized);
- Assert(ShmemAddrIsValid(structPtr));
+ desc = MemoryContextAllocZero(TopMemoryContext, sizeof(ShmemHashDesc) + sizeof(HTAB *));
+ desc->name = name;
+ desc->init_size = init_size;
+ desc->max_size = max_size;
+ memcpy(&desc->hash_info, infoP, sizeof(HASHCTL));
+ desc->hash_flags = hash_flags;
- Assert(structPtr == (void *) CACHELINEALIGN(structPtr));
+ desc->ptr = (HTAB **) (((char *) desc) + sizeof(ShmemHashDesc));
- return structPtr;
+ ShmemRegisterHashInternal(desc, true);
+ return *desc->ptr;
}
-
/*
* Add two Size values, checking for overflow
*/
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index b3563113219..9bfc663f44a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4163,6 +4163,9 @@ PostgresSingleUserMain(int argc, char *argv[],
/* Initialize size of fast-path lock cache. */
InitializeFastPathLocks();
+ /* Register the shared memory needs of all core subsystems. */
+ RegisterShmemStructs();
+
/*
* Give preloaded libraries a chance to request additional shared memory.
*/
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index da32787ab51..8a3b71ad5d3 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
/* ipci.c */
extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
+extern void RegisterShmemStructs(void);
extern Size CalculateShmemSize(void);
extern void CreateSharedMemoryAndSemaphores(void);
#ifdef EXEC_BACKEND
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 0de8a36429b..08aa6380a43 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -24,18 +24,156 @@
#include "storage/spin.h"
#include "utils/hsearch.h"
+typedef void (*ShmemInitCallback) (void *arg);
+typedef void (*ShmemAttachCallback) (void *arg);
+
+/*
+ * ShmemStructDesc describes a named area or struct in shared memory.
+ *
+ * Shared memory is reserved and allocated in a few phases at postmaster
+ * startup, and in EXEC_BACKEND mode, there's some extra work done to "attach"
+ * to them at backend startup. ShmemStructDesc contains all the information
+ * needed to manage the lifecycle.
+ *
+ * 'name' must be filled in before calling ShmemRegisterStruct(); all other
+ * fields can be adjusted later during postmaster startup, until the shared
+ * memory is allocated and the init callback is called. Initialize any optional
+ * fields that you don't use to zeros.
+ *
+ * After registration, the shmem machinery reserves memory for the area, sets
+ * '*ptr' to point to the allocation, and calls the callbacks at the right
+ * moments.
+ */
+typedef struct ShmemStructDesc
+{
+ /*
+ * Name of the shared memory area. Required; must be unique, and must be
+ * set before calling ShmemRegisterStruct().
+ */
+ const char *name;
+
+ /* Size of the shared memory area. Required. */
+ size_t size;
+
+ /*
+ * Initialization callback function. This is called when the shared
+ * memory area is allocated, usually at postmaster startup. 'init_fn_arg'
+ * is an opaque argument passed to the callback.
+ */
+ ShmemInitCallback init_fn;
+ void *init_fn_arg;
+
+ /*
+ * Attachment callback function. In EXEC_BACKEND mode, this is called at
+ * startup of each backend. In !EXEC_BACKEND mode, this is only called if
+ * the shared memory area is registered after postmaster startup. We
+ * never do that in core code, but extensions might.
+ */
+ ShmemAttachCallback attach_fn;
+ void *attach_fn_arg;
+
+ /*
+ * Extra space to reserve in the shared memory segment that is not part of
+ * the struct itself. This is used for shared memory hash tables, which
+ * can grow beyond the initial size when more buckets are allocated.
+ */
+ size_t extra_size;
+
+ /*
+ * When the shmem area is initialized or attached to, a pointer to it is
+ * stored in *ptr. It usually points to a global variable, used to access
+ * the shared memory area later. *ptr is set before the init_fn or
+ * attach_fn callback is called.
+ */
+ void **ptr;
+} ShmemStructDesc;
+
+/*
+ * Descriptor for a named shared memory hash table.
+ *
+ * Similar to ShmemStructDesc, but describes a shared memory hash table. Each
+ * hash table is backed by an allocated area, described by 'base_desc', but if
+ * 'max_size' is greater than 'init_size', it can also grow beyond the initial
+ * allocated area by allocating more hash entries from the global unreserved
+ * space.
+ */
+typedef struct ShmemHashDesc
+{
+ /*
+ * Name of the shared memory area. Required. Must be unique across the
+ * system.
+ */
+ const char *name;
+
+ /*
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ */
+ size_t max_size;
+
+ /*
+ * init_size is the number of hashtable entries to preallocate. For a
+ * table whose maximum size is certain, this should be equal to max_size;
+ * that ensures that no run-time out-of-shared-memory failures can occur.
+ */
+ size_t init_size;
+
+ /*
+ * Hash table options passed to hash_create()
+ *
+ * hash_info and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values
+ * specific to shared-memory hash tables are added implicitly in
+ * ShmemRegisterHash(), except that callers may choose to specify
+ * HASH_PARTITION and/or HASH_FIXED_SIZE.
+ */
+ HASHCTL hash_info;
+ int hash_flags;
+
+ /*
+ * When the hash table is initialized or attached to, a pointer to its
+ * backend-private handle is stored in *ptr. It usually points to a
+ * global variable, used to access the hash table later.
+ */
+ HTAB **ptr;
+
+ /*
+ * Descriptor for the underlying "area". Callers of ShmemRegisterHash()
+ * do not need to touch this, it is filled in by ShmemRegisterHash() based
+ * on the hash table parameters.
+ */
+ ShmemStructDesc base_desc;
+} ShmemHashDesc;
/* shmem.c */
extern PGDLLIMPORT slock_t *ShmemLock;
typedef struct PGShmemHeader PGShmemHeader; /* avoid including
* storage/pg_shmem.h here */
+extern void ResetShmemAllocator(void);
extern void InitShmemAllocator(PGShmemHeader *seghdr);
+#ifdef EXEC_BACKEND
+extern void AttachShmemAllocator(PGShmemHeader *seghdr);
+#endif
extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
extern bool ShmemAddrIsValid(const void *addr);
+
+extern bool ShmemRegisterStruct(ShmemStructDesc *desc);
+extern bool ShmemRegisterHash(ShmemHashDesc *desc);
+
+/* legacy shmem allocation functions */
extern HTAB *ShmemInitHash(const char *name, int64 init_size, int64 max_size,
HASHCTL *infoP, int hash_flags);
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
+
+extern size_t ShmemRegisteredSize(void);
+extern void ShmemInitRegistered(void);
+#ifdef EXEC_BACKEND
+extern void ShmemAttachRegistered(void);
+#endif
+
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
@@ -44,19 +182,4 @@ extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
-/* size constants for the shmem index table */
- /* max size of data structure string name */
-#define SHMEM_INDEX_KEYSIZE (48)
- /* estimated size of the shmem index table (not a hard limit) */
-#define SHMEM_INDEX_SIZE (64)
-
-/* this is a hash bucket in the shmem index table */
-typedef struct
-{
- char key[SHMEM_INDEX_KEYSIZE]; /* string name */
- void *location; /* location in shared mem */
- Size size; /* # bytes requested for the structure */
- Size allocated_size; /* # bytes actually allocated */
-} ShmemIndexEnt;
-
#endif /* SHMEM_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 174e2798443..8fda83d51f0 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2837,9 +2837,11 @@ SharedTypmodTableEntry
Sharedsort
ShellTypeInfo
ShippableCacheEntry
-ShmemAllocatorData
ShippableCacheKey
+ShmemAllocatorData
ShmemIndexEnt
+ShmemHashDesc
+ShmemStructDesc
ShutdownForeignScan_function
ShutdownInformation
ShutdownMode
--
2.47.3
[text/x-patch] v6-0004-Convert-pg_stat_statements-to-use-the-new-interfa.patch (10.4K, 5-v6-0004-Convert-pg_stat_statements-to-use-the-new-interfa.patch)
From 855a61e67fa9c372e61971fb9ef87dbe10936780 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 13 Mar 2026 23:00:00 +0200
Subject: [PATCH v6 4/6] Convert pg_stat_statements to use the new interface
As part of this, embed the LWLock it needs in the shared memory struct
itself, so that we don't need to use RequestNamedLWLockTranche()
anymore. LWLockNewTrancheId+LWLockInitialize is more convenient to use
in extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
.../pg_stat_statements/pg_stat_statements.c | 161 ++++++++----------
1 file changed, 67 insertions(+), 94 deletions(-)
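For readers skimming the diff, the descriptor pattern that pg_stat_statements switches to here can be modeled outside the server. What follows is only a sketch under stated assumptions, not the patch's implementation: every Demo*/demo_* name is hypothetical, the int field merely stands in for the embedded LWLockPadded, and a calloc'd block stands in for the shared memory segment. It shows the three steps the conversion relies on: a descriptor is registered, the registered sizes are summed in place of a hand-written size function such as pgss_memsize(), and the area's address is published through .ptr before init_fn runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ShmemStructDesc, with only the fields used here. */
typedef struct DemoStructDesc
{
	const char *name;
	size_t		size;
	void		(*init_fn) (void *arg);
	void	  **ptr;			/* where the area's address gets published */
} DemoStructDesc;

#define DEMO_MAX_DESCS 16
#define DEMO_ALIGN(sz) \
	(((sz) + _Alignof(max_align_t) - 1) & ~(_Alignof(max_align_t) - 1))

static DemoStructDesc *demo_descs[DEMO_MAX_DESCS];
static int	demo_ndescs = 0;

/* Remember a descriptor; returns 0 if the table is full. */
static int
demo_register_struct(DemoStructDesc *desc)
{
	if (demo_ndescs >= DEMO_MAX_DESCS)
		return 0;
	demo_descs[demo_ndescs++] = desc;
	return 1;
}

/* Sum of all registered sizes, each rounded up for alignment. */
static size_t
demo_registered_size(void)
{
	size_t		total = 0;

	for (int i = 0; i < demo_ndescs; i++)
		total += DEMO_ALIGN(demo_descs[i]->size);
	return total;
}

/* Allocate one block, publish each area's address, then run its callback. */
static void *
demo_init_registered(void)
{
	size_t		total = demo_registered_size();
	char	   *base = calloc(1, total > 0 ? total : 1);
	size_t		off = 0;

	if (base == NULL)
		return NULL;
	for (int i = 0; i < demo_ndescs; i++)
	{
		void	   *area = base + off;

		/* copy the pointer value into the caller's variable */
		memcpy(demo_descs[i]->ptr, &area, sizeof(area));
		if (demo_descs[i]->init_fn)
			demo_descs[i]->init_fn(NULL);
		off += DEMO_ALIGN(demo_descs[i]->size);
	}
	return base;
}

/* Hypothetical shared state; 'lock' stands in for the embedded lock. */
typedef struct DemoSharedState
{
	int			lock;
	double		cur_median_usage;
} DemoSharedState;

static DemoSharedState *demo_state;

static void
demo_state_init(void *arg)
{
	(void) arg;
	assert(demo_state != NULL); /* pointer is published before the callback */
	demo_state->lock = 1;		/* marks the embedded lock as initialized */
	demo_state->cur_median_usage = 1000.0;
}

static DemoStructDesc demo_state_desc = {
	.name = "demo state",
	.size = sizeof(DemoSharedState),
	.init_fn = demo_state_init,
	.ptr = (void **) &demo_state,
};
```

Calling demo_register_struct(&demo_state_desc) followed by demo_init_registered() leaves demo_state pointing at the start of the block with its fields initialized. Because the lock lives inside the struct, the callback needs nothing beyond the struct's own address, which is the convenience the commit message attributes to embedding the LWLock.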
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 6cb14824ec3..8df4749a43b 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -249,7 +249,7 @@ typedef struct pgssEntry
*/
typedef struct pgssSharedState
{
- LWLock *lock; /* protects hashtable search/modification */
+ LWLockPadded lock; /* protects hashtable search/modification */
double cur_median_usage; /* current median usage in hashtable */
Size mean_query_len; /* current mean entry text length */
slock_t mutex; /* protects following fields only: */
@@ -259,13 +259,39 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+/* Links to shared memory state */
+static pgssSharedState *pgss;
+static HTAB *pgss_hash;
+
+static void pgss_shmem_init(void *arg);
+
+static ShmemStructDesc pgssSharedStateShmemDesc =
+{
+ .name = "pg_stat_statements",
+ .size = sizeof(pgssSharedState),
+ .init_fn = pgss_shmem_init,
+ .ptr = (void **) &pgss,
+};
+
+static ShmemHashDesc pgssSharedHashDesc =
+{
+ .name = "pg_stat_statements hash",
+ .ptr = &pgss_hash,
+
+ .init_size = 0, /* set from 'pgss_max' */
+ .max_size = 0, /* set from 'pgss_max' */
+ .hash_info.keysize = sizeof(pgssHashKey),
+ .hash_info.entrysize = sizeof(pgssEntry),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+};
+
+
/*---- Local variables ----*/
/* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
static int nesting_level = 0;
/* Saved hook values */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
static planner_hook_type prev_planner_hook = NULL;
@@ -275,10 +301,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static ProcessUtility_hook_type prev_ProcessUtility = NULL;
-/* Links to shared memory state */
-static pgssSharedState *pgss = NULL;
-static HTAB *pgss_hash = NULL;
-
/*---- GUC variables ----*/
typedef enum
@@ -331,7 +353,6 @@ PG_FUNCTION_INFO_V1(pg_stat_statements_1_13);
PG_FUNCTION_INFO_V1(pg_stat_statements);
PG_FUNCTION_INFO_V1(pg_stat_statements_info);
-static void pgss_shmem_request(void);
static void pgss_shmem_startup(void);
static void pgss_shmem_shutdown(int code, Datum arg);
static void pgss_post_parse_analyze(ParseState *pstate, Query *query,
@@ -366,7 +387,6 @@ static void pgss_store(const char *query, int64 queryId,
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
bool showtext);
-static Size pgss_memsize(void);
static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
int encoding, bool sticky);
static void entry_dealloc(void);
@@ -471,11 +491,18 @@ _PG_init(void)
MarkGUCPrefixReserved("pg_stat_statements");
+ /*
+ * Register our shared memory needs, including hash table
+ */
+ ShmemRegisterStruct(&pgssSharedStateShmemDesc);
+
+ pgssSharedHashDesc.init_size = pgss_max;
+ pgssSharedHashDesc.max_size = pgss_max;
+ ShmemRegisterHash(&pgssSharedHashDesc);
+
/*
* Install hooks.
*/
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = pgss_shmem_request;
prev_shmem_startup_hook = shmem_startup_hook;
shmem_startup_hook = pgss_shmem_startup;
prev_post_parse_analyze_hook = post_parse_analyze_hook;
@@ -494,31 +521,31 @@ _PG_init(void)
ProcessUtility_hook = pgss_ProcessUtility;
}
-/*
- * shmem_request hook: request additional shared resources. We'll allocate or
- * attach to the shared resources in pgss_shmem_startup().
- */
static void
-pgss_shmem_request(void)
+pgss_shmem_init(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
+ int tranche_id;
- RequestAddinShmemSpace(pgss_memsize());
- RequestNamedLWLockTranche("pg_stat_statements", 1);
+ tranche_id = LWLockNewTrancheId("pg_stat_statements");
+ LWLockInitialize(&pgss->lock.lock, tranche_id);
+ pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
+ pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ SpinLockInit(&pgss->mutex);
+ pgss->extent = 0;
+ pgss->n_writers = 0;
+ pgss->gc_count = 0;
+ pgss->stats.dealloc = 0;
+ pgss->stats.stats_reset = GetCurrentTimestamp();
}
/*
- * shmem_startup hook: allocate or attach to shared memory,
- * then load any pre-existing statistics from file.
- * Also create and load the query-texts file, which is expected to exist
- * (even if empty) while the module is enabled.
+ * shmem_startup hook: Load any pre-existing statistics from file at
+ * postmaster startup. Also create and load the query-texts file, which is
+ * expected to exist (even if empty) while the module is enabled.
*/
static void
pgss_shmem_startup(void)
{
- bool found;
- HASHCTL info;
FILE *file = NULL;
FILE *qfile = NULL;
uint32 header;
@@ -531,54 +558,14 @@ pgss_shmem_startup(void)
if (prev_shmem_startup_hook)
prev_shmem_startup_hook();
- /* reset in case this is a restart within the postmaster */
- pgss = NULL;
- pgss_hash = NULL;
+ if (IsUnderPostmaster)
+ return; /* nothing to do in backends */
/*
- * Create or attach to the shared memory state, including hash table
+ * Set up a shmem exit hook to dump the statistics to disk on postmaster
+ * (or standalone backend) exit.
*/
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- pgss = ShmemInitStruct("pg_stat_statements",
- sizeof(pgssSharedState),
- &found);
-
- if (!found)
- {
- /* First time through ... */
- pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
- pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
- SpinLockInit(&pgss->mutex);
- pgss->extent = 0;
- pgss->n_writers = 0;
- pgss->gc_count = 0;
- pgss->stats.dealloc = 0;
- pgss->stats.stats_reset = GetCurrentTimestamp();
- }
-
- info.keysize = sizeof(pgssHashKey);
- info.entrysize = sizeof(pgssEntry);
- pgss_hash = ShmemInitHash("pg_stat_statements hash",
- pgss_max, pgss_max,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- LWLockRelease(AddinShmemInitLock);
-
- /*
- * If we're in the postmaster (or a standalone backend...), set up a shmem
- * exit hook to dump the statistics to disk.
- */
- if (!IsUnderPostmaster)
- on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
-
- /*
- * Done if some other process already completed our initialization.
- */
- if (found)
- return;
+ on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
/*
* Note: we don't bother with locks here, because there should be no other
@@ -1338,7 +1325,7 @@ pgss_store(const char *query, int64 queryId,
key.toplevel = (nesting_level == 0);
/* Lookup the hash table entry with shared lock. */
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
@@ -1359,11 +1346,11 @@ pgss_store(const char *query, int64 queryId,
*/
if (jstate)
{
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
norm_query = generate_normalized_query(jstate, query,
query_location,
&query_len);
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
}
/* Append new query text to file with only shared lock held */
@@ -1378,8 +1365,8 @@ pgss_store(const char *query, int64 queryId,
do_gc = need_gc_qtexts();
/* Need exclusive lock to make a new hashtable entry - promote */
- LWLockRelease(pgss->lock);
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockRelease(&pgss->lock.lock);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
/*
* A garbage collection may have occurred while we weren't holding the
@@ -1518,7 +1505,7 @@ pgss_store(const char *query, int64 queryId,
}
done:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
/* We postpone this clean-up until we're out of the lock */
if (norm_query)
@@ -1807,7 +1794,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
* we need to partition the hash table to limit the time spent holding any
* one lock.
*/
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
if (showtext)
{
@@ -2044,7 +2031,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
}
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
free(qbuffer);
}
@@ -2083,20 +2070,6 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
-/*
- * Estimate shared memory space needed.
- */
-static Size
-pgss_memsize(void)
-{
- Size size;
-
- size = MAXALIGN(sizeof(pgssSharedState));
- size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
-
- return size;
-}
-
/*
* Allocate a new hashtable entry.
* caller must hold an exclusive lock on pgss->lock
@@ -2726,7 +2699,7 @@ entry_reset(Oid userid, Oid dbid, int64 queryid, bool minmax_only)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("pg_stat_statements must be loaded via \"shared_preload_libraries\"")));
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
num_entries = hash_get_num_entries(pgss_hash);
stats_reset = GetCurrentTimestamp();
@@ -2820,7 +2793,7 @@ done:
record_gc_qtexts();
release_lock:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
return stats_reset;
}
--
2.47.3
[text/x-patch] v6-0005-move-pgss-shmem_startup-hook-code-into-the-new-in.patch (4.0K, 6-v6-0005-move-pgss-shmem_startup-hook-code-into-the-new-in.patch)
From 5b63794f4ae24a940aef1238ba503c51b0ee393a Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 19 Mar 2026 11:57:02 +0200
Subject: [PATCH v6 5/6] move pgss shmem_startup hook code into the new init_fn
callback
---
.../pg_stat_statements/pg_stat_statements.c | 56 +++++++++----------
1 file changed, 26 insertions(+), 30 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 8df4749a43b..8c84232f4d0 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -292,7 +292,6 @@ static ShmemHashDesc pgssSharedHashDesc =
static int nesting_level = 0;
/* Saved hook values */
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
static planner_hook_type prev_planner_hook = NULL;
static ExecutorStart_hook_type prev_ExecutorStart = NULL;
@@ -353,7 +352,6 @@ PG_FUNCTION_INFO_V1(pg_stat_statements_1_13);
PG_FUNCTION_INFO_V1(pg_stat_statements);
PG_FUNCTION_INFO_V1(pg_stat_statements_info);
-static void pgss_shmem_startup(void);
static void pgss_shmem_shutdown(int code, Datum arg);
static void pgss_post_parse_analyze(ParseState *pstate, Query *query,
JumbleState *jstate);
@@ -492,19 +490,18 @@ _PG_init(void)
MarkGUCPrefixReserved("pg_stat_statements");
/*
- * Register our shared memory needs, including hash table
+ * Register our shared memory needs. Register the hash table first, so
+ * that it's already initialized when pgss_shmem_init() is called.
*/
- ShmemRegisterStruct(&pgssSharedStateShmemDesc);
-
pgssSharedHashDesc.init_size = pgss_max;
pgssSharedHashDesc.max_size = pgss_max;
ShmemRegisterHash(&pgssSharedHashDesc);
+ ShmemRegisterStruct(&pgssSharedStateShmemDesc);
+
/*
* Install hooks.
*/
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = pgss_shmem_startup;
prev_post_parse_analyze_hook = post_parse_analyze_hook;
post_parse_analyze_hook = pgss_post_parse_analyze;
prev_planner_hook = planner_hook;
@@ -521,11 +518,29 @@ _PG_init(void)
ProcessUtility_hook = pgss_ProcessUtility;
}
+/*
+ * Initialize our shared memory data structures at postmaster startup.
+ *
+ * Load any pre-existing statistics from file. Also create and load the
+ * query-texts file, which is expected to exist (even if empty) while the
+ * module is enabled.
+ */
static void
pgss_shmem_init(void *arg)
{
int tranche_id;
+ FILE *file = NULL;
+ FILE *qfile = NULL;
+ uint32 header;
+ int32 num;
+ int32 pgver;
+ int32 i;
+ int buffer_size;
+ char *buffer = NULL;
+ /*
+ * Initialize the shmem area with no statistics.
+ */
tranche_id = LWLockNewTrancheId("pg_stat_statements");
LWLockInitialize(&pgss->lock.lock, tranche_id);
pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
@@ -536,30 +551,9 @@ pgss_shmem_init(void *arg)
pgss->gc_count = 0;
pgss->stats.dealloc = 0;
pgss->stats.stats_reset = GetCurrentTimestamp();
-}
-
-/*
- * shmem_startup hook: Load any pre-existing statistics from file at
- * postmaster startup. Also create and load the query-texts file, which is
- * expected to exist (even if empty) while the module is enabled.
- */
-static void
-pgss_shmem_startup(void)
-{
- FILE *file = NULL;
- FILE *qfile = NULL;
- uint32 header;
- int32 num;
- int32 pgver;
- int32 i;
- int buffer_size;
- char *buffer = NULL;
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- if (IsUnderPostmaster)
- return; /* nothing to do in backends */
+ /* The hash table must be initialized already */
+ Assert(pgss_hash != NULL);
/*
* Set up a shmem exit hook to dump the statistics to disk on postmaster
@@ -568,6 +562,8 @@ pgss_shmem_startup(void)
on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
/*
+ * Load any pre-existing statistics from file.
+ *
* Note: we don't bother with locks here, because there should be no other
* processes running when this code is reached.
*/
--
2.47.3
[text/x-patch] v6-0006-Use-the-new-mechanism-in-a-few-core-subsystems.patch (40.8K, 7-v6-0006-Use-the-new-mechanism-in-a-few-core-subsystems.patch)
From 7160ae00657d5e40e74b3b0f31de14435db73190 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 13 Mar 2026 23:01:44 +0200
Subject: [PATCH v6 6/6] Use the new mechanism in a few core subsystems
I chose these subsystems specifically because they have some
complicating properties, making them slightly harder to convert than
most:
- The initialization callbacks of some of these subsystems have
dependencies, i.e. they need to be initialized in the right order.
- The ProcGlobal pointer still needs to be inherited by the
BackendParameters mechanism on EXEC_BACKEND builds, because
ProcGlobal is required by InitProcess() to get a PGPROC entry, and
the PGPROC entry is required to use LWLocks, and usually attaching
to shared memory areas requires the use of LWLocks.
- Similarly, ProcSignal pointer still needs to be handled by
BackendParameters, because query cancellation connections access it
without calling InitProcess.
I believe converting the rest of the subsystems after this will
be pretty mechanical.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/backend/access/transam/varsup.c | 33 ++---
src/backend/storage/ipc/dsm.c | 61 ++++----
src/backend/storage/ipc/dsm_registry.c | 34 ++---
src/backend/storage/ipc/ipci.c | 50 +++----
src/backend/storage/ipc/pmsignal.c | 49 ++++---
src/backend/storage/ipc/procarray.c | 122 ++++++++--------
src/backend/storage/ipc/procsignal.c | 63 ++++-----
src/backend/storage/ipc/sinvaladt.c | 36 ++---
src/backend/storage/lmgr/proc.c | 178 ++++++++++++------------
src/backend/utils/activity/wait_event.c | 94 ++++++-------
src/include/access/transam.h | 3 +-
src/include/storage/dsm.h | 3 +-
src/include/storage/dsm_registry.h | 3 +-
src/include/storage/pmsignal.h | 3 +-
src/include/storage/proc.h | 2 +-
src/include/storage/procarray.h | 3 +-
src/include/storage/procsignal.h | 3 +-
src/include/storage/sinvaladt.h | 3 +-
src/include/utils/wait_event.h | 3 +-
19 files changed, 366 insertions(+), 380 deletions(-)
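The ordering dependency mentioned in the commit message can be illustrated with a small standalone model. This is a sketch only: all demo_* names are hypothetical, and it assumes init callbacks run in registration order, which is what the 0005 comment about registering the hash table first implies. The second callback reads a pointer that the first one sets, much as ProcArrayShmemInit() reads ProcGlobal->allProcs.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*demo_init_fn) (void *arg);

#define DEMO_MAX_CALLBACKS 8
static demo_init_fn demo_callbacks[DEMO_MAX_CALLBACKS];
static int	demo_ncallbacks = 0;

/* Append a callback; returns 0 if the table is full. */
static int
demo_register(demo_init_fn fn)
{
	if (demo_ncallbacks >= DEMO_MAX_CALLBACKS)
		return 0;
	demo_callbacks[demo_ncallbacks++] = fn;
	return 1;
}

/* Run the callbacks in exactly the order they were registered. */
static void
demo_init_registered(void)
{
	for (int i = 0; i < demo_ncallbacks; i++)
		demo_callbacks[i] (NULL);
}

/* Stand-ins for ProcGlobal->allProcs and procarray.c's private copy. */
static int	demo_proc_slots[4];
static int *demo_proc_global = NULL;
static int *demo_all_procs = NULL;

static void
demo_proc_global_init(void *arg)
{
	(void) arg;
	demo_proc_global = demo_proc_slots;
}

/* Depends on demo_proc_global_init() having run already. */
static void
demo_proc_array_init(void *arg)
{
	(void) arg;
	assert(demo_proc_global != NULL);
	demo_all_procs = demo_proc_global;
}

/* Registration order encodes the dependency between the two callbacks. */
static void
demo_register_all(void)
{
	demo_register(demo_proc_global_init);
	demo_register(demo_proc_array_init);
}
```

Swapping the two demo_register() calls would trip the assertion in demo_proc_array_init(), which is the kind of failure the ordering note in RegisterShmemStructs() guards against.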
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 3e95d4cfd16..84e6c90f4fa 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,35 +30,32 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+static void VarsupShmemInit(void *arg);
+
/* pointer to variables struct in shared memory */
TransamVariablesData *TransamVariables = NULL;
+static ShmemStructDesc TransamVariablesShmemDesc = {
+ .name = "TransamVariables",
+ .size = sizeof(TransamVariablesData),
+ .init_fn = VarsupShmemInit,
+ .ptr = (void **) &TransamVariables,
+};
/*
* Initialization of shared memory for TransamVariables.
*/
-Size
-VarsupShmemSize(void)
+void
+VarsupShmemRegister(void)
{
- return sizeof(TransamVariablesData);
+ ShmemRegisterStruct(&TransamVariablesShmemDesc);
}
-void
-VarsupShmemInit(void)
-{
- bool found;
+static void
+VarsupShmemInit(void *arg)
- /* Initialize our shared state struct */
- TransamVariables = ShmemInitStruct("TransamVariables",
- sizeof(TransamVariablesData),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- memset(TransamVariables, 0, sizeof(TransamVariablesData));
- }
- else
- Assert(found);
+{
+ memset(TransamVariables, 0, sizeof(TransamVariablesData));
}
/*
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 6a5b16392f7..52b4a1e017f 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -110,6 +110,15 @@ static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
static void *dsm_main_space_begin = NULL;
+static void dsm_main_space_init(void *);
+
+static ShmemStructDesc dsm_main_space_shmem_desc = {
+ .name = "Preallocated DSM",
+ .size = 0, /* calculated later */
+ .init_fn = dsm_main_space_init,
+ .ptr = &dsm_main_space_begin,
+};
+
/*
* List of dynamic shared memory segments used by this backend.
*
@@ -464,42 +473,36 @@ dsm_set_control_handle(dsm_handle h)
#endif
/*
- * Reserve some space in the main shared memory segment for DSM segments.
- */
-size_t
-dsm_estimate_size(void)
-{
- return 1024 * 1024 * (size_t) min_dynamic_shared_memory;
-}
-
-/*
- * Initialize space in the main shared memory segment for DSM segments.
+ * Reserve space in the main shared memory segment for DSM segments.
*/
void
-dsm_shmem_init(void)
+dsm_shmem_register(void)
{
- size_t size = dsm_estimate_size();
- bool found;
+ size_t size = 1024 * 1024 * (size_t) min_dynamic_shared_memory;
if (size == 0)
return;
- dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, &found);
- if (!found)
- {
- FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
- size_t first_page = 0;
- size_t pages;
-
- /* Reserve space for the FreePageManager. */
- while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
- ++first_page;
-
- /* Initialize it and give it all the rest of the space. */
- FreePageManagerInitialize(fpm, dsm_main_space_begin);
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ dsm_main_space_shmem_desc.size = size;
+ ShmemRegisterStruct(&dsm_main_space_shmem_desc);
+}
+
+static void
+dsm_main_space_init(void *arg)
+{
+ size_t size = dsm_main_space_shmem_desc.size;
+ FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
+ size_t first_page = 0;
+ size_t pages;
+
+ /* Reserve space for the FreePageManager. */
+ while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
+ ++first_page;
+
+ /* Initialize it and give it all the rest of the space. */
+ FreePageManagerInitialize(fpm, dsm_main_space_begin);
+ pages = (size / FPM_PAGE_SIZE) - first_page;
+ FreePageManagerPut(fpm, first_page, pages);
}
/*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 9bfcd616827..080e30648cc 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -57,6 +57,16 @@ typedef struct DSMRegistryCtxStruct
static DSMRegistryCtxStruct *DSMRegistryCtx;
+static void DSMRegistryCtxShmemInit(void *arg);
+
+static ShmemStructDesc DSMRegistryCtxShmemDesc = {
+ .name = "DSM Registry Data",
+ .size = sizeof(DSMRegistryCtxStruct),
+ .init_fn = DSMRegistryCtxShmemInit,
+ .ptr = (void **) &DSMRegistryCtx,
+};
+
+
typedef struct NamedDSMState
{
dsm_handle handle;
@@ -114,27 +124,17 @@ static const dshash_parameters dsh_params = {
static dsa_area *dsm_registry_dsa;
static dshash_table *dsm_registry_table;
-Size
-DSMRegistryShmemSize(void)
+void
+DSMRegistryShmemRegister(void)
{
- return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+ ShmemRegisterStruct(&DSMRegistryCtxShmemDesc);
}
-void
-DSMRegistryShmemInit(void)
+static void
+DSMRegistryCtxShmemInit(void *arg)
{
- bool found;
-
- DSMRegistryCtx = (DSMRegistryCtxStruct *)
- ShmemInitStruct("DSM Registry Data",
- DSMRegistryShmemSize(),
- &found);
-
- if (!found)
- {
- DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
- DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
- }
+ DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+ DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 405c69655f0..40ceeddae58 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -101,16 +101,11 @@ CalculateShmemSize(void)
size = 100000;
size = add_size(size, ShmemRegisteredSize());
- size = add_size(size, dsm_estimate_size());
- size = add_size(size, DSMRegistryShmemSize());
-
/* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
- size = add_size(size, ProcGlobalShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, VarsupShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
size = add_size(size, CLOGShmemSize());
@@ -120,11 +115,7 @@ CalculateShmemSize(void)
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
size = add_size(size, LWLockShmemSize());
- size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, SharedInvalShmemSize());
- size = add_size(size, PMSignalShmemSize());
- size = add_size(size, ProcSignalShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
size = add_size(size, ReplicationSlotsShmemSize());
@@ -138,7 +129,6 @@ CalculateShmemSize(void)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, StatsShmemSize());
- size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, InjectionPointShmemSize());
size = add_size(size, SlotSyncShmemSize());
size = add_size(size, AioShmemSize());
@@ -246,11 +236,28 @@ void
RegisterShmemStructs(void)
{
/*
- * TODO: Not used in any built-in subsystems yet. In the future, most of
- * the calls *ShmemInit() calls in CreateOrAttachShmemStructs(), and
- * *ShmemSize() calls in CalculateShmemSize() will be replaced by calls
- * into the subsystems from here.
+ * TODO: In the future, all the *ShmemInit() calls in
+ * CreateOrAttachShmemStructs(), and all the *ShmemSize() calls in
+ * CalculateShmemSize() will be replaced by calls into the subsystems from
+ * here.
+ */
+
+ /*
+ * Note: there are some inter-dependencies between these, so the order of
+ * some of these matters.
*/
+
+ DSMRegistryShmemRegister();
+ dsm_shmem_register();
+
+ ProcGlobalShmemRegister();
+ VarsupShmemRegister();
+ ProcArrayShmemRegister();
+ SharedInvalShmemRegister();
+ PMSignalShmemRegister();
+ ProcSignalShmemRegister();
+
+ WaitEventCustomShmemRegister();
}
/*
@@ -292,13 +299,9 @@ CreateOrAttachShmemStructs(void)
ShmemInitRegistered();
}
- dsm_shmem_init();
- DSMRegistryShmemInit();
-
/*
* Set up xlog, clog, and buffers
*/
- VarsupShmemInit();
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
@@ -321,23 +324,13 @@ CreateOrAttachShmemStructs(void)
/*
* Set up process table
*/
- if (!IsUnderPostmaster)
- InitProcGlobal();
- ProcArrayShmemInit();
BackendStatusShmemInit();
TwoPhaseShmemInit();
BackgroundWorkerShmemInit();
- /*
- * Set up shared-inval messaging
- */
- SharedInvalShmemInit();
-
/*
* Set up interprocess signaling mechanisms
*/
- PMSignalShmemInit();
- ProcSignalShmemInit();
CheckpointerShmemInit();
AutoVacuumShmemInit();
ReplicationSlotsShmemInit();
@@ -356,7 +349,6 @@ CreateOrAttachShmemStructs(void)
SyncScanShmemInit();
AsyncShmemInit();
StatsShmemInit();
- WaitEventCustomShmemInit();
InjectionPointShmemInit();
AioShmemInit();
WaitLSNShmemInit();
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 4618820b337..d7463df73ab 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -83,6 +83,15 @@ struct PMSignalData
/* PMSignalState pointer is valid in both postmaster and child processes */
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+static void PMSignalShmemInit(void *);
+
+static ShmemStructDesc PMSignalShmemDesc = {
+ .name = "PMSignalState",
+ .size = 0, /* calculated later */
+ .init_fn = PMSignalShmemInit,
+ .ptr = (void **) &PMSignalState,
+};
+
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
* postmaster. Postmaster keeps a local copy so that it doesn't need to
@@ -123,39 +132,29 @@ postmaster_death_handler(SIGNAL_ARGS)
static void MarkPostmasterChildInactive(int code, Datum arg);
/*
- * PMSignalShmemSize
- * Compute space needed for pmsignal.c's shared memory
+ * PMSignalShmemRegister - Register pmsignal.c's shared memory needs
*/
-Size
-PMSignalShmemSize(void)
+void
+PMSignalShmemRegister(void)
{
Size size;
- size = offsetof(PMSignalData, PMChildFlags);
- size = add_size(size, mul_size(MaxLivePostmasterChildren(),
- sizeof(sig_atomic_t)));
+ num_child_flags = MaxLivePostmasterChildren();
- return size;
+ size = offsetof(PMSignalData, PMChildFlags);
+ size = add_size(size, mul_size(num_child_flags, sizeof(sig_atomic_t)));
+ PMSignalShmemDesc.size = size;
+ ShmemRegisterStruct(&PMSignalShmemDesc);
}
-/*
- * PMSignalShmemInit - initialize during shared-memory creation
- */
-void
-PMSignalShmemInit(void)
+static void
+PMSignalShmemInit(void *arg)
{
- bool found;
-
- PMSignalState = (PMSignalData *)
- ShmemInitStruct("PMSignalState", PMSignalShmemSize(), &found);
-
- if (!found)
- {
- /* initialize all flags to zeroes */
- MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemSize());
- num_child_flags = MaxLivePostmasterChildren();
- PMSignalState->num_child_flags = num_child_flags;
- }
+ /* initialize all flags to zeroes */
+ Assert(PMSignalState);
+ MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemDesc.size);
+ Assert(num_child_flags > 0);
+ PMSignalState->num_child_flags = num_child_flags;
}
/*
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 0f913897acc..5daf33d5323 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -103,6 +103,19 @@ typedef struct ProcArrayStruct
int pgprocnos[FLEXIBLE_ARRAY_MEMBER];
} ProcArrayStruct;
+static void ProcArrayShmemInit(void *arg);
+static void ProcArrayShmemAttach(void *arg);
+
+static ProcArrayStruct *procArray;
+
+static ShmemStructDesc ProcArrayShmemDesc = {
+ .name = "Proc Array",
+ .size = 0, /* calculated later */
+ .init_fn = ProcArrayShmemInit,
+ .attach_fn = ProcArrayShmemAttach,
+ .ptr = (void **) &procArray,
+};
+
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -269,9 +282,6 @@ typedef enum KAXCompressReason
KAX_STARTUP_PROCESS_IDLE, /* startup process is about to sleep */
} KAXCompressReason;
-
-static ProcArrayStruct *procArray;
-
static PGPROC *allProcs;
/*
@@ -282,8 +292,25 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
+
static TransactionId *KnownAssignedXids;
+
+static ShmemStructDesc KnownAssignedXidsShmemDesc = {
+ .name = "KnownAssignedXids",
+ .size = 0, /* calculated later */
+ .init_fn = NULL, /* no initialization needed */
+ .ptr = (void **) &KnownAssignedXids,
+};
+
static bool *KnownAssignedXidsValid;
+
+static ShmemStructDesc KnownAssignedXidsValidShmemDesc = {
+ .name = "KnownAssignedXidsValid",
+ .size = 0, /* calculated later */
+ .init_fn = NULL, /* no initialization needed */
+ .ptr = (void **) &KnownAssignedXidsValid,
+};
+
static TransactionId latestObservedXid = InvalidTransactionId;
/*
@@ -374,18 +401,18 @@ static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
static void GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons);
/*
- * Report shared-memory space needed by ProcArrayShmemInit
+ * Register the shared PGPROC array during postmaster startup.
*/
-Size
-ProcArrayShmemSize(void)
+void
+ProcArrayShmemRegister(void)
{
- Size size;
-
- /* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, pgprocnos);
- size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
+ /* Register the ProcArray shared structure */
+ ProcArrayShmemDesc.size =
+ add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int), PROCARRAY_MAXPROCS));
+ ShmemRegisterStruct(&ProcArrayShmemDesc);
/*
* During Hot Standby processing we have a data structure called
@@ -405,64 +432,41 @@ ProcArrayShmemSize(void)
if (EnableHotStandby)
{
- size = add_size(size,
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS));
- size = add_size(size,
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS));
+ KnownAssignedXidsShmemDesc.size =
+ mul_size(sizeof(TransactionId),
+ TOTAL_MAX_CACHED_SUBXIDS);
+ ShmemRegisterStruct(&KnownAssignedXidsShmemDesc);
+
+ KnownAssignedXidsValidShmemDesc.size =
+ mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS);
+ ShmemRegisterStruct(&KnownAssignedXidsValidShmemDesc);
}
-
- return size;
}
/*
* Initialize the shared PGPROC array during postmaster startup.
*/
-void
-ProcArrayShmemInit(void)
+static void
+ProcArrayShmemInit(void *arg)
{
- bool found;
-
- /* Create or attach to the ProcArray shared structure */
- procArray = (ProcArrayStruct *)
- ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, pgprocnos),
- mul_size(sizeof(int),
- PROCARRAY_MAXPROCS)),
- &found);
-
- if (!found)
- {
- /*
- * We're the first - initialize.
- */
- procArray->numProcs = 0;
- procArray->maxProcs = PROCARRAY_MAXPROCS;
- procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
- procArray->numKnownAssignedXids = 0;
- procArray->tailKnownAssignedXids = 0;
- procArray->headKnownAssignedXids = 0;
- procArray->lastOverflowedXid = InvalidTransactionId;
- procArray->replication_slot_xmin = InvalidTransactionId;
- procArray->replication_slot_catalog_xmin = InvalidTransactionId;
- TransamVariables->xactCompletionCount = 1;
- }
+ procArray->numProcs = 0;
+ procArray->maxProcs = PROCARRAY_MAXPROCS;
+ procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
+ procArray->numKnownAssignedXids = 0;
+ procArray->tailKnownAssignedXids = 0;
+ procArray->headKnownAssignedXids = 0;
+ procArray->lastOverflowedXid = InvalidTransactionId;
+ procArray->replication_slot_xmin = InvalidTransactionId;
+ procArray->replication_slot_catalog_xmin = InvalidTransactionId;
+ TransamVariables->xactCompletionCount = 1;
allProcs = ProcGlobal->allProcs;
+}
- /* Create or attach to the KnownAssignedXids arrays too, if needed */
- if (EnableHotStandby)
- {
- KnownAssignedXids = (TransactionId *)
- ShmemInitStruct("KnownAssignedXids",
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- KnownAssignedXidsValid = (bool *)
- ShmemInitStruct("KnownAssignedXidsValid",
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- }
+static void
+ProcArrayShmemAttach(void *arg)
+{
+ allProcs = ProcGlobal->allProcs;
}
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7e017c8d53b..a0c94ddd77c 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -105,7 +105,18 @@ struct ProcSignalHeader
#define BARRIER_CLEAR_BIT(flags, type) \
((flags) &= ~(((uint32) 1) << (uint32) (type)))
+static void ProcSignalShmemInit(void *arg);
+
NON_EXEC_STATIC ProcSignalHeader *ProcSignal = NULL;
+
+static ShmemStructDesc ProcSignalShmemDesc = {
+ .name = "ProcSignal",
+ .size = 0, /* calculated later */
+ .init_fn = ProcSignalShmemInit,
+ .ptr = (void **) &ProcSignal,
+};
+
+
static ProcSignalSlot *MyProcSignalSlot = NULL;
static bool CheckProcSignal(ProcSignalReason reason);
@@ -113,51 +124,37 @@ static void CleanupProcSignalState(int status, Datum arg);
static void ResetProcSignalBarrierBits(uint32 flags);
/*
- * ProcSignalShmemSize
- * Compute space needed for ProcSignal's shared memory
+ * ProcSignalShmemRegister
+ * Register ProcSignal's shared memory needs at postmaster startup
*/
-Size
-ProcSignalShmemSize(void)
+void
+ProcSignalShmemRegister(void)
{
Size size;
size = mul_size(NumProcSignalSlots, sizeof(ProcSignalSlot));
size = add_size(size, offsetof(ProcSignalHeader, psh_slot));
- return size;
+
+ ProcSignalShmemDesc.size = size;
+ ShmemRegisterStruct(&ProcSignalShmemDesc);
}
-/*
- * ProcSignalShmemInit
- * Allocate and initialize ProcSignal's shared memory
- */
-void
-ProcSignalShmemInit(void)
+static void
+ProcSignalShmemInit(void *arg)
{
- Size size = ProcSignalShmemSize();
- bool found;
+ pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
- ProcSignal = (ProcSignalHeader *)
- ShmemInitStruct("ProcSignal", size, &found);
-
- /* If we're first, initialize. */
- if (!found)
+ for (int i = 0; i < NumProcSignalSlots; ++i)
{
- int i;
-
- pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
+ ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
- for (i = 0; i < NumProcSignalSlots; ++i)
- {
- ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
-
- SpinLockInit(&slot->pss_mutex);
- pg_atomic_init_u32(&slot->pss_pid, 0);
- slot->pss_cancel_key_len = 0;
- MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
- pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
- pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
- ConditionVariableInit(&slot->pss_barrierCV);
- }
+ SpinLockInit(&slot->pss_mutex);
+ pg_atomic_init_u32(&slot->pss_pid, 0);
+ slot->pss_cancel_key_len = 0;
+ MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
+ pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
+ pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
+ ConditionVariableInit(&slot->pss_barrierCV);
}
}
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index a7a7cc4f0a9..b9df3b84de9 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -203,8 +203,17 @@ typedef struct SISeg
*/
#define NumProcStateSlots (MaxBackends + NUM_AUXILIARY_PROCS)
+static void SharedInvalShmemInit(void *arg);
+
static SISeg *shmInvalBuffer; /* pointer to the shared inval buffer */
+static ShmemStructDesc SharedInvalShmemDesc = {
+ .name = "shmInvalBuffer",
+ .size = 0, /* calculated later */
+ .init_fn = SharedInvalShmemInit,
+ .ptr = (void **) &shmInvalBuffer,
+};
+
static LocalTransactionId nextLocalTransactionId;
@@ -212,10 +221,11 @@ static void CleanupInvalidationState(int status, Datum arg);
/*
- * SharedInvalShmemSize --- return shared-memory space needed
+ * SharedInvalShmemRegister
+ * Register shared memory needs for the SI message buffer
*/
-Size
-SharedInvalShmemSize(void)
+void
+SharedInvalShmemRegister(void)
{
Size size;
@@ -223,26 +233,16 @@ SharedInvalShmemSize(void)
size = add_size(size, mul_size(sizeof(ProcState), NumProcStateSlots)); /* procState */
size = add_size(size, mul_size(sizeof(int), NumProcStateSlots)); /* pgprocnos */
- return size;
+ SharedInvalShmemDesc.size = size;
+ ShmemRegisterStruct(&SharedInvalShmemDesc);
}
-/*
- * SharedInvalShmemInit
- * Create and initialize the SI message buffer
- */
-void
-SharedInvalShmemInit(void)
+static void
+SharedInvalShmemInit(void *arg)
{
int i;
- bool found;
-
- /* Allocate space in shared memory */
- shmInvalBuffer = (SISeg *)
- ShmemInitStruct("shmInvalBuffer", SharedInvalShmemSize(), &found);
- if (found)
- return;
- /* Clear message counters, save size of procState array, init spinlock */
+ /* Clear message counters, init spinlock */
shmInvalBuffer->minMsgNum = 0;
shmInvalBuffer->maxMsgNum = 0;
shmInvalBuffer->nextThreshold = CLEANUP_MIN;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 8f5ce0e2a8a..f880aba9be9 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -69,9 +69,41 @@ PGPROC *MyProc = NULL;
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
+static void *tmpAllProcs;
+static void *tmpFastPathLockArray;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
PGPROC *PreparedXactProcs = NULL;
+static void ProcGlobalShmemInit(void *arg);
+
+static ShmemStructDesc ProcGlobalShmemDesc = {
+ .name = "Proc Header",
+ .size = sizeof(PROC_HDR),
+ .init_fn = ProcGlobalShmemInit,
+
+ /*
+ * ProcGlobal is registered here in .ptr as usual, but it needs to be
+ * propagated specially in EXEC_BACKEND mode, because ProcGlobal needs to
+ * be accessed early at backend startup, before ShmemAttachRegistered()
+ * has been called.
+ */
+ .ptr = (void **) &ProcGlobal,
+};
+
+static ShmemStructDesc ProcGlobalAllProcsShmemDesc = {
+ .name = "PGPROC structures",
+ .size = 0, /* calculated later */
+ .ptr = (void **) &tmpAllProcs,
+};
+
+static ShmemStructDesc FastPathLockArrayShmemDesc = {
+ .name = "Fast-Path Lock Array",
+ .size = 0, /* calculated later */
+ .ptr = (void **) &tmpFastPathLockArray,
+};
+
+static uint32 TotalProcs;
+
/* Is a deadlock check pending? */
static volatile sig_atomic_t got_deadlock_timeout;
@@ -81,24 +113,6 @@ static void AuxiliaryProcKill(int code, Datum arg);
static DeadLockState CheckDeadLock(void);
-/*
- * Report shared-memory space needed by PGPROC.
- */
-static Size
-PGProcShmemSize(void)
-{
- Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
-
- size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
-
- return size;
-}
-
/*
* Report shared-memory space needed by Fast-Path locks.
*/
@@ -106,8 +120,6 @@ static Size
FastPathLockShmemSize(void)
{
Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
Size fpLockBitsSize,
fpRelIdSize;
@@ -123,25 +135,6 @@ FastPathLockShmemSize(void)
return size;
}
-/*
- * Report shared-memory space needed by InitProcGlobal.
- */
-Size
-ProcGlobalShmemSize(void)
-{
- Size size = 0;
-
- /* ProcGlobal */
- size = add_size(size, sizeof(PROC_HDR));
- size = add_size(size, sizeof(slock_t));
-
- size = add_size(size, PGSemaphoreShmemSize(ProcGlobalSemas()));
- size = add_size(size, PGProcShmemSize());
- size = add_size(size, FastPathLockShmemSize());
-
- return size;
-}
-
/*
* Report number of semaphores needed by InitProcGlobal.
*/
@@ -156,7 +149,50 @@ ProcGlobalSemas(void)
}
/*
- * InitProcGlobal -
+ * ProcGlobalShmemRegister -
+ * Register shared memory needs.
+ *
+ * This is called during postmaster or standalone backend startup, and also
+ * during backend startup in EXEC_BACKEND mode.
+ */
+void
+ProcGlobalShmemRegister(void)
+{
+ Size size = 0;
+
+ /*
+ * Reserve all the PGPROC structures we'll need. There are six separate
+ * consumers: (1) normal backends, (2) autovacuum workers and special
+ * workers, (3) background workers, (4) walsenders, (5) auxiliary
+ * processes, and (6) prepared transactions. (For largely-historical
+ * reasons, we combine autovacuum and special workers into one category
+ * with a single freelist.) Each PGPROC structure is dedicated to exactly
+ * one of these purposes, and they do not move between groups.
+ */
+ TotalProcs =
+ add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
+
+ size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
+
+ ProcGlobalAllProcsShmemDesc.size = size;
+ ShmemRegisterStruct(&ProcGlobalAllProcsShmemDesc);
+
+ FastPathLockArrayShmemDesc.size = FastPathLockShmemSize();
+ ShmemRegisterStruct(&FastPathLockArrayShmemDesc);
+
+ /*
+ * Register the ProcGlobal shared structure last. Its init callback
+ * initializes the others too.
+ */
+ ShmemRegisterStruct(&ProcGlobalShmemDesc);
+}
+
+
+/*
+ * ProcGlobalShmemInit -
* Initialize the global process table during postmaster or standalone
* backend startup.
*
@@ -175,36 +211,23 @@ ProcGlobalSemas(void)
* Another reason for creating semaphores here is that the semaphore
* implementation typically requires us to create semaphores in the
* postmaster, not in backends.
- *
- * Note: this is NOT called by individual backends under a postmaster,
- * not even in the EXEC_BACKEND case. The ProcGlobal and AuxiliaryProcs
- * pointers must be propagated specially for EXEC_BACKEND operation.
*/
-void
-InitProcGlobal(void)
+static void
+ProcGlobalShmemInit(void *arg)
{
+ char *ptr;
+ size_t requestSize;
PGPROC *procs;
int i,
j;
- bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
/* Used for setup of per-backend fast-path slots. */
char *fpPtr,
*fpEndPtr PG_USED_FOR_ASSERTS_ONLY;
Size fpLockBitsSize,
fpRelIdSize;
- Size requestSize;
- char *ptr;
-
- /* Create the ProcGlobal shared structure */
- ProcGlobal = (PROC_HDR *)
- ShmemInitStruct("Proc Header", sizeof(PROC_HDR), &found);
- Assert(!found);
- /*
- * Initialize the data structures.
- */
+ Assert(ProcGlobal);
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
SpinLockInit(&ProcGlobal->freeProcsLock);
dlist_init(&ProcGlobal->freeProcs);
@@ -217,23 +240,12 @@ InitProcGlobal(void)
pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PROC_NUMBER);
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
- /*
- * Create and initialize all the PGPROC structures we'll need. There are
- * six separate consumers: (1) normal backends, (2) autovacuum workers and
- * special workers, (3) background workers, (4) walsenders, (5) auxiliary
- * processes, and (6) prepared transactions. (For largely-historical
- * reasons, we combine autovacuum and special workers into one category
- * with a single freelist.) Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
- */
- requestSize = PGProcShmemSize();
-
- ptr = ShmemInitStruct("PGPROC structures",
- requestSize,
- &found);
-
+ Assert(tmpAllProcs);
+ ptr = tmpAllProcs;
+ requestSize = ProcGlobalAllProcsShmemDesc.size;
MemSet(ptr, 0, requestSize);
+ /* Carve out the allProcs array from the shared memory area */
procs = (PGPROC *) ptr;
ptr = ptr + TotalProcs * sizeof(PGPROC);
@@ -242,7 +254,7 @@ InitProcGlobal(void)
ProcGlobal->allProcCount = MaxBackends + NUM_AUXILIARY_PROCS;
/*
- * Allocate arrays mirroring PGPROC fields in a dense manner. See
+ * Carve out arrays mirroring PGPROC fields in a dense manner. See
* PROC_HDR.
*
* XXX: It might make sense to increase padding for these arrays, given
@@ -257,31 +269,25 @@ InitProcGlobal(void)
ProcGlobal->statusFlags = (uint8 *) ptr;
ptr = ptr + (TotalProcs * sizeof(*ProcGlobal->statusFlags));
- /* make sure wer didn't overflow */
+ /* make sure we didn't overflow */
Assert((ptr > (char *) procs) && (ptr <= (char *) procs + requestSize));
/*
- * Allocate arrays for fast-path locks. Those are variable-length, so
+ * Initialize arrays for fast-path locks. Those are variable-length, so
* can't be included in PGPROC directly. We allocate a separate piece of
* shared memory and then divide that between backends.
*/
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- requestSize = FastPathLockShmemSize();
-
- fpPtr = ShmemInitStruct("Fast-Path Lock Array",
- requestSize,
- &found);
-
- MemSet(fpPtr, 0, requestSize);
+ Assert(tmpFastPathLockArray);
+ fpPtr = tmpFastPathLockArray;
+ requestSize = FastPathLockArrayShmemDesc.size;
+ memset(fpPtr, 0, requestSize);
/* For asserts checking we did not overflow. */
fpEndPtr = fpPtr + requestSize;
- /* Reserve space for semaphores. */
- PGReserveSemaphores(ProcGlobalSemas());
-
for (i = 0; i < TotalProcs; i++)
{
PGPROC *proc = &procs[i];
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index aca2c8fc742..93e87a68fdf 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -79,6 +79,30 @@ typedef struct WaitEventCustomEntryByName
uint32 wait_event_info;
} WaitEventCustomEntryByName;
+static ShmemHashDesc WaitEventCustomHashByInfoDesc =
+{
+ .name = "WaitEventCustom hash by wait event information",
+ .ptr = &WaitEventCustomHashByInfo,
+
+ .init_size = WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
+ .max_size = WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
+ .hash_info.keysize = sizeof(uint32),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByInfo),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+};
+
+static ShmemHashDesc WaitEventCustomHashByNameDesc =
+{
+ .name = "WaitEventCustom hash by name",
+ .ptr = &WaitEventCustomHashByName,
+
+ .init_size = WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
+ .max_size = WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
+ /* key is a NULL-terminated string */
+ .hash_info.keysize = sizeof(char[NAMEDATALEN]),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByName),
+ .hash_flags = HASH_ELEM | HASH_STRINGS,
+};
/* dynamic allocation counter for custom wait events */
typedef struct WaitEventCustomCounterData
@@ -90,6 +114,16 @@ typedef struct WaitEventCustomCounterData
/* pointer to the shared memory */
static WaitEventCustomCounterData *WaitEventCustomCounter;
+static void WaitEventCustomCounterDataShmemInit(void *arg);
+
+static ShmemStructDesc WaitEventCustomCounterShmemDesc =
+{
+ .name = "WaitEventCustomCounterData",
+ .size = sizeof(WaitEventCustomCounterData),
+ .init_fn = WaitEventCustomCounterDataShmemInit,
+ .ptr = (void **) &WaitEventCustomCounter,
+};
+
/* first event ID of custom wait events */
#define WAIT_EVENT_CUSTOM_INITIAL_ID 1
@@ -97,60 +131,22 @@ static uint32 WaitEventCustomNew(uint32 classId, const char *wait_event_name);
static const char *GetWaitEventCustomIdentifier(uint32 wait_event_info);
/*
- * Return the space for dynamic shared hash tables and dynamic allocation counter.
+ * Register shmem space for dynamic shared hash and dynamic allocation counter.
*/
-Size
-WaitEventCustomShmemSize(void)
+void
+WaitEventCustomShmemRegister(void)
{
- Size sz;
-
- sz = MAXALIGN(sizeof(WaitEventCustomCounterData));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- sizeof(WaitEventCustomEntryByInfo)));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- sizeof(WaitEventCustomEntryByName)));
- return sz;
+ ShmemRegisterStruct(&WaitEventCustomCounterShmemDesc);
+ ShmemRegisterHash(&WaitEventCustomHashByInfoDesc);
+ ShmemRegisterHash(&WaitEventCustomHashByNameDesc);
}
-/*
- * Allocate shmem space for dynamic shared hash and dynamic allocation counter.
- */
-void
-WaitEventCustomShmemInit(void)
+static void
+WaitEventCustomCounterDataShmemInit(void *arg)
{
- bool found;
- HASHCTL info;
-
- WaitEventCustomCounter = (WaitEventCustomCounterData *)
- ShmemInitStruct("WaitEventCustomCounterData",
- sizeof(WaitEventCustomCounterData), &found);
-
- if (!found)
- {
- /* initialize the allocation counter and its spinlock. */
- WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
- SpinLockInit(&WaitEventCustomCounter->mutex);
- }
-
- /* initialize or attach the hash tables to store custom wait events */
- info.keysize = sizeof(uint32);
- info.entrysize = sizeof(WaitEventCustomEntryByInfo);
- WaitEventCustomHashByInfo =
- ShmemInitHash("WaitEventCustom hash by wait event information",
- WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
- WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- /* key is a NULL-terminated string */
- info.keysize = sizeof(char[NAMEDATALEN]);
- info.entrysize = sizeof(WaitEventCustomEntryByName);
- WaitEventCustomHashByName =
- ShmemInitHash("WaitEventCustom hash by name",
- WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
- WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- &info,
- HASH_ELEM | HASH_STRINGS);
+ /* initialize the allocation counter and its spinlock. */
+ WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
+ SpinLockInit(&WaitEventCustomCounter->mutex);
}
/*
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6fa91bfcdc0..2dfc8b0f85f 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -345,8 +345,7 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
-extern Size VarsupShmemSize(void);
-extern void VarsupShmemInit(void);
+extern void VarsupShmemRegister(void);
extern FullTransactionId GetNewTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
diff --git a/src/include/storage/dsm.h b/src/include/storage/dsm.h
index 407657df3ff..061bf725f88 100644
--- a/src/include/storage/dsm.h
+++ b/src/include/storage/dsm.h
@@ -26,8 +26,7 @@ extern void dsm_postmaster_startup(PGShmemHeader *);
extern void dsm_backend_shutdown(void);
extern void dsm_detach_all(void);
-extern size_t dsm_estimate_size(void);
-extern void dsm_shmem_init(void);
+extern void dsm_shmem_register(void);
#ifdef EXEC_BACKEND
extern void dsm_set_control_handle(dsm_handle h);
diff --git a/src/include/storage/dsm_registry.h b/src/include/storage/dsm_registry.h
index 506fae2c9ca..9a1b4d982af 100644
--- a/src/include/storage/dsm_registry.h
+++ b/src/include/storage/dsm_registry.h
@@ -22,7 +22,6 @@ extern dsa_area *GetNamedDSA(const char *name, bool *found);
extern dshash_table *GetNamedDSHash(const char *name,
const dshash_parameters *params,
bool *found);
-extern Size DSMRegistryShmemSize(void);
-extern void DSMRegistryShmemInit(void);
+extern void DSMRegistryShmemRegister(void);
#endif /* DSM_REGISTRY_H */
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index 206fb78f8a5..7cdc4852334 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -66,8 +66,7 @@ extern PGDLLIMPORT volatile PMSignalData *PMSignalState;
/*
* prototypes for functions in pmsignal.c
*/
-extern Size PMSignalShmemSize(void);
-extern void PMSignalShmemInit(void);
+extern void PMSignalShmemRegister(void);
extern void SendPostmasterSignal(PMSignalReason reason);
extern bool CheckPostmasterSignal(PMSignalReason reason);
extern void SetQuitSignalReason(QuitSignalReason reason);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index bf3094f0f7d..d7a4c57f74c 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -549,7 +549,7 @@ extern PGDLLIMPORT PGPROC *AuxiliaryProcs;
* Function Prototypes
*/
extern int ProcGlobalSemas(void);
-extern Size ProcGlobalShmemSize(void);
+extern void ProcGlobalShmemRegister(void);
extern void InitProcGlobal(void);
extern void InitProcess(void);
extern void InitProcessPhase2(void);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index c5ab1574fe3..572516c4e21 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -20,8 +20,7 @@
#include "utils/snapshot.h"
-extern Size ProcArrayShmemSize(void);
-extern void ProcArrayShmemInit(void);
+extern void ProcArrayShmemRegister(void);
extern void ProcArrayAdd(PGPROC *proc);
extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 348fba53a93..d2344b1cbb3 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -63,8 +63,7 @@ typedef enum
/*
* prototypes for functions in procsignal.c
*/
-extern Size ProcSignalShmemSize(void);
-extern void ProcSignalShmemInit(void);
+extern void ProcSignalShmemRegister(void);
extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
diff --git a/src/include/storage/sinvaladt.h b/src/include/storage/sinvaladt.h
index a1694500a85..4edba2936e6 100644
--- a/src/include/storage/sinvaladt.h
+++ b/src/include/storage/sinvaladt.h
@@ -28,8 +28,7 @@
/*
* prototypes for functions in sinvaladt.c
*/
-extern Size SharedInvalShmemSize(void);
-extern void SharedInvalShmemInit(void);
+extern void SharedInvalShmemRegister(void);
extern void SharedInvalBackendInit(bool sendOnly);
extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 34c27cc3dc3..86fc8637f5e 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -42,8 +42,7 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
extern uint32 WaitEventExtensionNew(const char *wait_event_name);
extern uint32 WaitEventInjectionPointNew(const char *wait_event_name);
-extern void WaitEventCustomShmemInit(void);
-extern Size WaitEventCustomShmemSize(void);
+extern void WaitEventCustomShmemRegister(void);
extern char **GetWaitEventCustomNames(uint32 classId, int *nwaitevents);
/* ----------
--
2.47.3
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-19 13:36 Robert Haas <[email protected]>
parent: Heikki Linnakangas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Robert Haas @ 2026-03-19 13:36 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Thu, Mar 19, 2026 at 6:31 AM Heikki Linnakangas <[email protected]> wrote:
> We could add more callbacks that get called at different times. For
> example the callback that would get called before shared memory is
> allocated, which could adjust the size according to MaxBackends. That
> would fully replace shmem_request_hook. Or a callback that would get
> called later in the startup sequence, if we wanted to e.g. load the
> pg_stat_statements file later in startup.
>
> This would be a natural place for other resources in future too. We
> could support declaring "named lwlock tranches" here to replace
> RequestNamedLWLockTranche() for example, although I think it's still
> better to encourage embedding the LWLock in the struct instead.
>
> _PG_init in pg_stat_statements still does a lot more than register that
> struct. It declares the GUCs and installs other hooks for example. We
> could perhaps move those to the subsystem descriptor too, although I'm
> not sure if that's worth the code churn.
Without taking a strong position on this particular design idea, I
kind of wonder if we should be going the opposite direction: instead
of bundling more and more things into descriptors, try to let people
write declarative code that does whatever it is that they want to do.
I think descriptors are pretty limiting as a concept, because they can
only do the exact things that they're designed to do. For instance, I
find the change to use LWLockPadded rather than LWLock * in
pgssSharedState to be a clear improvement, because I'd rather have
fewer objects and less pointer-chasing. Now, that LWLockPadded is
going to need to be initialized. I would rather do that by writing
LWLockInitialize(&pgss->lock.lock, tranche_id) as you did than by
adding something to a descriptor, since doing the latter is almost
certainly going to be a less intuitive syntax for the same thing and
I'm going to have to spend time verifying that whatever locution I'm
compelled to write actually does what I want. And if somebody adds
"really light weight locks" to the system then we'll have to add
RLWLockInitialize() to the things that the descriptor system knows how
to do, and if for some reason I want to do
LWLockInitialize(&mythingy->lock.lock, some_random_condition ?
this_tranche_id : that_tranche_id), the descriptor system will
probably become an annoying straitjacket. Or if that example isn't
compelling enough, imagine that I have an array of structs each of
which contains an LWLockPadded and I need to go loop over the array
and do all the initializations. Or maybe space is at a premium and I
want to use LWLock rather than LWLockPadded. Or maybe something else.
Code is just more flexible than having to go through descriptors,
which is why a lot of modern languages go to a great deal of trouble
to make closures a first-class concept.
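To make the flexibility argument concrete, here is a minimal sketch of an init callback that loops over an array of structs and picks a tranche conditionally. Everything in it is a hypothetical stand-in: MockLock and mock_lock_initialize() only imitate the shape of LWLock and LWLockInitialize(), and the tranche IDs and struct names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an LWLock; NOT the PostgreSQL type. */
typedef struct MockLock
{
    int         tranche_id;
    int         initialized;
} MockLock;

/* Hypothetical stand-in with the same shape as LWLockInitialize(). */
static void
mock_lock_initialize(MockLock *lock, int tranche_id)
{
    lock->tranche_id = tranche_id;
    lock->initialized = 1;
}

#define N_PARTITIONS 4
#define TRANCHE_HOT  100        /* made-up tranche IDs */
#define TRANCHE_COLD 101

typedef struct Partition
{
    MockLock    lock;
    long        counter;
} Partition;

typedef struct MyThingy
{
    Partition   parts[N_PARTITIONS];
} MyThingy;

/*
 * Init callback written as ordinary code: zero the struct, then loop over
 * the array and choose a tranche per element.  A purely declarative
 * descriptor would need a dedicated field for each of these decisions.
 */
static void
my_thingy_init(void *arg)
{
    MyThingy   *t = (MyThingy *) arg;

    memset(t, 0, sizeof(*t));
    for (int i = 0; i < N_PARTITIONS; i++)
        mock_lock_initialize(&t->parts[i].lock,
                             (i == 0) ? TRANCHE_HOT : TRANCHE_COLD);
}
```

The only thing the framework has to know about is the callback pointer; the loop and the conditional stay in the subsystem's own code.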
Let me take a step back and say what I think the problems in this area
are, to see whether we agree on the basics. I suspect the reason
you've undertaken this project is the fact that, currently, requesting
shared memory and allocating it are totally decoupled. The number of
bytes you request and the number of bytes you actually allocated could
be totally different, and then some completely unrelated subsystem can
break because it's allocating last, and even though it requested the
bytes it wanted to allocate, some other subsystem under-requested and
now the bytes this subsystem wants are not actually available.
Tracking this kind of thing down can be a giant pain in the rear end,
bordering on impossible IME. Also, if we want to be able to resize
stuff in shared memory in some happy future, the need for precise
tracking of these sorts of things presumably goes way up, although the
exact details of that are not altogether clear to me. Furthermore, as
you point out, even if everyone behaves themselves and requests and
allocates the same number of bytes, that's still annoying if it means
redoing some computation.
I think the answer to this problem is to make requests into named
objects. You're not allowed to request a number of BYTES of shared
memory any more; you have to request "the shared memory bytes for the
object named XYZ". So instead of
RequestAddinShmemSpace(pgss_memsize()), you would say something like
RequestAddinShmemSpace("pg_stat_statements", pgss_memsize()) and then
later instead of saying pgss = ShmemInitStruct("pg_stat_statements",
sizeof(pgssSharedState), &found), you say pgss =
ShmemInitStruct("pg_stat_statements", &size, &found).
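A toy model of the named-request idea, assuming a fixed-size static buffer in place of the real shared memory segment. The function names request_named_space() and init_named_struct() are hypothetical and do not exist in PostgreSQL; the point is only that the size is recorded once under a name and handed back at allocation time, so the two sides cannot disagree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_REQUESTS 8
#define SEGMENT_SIZE 4096

typedef struct NamedRequest
{
    const char *name;
    size_t      size;
    void       *addr;           /* NULL until first allocated */
} NamedRequest;

static NamedRequest requests[MAX_REQUESTS];
static int  nrequests = 0;

/* Mock "segment"; the union only forces a generous alignment. */
static union
{
    long double align_ld;
    long long   align_ll;
    unsigned char bytes[SEGMENT_SIZE];
} segment;
static size_t segment_used = 0;

/* Record a request by name.  Returns 0 on success, -1 on duplicate/full. */
static int
request_named_space(const char *name, size_t size)
{
    if (nrequests >= MAX_REQUESTS)
        return -1;
    for (int i = 0; i < nrequests; i++)
        if (strcmp(requests[i].name, name) == 0)
            return -1;
    requests[nrequests].name = name;
    requests[nrequests].size = size;
    requests[nrequests].addr = NULL;
    nrequests++;
    return 0;
}

/*
 * Allocate or attach by name.  The size is an output: the caller learns
 * what was requested instead of passing a second, possibly different
 * number.  Returns NULL if the name was never requested or space ran out.
 */
static void *
init_named_struct(const char *name, size_t *size, int *found)
{
    for (int i = 0; i < nrequests; i++)
    {
        NamedRequest *r = &requests[i];

        if (strcmp(r->name, name) != 0)
            continue;
        *size = r->size;
        *found = (r->addr != NULL);
        if (r->addr == NULL)
        {
            size_t      aligned = (r->size + 15) & ~(size_t) 15;

            if (aligned > SEGMENT_SIZE - segment_used)
                return NULL;
            r->addr = segment.bytes + segment_used;
            segment_used += aligned;
        }
        return r->addr;
    }
    return NULL;                /* never requested under this name */
}
```

In this model an under-request can only hurt the subsystem that made it, because no caller can take more bytes than were recorded under its own name.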
The other big problem that I think we have in this area is that it's
unclear what you're allowed to do in _PG_init() vs. some other
callback, and sometimes you need IsUnderPostmaster checks or
IsPostmasterEnvironment checks or
process_shared_preload_libraries_in_progress checks. From my point of
view, good goals would include (1) moving as much logic as possible
into _PG_init() vs. having to put it elsewhere and (2) removing as
many conditional checks as possible from it and aiming for _PG_init()
functions that just run from start to finish in all cases.
What _PG_init() already does pretty well is allow you to do
per-backend setup. For instance, pg_plan_advice needs no shared
resources, so _PG_init() was easy to write and, IMHO, easy to read.
It's requesting diverse types of resources -- GUCs, an EXPLAIN
extension ID, an EXPLAIN option, and hooks, but it can just do all of
those things one after another with no conditional logic and, IMHO,
life is great. We fall down a little bit because of the fact that
PGC_POSTMASTER GUCs can't be added after startup -- see autoprewarm.c,
for instance, which calls out that problem implicitly; and I suspect
that issue is also why pg_stat_statements has the
process_shared_preload_libraries_in_progress check at the top, because
it looks to me like everything else that the function does would be
completely fine to do later. So maybe we could adjust
DefineCustomBLAHVariable to do nothing if there's a PGC_POSTMASTER
variable requested and it's too late to create one, instead of blowing
up. Or create the variable but attach some property to it that causes
it to generate an error when set, e.g. ERROR: pg_stat_statements.max
cannot be changed now because the library that created it was not
included in shared_preload_libraries at startup time (wordsmithing
likely needed).
Shared resources do require some split-up of initialization: as you
point out, if _PG_init() is called before we know MaxBackends, then we
can't even size data structures whose size depends on that quantity yet,
and we certainly can't initialize anything, because shared memory
might not have been created yet. I don't think we can completely avoid
the need for callbacks here, but... just spitballing, how about
something like this:
extern void DefineShmemRegion(char *name, size_t size, void **localptr,
                              void (*init_callback)(void *), int flags);
extern void DefineShmemRegionDynamic(char *name,
                                     size_t (*sizing_callback)(void *),
                                     void **localptr,
                                     void (*init_callback)(void *), int flags);
extern void *GetShmemRegion(char *name);

#define DSR_FLAGS_NO_SLOP 0x01
#define DSR_FLAGS_DSA_OK 0x02
If DefineShmemRegion() or DefineShmemRegionDynamic() is called at
shared_preload_libraries time, it arranges to increase the size of the
main shared memory segment by the given amount, or the computed amount
(for things that depend on MaxBackends). Then, once the main shared
memory segment is created, it invokes the init_callback and sets
*localptr. If either of these functions is called after the main
shared memory segment has been created, they check for an existing
allocation, and if one is found, they just set *localptr. (They can
actually probably exit quickly if *localptr is already set.)
Otherwise, they try to allocate from DSA if DSR_FLAGS_DSA_OK is given,
or else from the slop space unless DSR_FLAGS_NO_SLOP is given. If that
works, they then call the init_callback under a suitable lock and set
*localptr. If not, they fail silently. Functions that need to use the
local pointers do something like this:
if (unlikely(pgss == NULL))
    pgss = GetShmemRegion("pg_stat_statements");
...which throws a suitable error -- not just a generic one that the
region doesn't exist, but something that's sensitive to different
failure conditions: the region was never registered, the region was
registered after shared_preload_libraries time and not enough slop
space remains, or whatever. GetShmemRegion() could even retry the
initialization in certain cases, e.g. if DSA is OK and we previously
were called too early in startup to try DSA, we can try now, or if DSA
allocation failed due to OOM, we can try again.
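The define-then-get pattern described above can be modeled outside PostgreSQL as a toy registry. What follows is only a sketch in standalone C, not backend code: calloc() stands in for shared memory, every toy_ name and the ToyStatus enum are hypothetical, the flags argument is dropped, and the DSA and slop-space paths are not modeled. It illustrates two properties of the idea: definition fails silently, and the on-demand lookup retries and reports a specific reason.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_REGIONS 8

/* Distinct outcomes, so the caller can report a specific error. */
typedef enum
{
    TOY_OK,
    TOY_NEVER_REGISTERED,
    TOY_NO_SPACE
} ToyStatus;

typedef struct
{
    const char *name;
    size_t      size;
    void      (*init_callback) (void *);
    void       *addr;           /* NULL until allocation succeeds */
} ToyRegion;

static ToyRegion toy_regions[TOY_MAX_REGIONS];
static int  toy_nregions = 0;
static size_t toy_space_left = 1024;    /* pretend segment budget, in bytes */

static ToyRegion *
toy_find(const char *name)
{
    for (int i = 0; i < toy_nregions; i++)
        if (strcmp(toy_regions[i].name, name) == 0)
            return &toy_regions[i];
    return NULL;
}

/* Allocate and initialize once; on failure leave addr NULL and say nothing. */
static void
toy_try_allocate(ToyRegion *r)
{
    if (r->addr != NULL || r->size > toy_space_left)
        return;
    r->addr = calloc(1, r->size);
    if (r->addr == NULL)
        return;
    toy_space_left -= r->size;
    if (r->init_callback != NULL)
        r->init_callback(r->addr);
}

/* Register a region (if new), try to allocate it, and set *localptr. */
void
toy_define_region(const char *name, size_t size, void **localptr,
                  void (*init_callback) (void *))
{
    ToyRegion  *r = toy_find(name);

    if (r == NULL)
    {
        assert(toy_nregions < TOY_MAX_REGIONS);
        r = &toy_regions[toy_nregions++];
        r->name = name;
        r->size = size;
        r->init_callback = init_callback;
        r->addr = NULL;
    }
    toy_try_allocate(r);
    *localptr = r->addr;        /* stays NULL if allocation failed */
}

/* Look a region up on demand, retrying allocation; report why it failed. */
void *
toy_get_region(const char *name, ToyStatus *status)
{
    ToyRegion  *r = toy_find(name);

    if (r == NULL)
    {
        *status = TOY_NEVER_REGISTERED;
        return NULL;
    }
    toy_try_allocate(r);
    *status = (r->addr != NULL) ? TOY_OK : TOY_NO_SPACE;
    return r->addr;
}

/* Example init callback: store a marker value in the new region. */
void
toy_init_marker(void *addr)
{
    *(int *) addr = 42;
}
```

A caller would invoke toy_define_region() once at load time and fall back to toy_get_region() only when its local pointer is still NULL, turning the returned status into an error message.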
I *think* this design gets rid of all the IsUnderPostmaster and
shared_preload_libraries_in_progress checks in individual subsystems,
and all the use of shmem startup hooks. You just ask for what you want
and if there's a way to get it, the system gives it to you, and if
there's not, it generates an error at the latest possible time, and
also tries to self-heal if that's reasonable. If you do load your
module in shared_preload_libraries, then by the time main shared
memory initialization completes in the postmaster, everyone's localptr
values (like pgss) will be initialized, but if it happens to be an
EXEC_BACKEND build, those same calls will also happen in every
postmaster child, and will automatically re-find the shared memory
areas and reinitialize all the pointers. If you load your module
later, your localptr values will hopefully be initialized by the end
of _PG_init(), but if that doesn't work out, then the
unlikely-protected calls to GetShmemRegion will produce suitable
errors at a suitable time. And I think it all works out nicely in a
standalone backend, too.
This is all kind of a brain dump and is not fully thought-through and
might be riddled with cognitive errors, but what I'm sort of trying to
convey is where I think the complexity in the current system comes
from (which is that we require every subsystem/extension author to
know how the sausage is made, and we don't enforce consistency between
requests and allocations) and what I don't really like about
descriptors as a solution (which is that they are harder to read than
imperative code and can interfere with cases where somebody wants to
do something slightly different than what the descriptor-designer had
in mind). I hope that some of it is helpful to you...
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-19 14:34 Heikki Linnakangas <[email protected]>
parent: Robert Haas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Heikki Linnakangas @ 2026-03-19 14:34 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 19/03/2026 15:36, Robert Haas wrote:
> Without taking a strong position on this particular design idea, I
> kind of wonder if we should be going the opposite direction: instead
> of bundling more and more things into descriptors, try to let people
> write imperative code that does whatever it is that they want to do.
> I think descriptors are pretty limiting as a concept, because they can
> only do the exact things that they're designed to do. For instance, I
> find the change to use LWLockPadded rather than LWLock * in
> pgssSharedState to be a clear improvement, because I'd rather have
> fewer objects and less pointer-chasing. Now, that LWLockPadded is
> going to need to be initialized. I would rather do that by writing
> LWLockInitialize(&pgss->lock.lock, tranche_id) as you did than by
> adding something to a descriptor, since doing the latter is almost
> certainly going to be a less intuitive syntax for the same thing and
> I'm going to have to spend time verifying that whatever locution I'm
> compelled to write actually does what I want. And if somebody adds
> "really light weight locks" to the system then we'll have to add
> RLWLockInitialize() to the things that the descriptor system knows how
> to do, and if for some reason I want to do
> LWLockInitialize(&mythingy->lock.lock, some_random_condition ?
> this_tranche_id : that_tranche_id), the descriptor system will
> probably become an annoying straightjacket. Or if that example isn't
> compelling enough, imagine that I have an array of structs each of
> which contains an LWLockPadded and I need to go loop over the array
> and do all the initializations. Or maybe space is at a premium and I
> want to use LWLock rather than LWLockPadded. Or maybe something else.
> Code is just more flexible than having to go through descriptors,
> which is why a lot of modern languages go to a great deal of trouble
> to make closures a first-class concept.
Sure, descriptors are restricting. I'm not proposing to move everything
into descriptors.
> Let me take a step back and say what I think the problems in this area
> are, to see whether we agree on the basics. I suspect the reason
> you've undertaken this project is the fact that, currently, requesting
> shared memory and allocating it are totally decoupled. The number of
> bytes you request and the number of bytes you actually allocated could
> be totally different, and then some completely unrelated subsystem can
> break because it's allocating last, and even though it requested the
> bytes it wanted to allocate, some other subsystem under-requested and
> now the bytes this subsystem wants are not actually available.
> Tracking this kind of thing down can be a giant pain in the rear end,
> bordering on impossible IME. Also, if we want to be able to resize
> stuff in shared memory in some happy future, the need for precise
> tracking of these sorts of things presumably goes way up, although the
> exact details of that are not altogether clear to me. Furthermore, as
> you point out, even if everyone behaves themselves and requests and
> allocates the same number of bytes, that's still annoying if it means
> redoing some computation.
Agreed, yes, that's what I'm trying to address.
> I think the answer to this problem is to make requests into named
> objects. You're not allowed to request a number of BYTES of shared
> memory any more; you have to request "the shared memory bytes for the
> object named XYZ". So instead of
> RequestAddinShmemSpace(pgss_memsize()), you would say something like
> RequestAddinShmemSpace("pg_stat_statements", pgss_memsize()) and then
> later instead of saying pgss = ShmemInitStruct("pg_stat_statements",
> sizeof(pgssSharedState), &found), you say pgss =
> ShmemInitStruct("pg_stat_statements", &size, &found).
Yeah, that's a little better than what we have today.
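To make the quoted "named request" idea concrete, here is a small standalone model in C. It is only a sketch: the toy_ names are hypothetical and nothing here is the real RequestAddinShmemSpace() or ShmemInitStruct(). The point it tries to show is that once a request is keyed by name, the allocation step can read the size back from the request instead of recomputing it, so the two cannot drift apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_REQUESTS 16

typedef struct
{
    const char *name;
    size_t      size;
} ToyRequest;

static ToyRequest toy_requests[TOY_MAX_REQUESTS];
static int  toy_nrequests = 0;

/* Record "the bytes for the object named NAME"; reject duplicates. */
bool
toy_request_space(const char *name, size_t size)
{
    assert(name != NULL);
    if (toy_nrequests >= TOY_MAX_REQUESTS)
        return false;
    for (int i = 0; i < toy_nrequests; i++)
        if (strcmp(toy_requests[i].name, name) == 0)
            return false;
    toy_requests[toy_nrequests].name = name;
    toy_requests[toy_nrequests].size = size;
    toy_nrequests++;
    return true;
}

/* At allocation time the size is returned from the request, not recomputed. */
bool
toy_lookup_request(const char *name, size_t *size)
{
    assert(name != NULL && size != NULL);
    for (int i = 0; i < toy_nrequests; i++)
    {
        if (strcmp(toy_requests[i].name, name) == 0)
        {
            *size = toy_requests[i].size;
            return true;
        }
    }
    return false;               /* allocating something never requested */
}
```

An allocation for a name that was never requested shows up as a lookup failure right at the call site, rather than as an unrelated subsystem running out of space later.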
> The other big problem that I think we have in this area is that it's
> unclear what you're allowed to do in _PG_init() vs. some other
> callback, and sometimes you need IsUnderPostmaster checks or
> IsPostmasterEnvironment checks or
> process_shared_preload_libraries_in_progress checks. From my point of
> view, good goals would include (1) moving as much logic as possible
> into _PG_init() vs. having to put it elsewhere and (2) removing as
> many conditional checks as possible from it and aiming for _PG_init()
> functions that just run from start to finish in all cases.
Agreed.
> What _PG_init() already does pretty well is allow you to do
> per-backend setup. For instance, pg_plan_advice needs no shared
> resources, so _PG_init() was easy to write and, IMHO, easy to read.
> It's requesting diverse types of resources -- GUCs, an EXPLAIN
> extension ID, an EXPLAIN option, and hooks, but it can just do all of
> those things one after another with no conditional logic and, IMHO,
> life is great. We fall down a little bit because of the fact that
> PGC_POSTMASTER GUCs can't be added after startup -- see autoprewarm.c,
> for instance, which calls out that problem implicitly; and ...
Agreed.
> I suspect
> that issue is also why pg_stat_statements has the
> process_shared_preload_libraries_in_progress check at the top, because
> it looks to me like everything else that the function does would be
> completely fine to do later.
I think a bigger problem is loading and saving the statistics file. The
file needs to be saved on postmaster exit, where do you do that if the
library was not in shared_preload_libraries?
> Shared resources do require some split-up of initialization: as you
> point out, if _PG_init() is called before we know MaxBackends, then we
> can't even size data structures whose size depends on that quantity yet,
> and we certainly can't initialize anything, because shared memory
> might not have been created yet. I don't think we can completely avoid
> the need for callbacks here, but... just spitballing, how about
> something like this:
>
> extern void DefineShmemRegion(char *name, size_t size, void
> **localptr, void (*init_callback)(void *), int flags);
> extern void DefineShmemRegionDynamic(char *name, size_t
> (*sizing_callback)(void *), void **localptr, void
> (*init_callback)(void *), int flags);
> extern void *GetShmemRegion(char *name);
>
> #define DSR_FLAGS_NO_SLOP 0x01
> #define DSR_FLAGS_DSA_OK 0x02
>
> If DefineShmemRegion() or DefineShmemRegionDynamic() is called at
> shared_preload_libraries time, it arranges to increase the size of the
> main shared memory segment by the given amount, or the computed amount
> (for things that depend on MaxBackends). Then, once the main shared
> memory segment is created, it invokes the init_callback and sets
> *localptr. If either of these functions is called after the main
> shared memory segment has been created, they check for an existing
> allocation, and if one is found, they just set *localptr. (They can
> actually probably exit quickly if *localptr is already set.)
If you squint a little, this is pretty much the same as my descriptor
design. Let's start from your DefineShmemRegion function, but in order
to have some flexibility to add optional arguments in the future, without
having to create DefineShmemRegionNew(), DefineShmemRegionExt() or
similar functions, let's pass the arguments in a struct. So instead of:
DefineShmemRegion("pg_stat_statements", sizeof(pgssSharedState), &pgss,
                  pgss_shmem_init, 0);
you would call it like this:
DefineShmemRegion(&(ShmemStructDesc) {
    .name = "pg_stat_statements",
    .size = sizeof(pgssSharedState),
    .ptr = (void **) &pgss,
    .init_fn = pgss_shmem_init,
    .flags = 0,
});
This flexibility will come in handy as soon as we add the ability to resize.
Now if you rename DefineShmemRegion() to ShmemRegisterStruct(), this is
equivalent to my descriptor design. (One detail is whether you require
the descriptor struct to stay around after the call, or if
DefineShmemRegion/ShmemRegisterStruct will copy all the information)
> Otherwise, they try to allocate from DSA if DSR_FLAGS_DSA_OK is given,
> or else from the slop space unless DSR_FLAGS_NO_SLOP is given.
Ok, falling back to DSA is a new idea. You could implement that in
either design.
> If that
> works, they then call the init_callback under a suitable lock and set
> *localptr. If not, they fail silently. Functions that need to use the
> local pointers do something like this:
>
>     if (unlikely(pgss == NULL))
>         pgss = GetShmemRegion("pg_stat_statements");
>
> ...which throws a suitable error -- not just a generic one that the
> region doesn't exist, but something that's sensitive to different
> failure conditions: the region was never registered, the region was
> registered after shared_preload_libraries time and not enough slop
> space remains, or whatever. GetShmemRegion() could even retry the
> initialization in certain cases, e.g. if DSA is OK and we previously
> were called too early in startup to try DSA, we can try now, or if DSA
> allocation failed due to OOM, we can try again.
Ok, we could add GetShmemRegion() in either design. Do we have any place
where we'd use that though, instead of the backend-private pointer global
variable? I can't think of any examples where we currently call
ShmemInitStruct() to get a pointer "on demand" like that.
In pg_stat_statements, this would replace these tests:
if (!pgss || !pgss_hash)
    ereport(ERROR,
            (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
             errmsg("pg_stat_statements must be loaded via \"shared_preload_libraries\"")));
But I don't think pg_stat_statements could still allocate the region
after postmaster startup. GetShmemRegion() would just be a different way
of throwing that error.
> I *think* this design gets rid of all the IsUnderPostmaster and
> shared_preload_libraries_in_progress checks in individual subsystems,
> and all the use of shmem startup hooks. You just ask for what you want
> and if there's a way to get it, the system gives it to you, and if
> there's not, it generates an error at the latest possible time, and
> also tries to self-heal if that's reasonable. If you do load your
> module in shared_preload_libraries, then by the time main shared
> memory initialization completes in the postmaster, everyone's localptr
> values (like pgss) will be initialized, but if it happens to be an
> EXEC_BACKEND build, those same calls will also happen in every
> postmaster child, and will automatically re-find the shared memory
> areas and reinitialize all the pointers. If you load your module
> later, your localptr values will hopefully be initialized by the end
> of _PG_init(), but if that doesn't work out, then the
> unlikely-protected calls to GetShmemRegion will produce suitable
> errors at a suitable time. And I think it all works out nicely in a
> standalone backend, too.
>
> This is all kind of a brain dump and is not fully thought-through and
> might be riddled with cognitive errors, but what I'm sort of trying to
> convey is where I think the complexity in the current system comes
> from (which is that we require every subsystem/extension author to
> know how the sausage is made, and we don't enforce consistency between
> requests and allocations) and what I don't really like about
> descriptors as a solution (which is that they are harder to read than
> imperative code and can interfere with cases where somebody wants to
> do something slightly different than what the descriptor-designer had
> in mind). I hope that some of it is helpful to you...
Thanks, this hasn't changed my opinions, but I really appreciate
pressure-testing the design. I don't want to rewrite this again in a
year, because we didn't get it quite right.
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-19 15:17 Robert Haas <[email protected]>
parent: Heikki Linnakangas <[email protected]>
0 siblings, 3 replies; 75+ messages in thread
From: Robert Haas @ 2026-03-19 15:17 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Thu, Mar 19, 2026 at 10:34 AM Heikki Linnakangas <[email protected]> wrote:
> > I suspect
> > that issue is also why pg_stat_statements has the
> > process_shared_preload_libraries_in_progress check at the top, because
> > it looks to me like everything else that the function does would be
> > completely fine to do later.
>
> I think a bigger problem is loading and saving the statistics file. The
> file needs to be saved on postmaster exit, where do you do that if the
> library was not in shared_preload_libraries?
Well, there's no way to install a hook in the postmaster in that case,
so you can't. But I'm not sure that justifies skipping everything
_PG_init() would have done. A problem with the status quo is that
every module author makes their own decision about how to handle the
s_p_l problem, and they don't all decide the same way, even in our own
tree. For example, autoprewarm chooses to register the GUC that it can
while skipping the other one, while pg_stat_statements skips
everything, including hook installation. Maybe that's properly
considered flexibility that should be left to each individual author,
but to me it seems more like happenstance than anything else. I'd
favor a design that emphasizes severability - i.e. you always try to
do as much as possible, and defer errors until later. So you always
create the GUCs but then reject attempts to set them to values that you
can't support without a restart, instead of not creating them. You
install the hooks and then maybe they will have to no-op out. It's
just weird if you load a library and the GUCs aren't defined and
there's not even a diagnostic telling you why.
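As an illustration of the "always create it, defer the error" approach, here is a toy model in standalone C. It is a sketch under stated assumptions: ToyGuc and toy_set_guc() are hypothetical and do not correspond to the real GUC machinery or its check hooks. The variable always exists, and an attempt to change it after the fact is refused with a reason the user can read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct
{
    const char *name;
    int         value;
    const char *locked_reason;  /* NULL when the variable may be changed */
} ToyGuc;

/* Try to set the variable; on refusal, *errmsg explains why. */
bool
toy_set_guc(ToyGuc *guc, int newval, const char **errmsg)
{
    assert(guc != NULL && errmsg != NULL);
    if (guc->locked_reason != NULL && newval != guc->value)
    {
        *errmsg = guc->locked_reason;
        return false;
    }
    guc->value = newval;
    *errmsg = NULL;
    return true;
}
```

In this model, re-asserting the current value is allowed even when the variable is locked, so a configuration reload that repeats the existing setting would not raise an error.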
> If you squint a little, this is pretty much the same as my descriptor
> design. Let's start from your DefineShmemRegion function, but in order
> to have some flexibility to add optional arguments in the future, without
> having to create DefineShmemRegionNew(), DefineShmemRegionExt() or
> similar functions, let's pass the arguments in a struct. So instead of:
>
> DefineShmemRegion("pg_stat_statements", sizeof(pgssSharedState), &pgss,
>                   pgss_shmem_init, 0);
>
> you would call it like this:
>
> DefineShmemRegion(&(ShmemStructDesc) {
>     .name = "pg_stat_statements",
>     .size = sizeof(pgssSharedState),
>     .ptr = (void **) &pgss,
>     .init_fn = pgss_shmem_init,
>     .flags = 0,
> });
>
> This flexibility will come in handy as soon as we add the ability to resize.
I see your point. I'm not really convinced, though. In practice,
what's now going to happen is that you're probably going to move that
struct out of _PG_init() and define it elsewhere, so the logic is
getting split between multiple places, which does not improve
readability, and it also becomes much more worrying to what degree the
struct needs to be const, whereas if you just pass a bunch of
parameters by value you kind of understand what has to be happening.
Also, what flexibility have you really purchased? Sure, now you can
add arguments to the function call without breaking existing call
sites, but (1) there are other ways to do that, like by creating an
object first and then using methods to assign properties to it
afterward (i.e. RegisterCallbackOnShmemRegion) and (2) adding
arguments without breaking existing callers is not an unmixed blessing
in the first place. I know the world won't end if you go with this
style, but I guess I'm not much of a fan. I find this sort of thing
hard to read.
> Ok, we could add GetShmemRegion() in either design. Do we have any place
> where we'd use that though, instead of the backend-private pointer global
> variable? I can't think of any examples where we currently call
> ShmemInitStruct() to get a pointer "on demand" like that.
>
> In pg_stat_statements, this would replace these tests:
>
> if (!pgss || !pgss_hash)
>     ereport(ERROR,
>             (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>              errmsg("pg_stat_statements must be loaded via \"shared_preload_libraries\"")));
>
> But I don't think pg_stat_statements could still allocate the region
> after postmaster startup. GetShmemRegion() would just be a different way
> of throwing that error.
In that case, yes. But something like autoprewarm only needs a small
amount of shared memory, and can potentially initialize itself on the
fly. The not-yet-committed pg_collect_advice module has similar needs,
which it currently satisfies using GetNamedDSMSegment(); see in
particular pg_collect_advice_attach() if you feel like wandering over
to the pg_plan_advice thread.
> Thanks, this hasn't changed my opinions, but I really appreciate
> pressure-testing the design. I don't want to rewrite this again in a
> year, because we didn't get it quite right.
Yeah, I'm somewhat concerned about an ever-proliferating number of
ways to do things that all sort of suck in different ways.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-24 15:32 Ashutosh Bapat <[email protected]>
parent: Robert Haas <[email protected]>
2 siblings, 2 replies; 75+ messages in thread
From: Ashutosh Bapat @ 2026-03-24 15:32 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sun, Mar 22, 2026 at 5:44 AM Heikki Linnakangas <[email protected]> wrote:
>
>
> I split this into more incremental patches. The first few are just
> refactorings that are probably useful on their own.
Here are some comments on the patches
0001
+
+# Enable pg_stat_statements to test restart of shared_preload_libraries.
+$node->append_conf(
+ 'postgresql.conf',
+ qq{shared_preload_libraries = 'pg_stat_statements'
+pg_stat_statements.max = 50000
+compute_query_id = 'regress'
+});
+
In order to make sure that the shared memory and LWLocks for
pg_stat_statements are initialized after crash restart, we need to at
least query pg_stat_statements after the restart or do something so
that its shared memory is used. Also, we can check whether the shared
memory structures are created by querying pg_shmem_allocations after
the restart.
0002
void
InitShmemAllocator(PGShmemHeader *seghdr)
{
The new ShmemIndex initialization code is cleaner and more
straightforward. It avoids the recursive nature of ShmemInitHash.
However with this change it's hard to keep track of all the
initialization steps and their dependencies. Attached is a patch that
makes small adjustments to the code to make it more clear.
Use of variable hash_size is actually misleading since it's not the
size of the hash table but the expected/max number of entries in it.
Removing it makes code more readable.
0003, 0005, 0006 are straightforward, no comments. Usually these
patches make the code more readable. I will review them more when I see
the patches that use this refactoring.
0004 I traced back the placement of CreateLWLock in history. It feels
like the patch is moving it to its intended place since the function
was introduced. It needs a comment to explain why the function is not
called from CreateOrAttachShmemStructs where other in-memory
structures are allocated.
0007 without using new APIs is not necessarily a win. So I would
suggest committing it along with the refactoring patch.
If we are going to commit these patches separately, I would suggest
squashing all predicate.c patches into one commit.
I will continue from 0008 tomorrow.
--
Best Wishes,
Ashutosh Bapat
Attachments:
[application/octet-stream] 0002_adjustments.patch.no_cibot (2.7K, 2-0002_adjustments.patch.no_cibot)
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-25 16:05 Ashutosh Bapat <[email protected]>
parent: Ashutosh Bapat <[email protected]>
1 sibling, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-03-25 16:05 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Tue, Mar 24, 2026 at 9:02 PM Ashutosh Bapat
<[email protected]> wrote:
>
>
> I will continue from 0008 tomorrow.
>
I reviewed the documentation part of 0008. I have a few edits attached.
I have just one comment that's not covered in the edits
@@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
Anonymous allocations are allocations that have been made
with <literal>ShmemAlloc()</literal> directly, rather than via
- <literal>ShmemInitStruct()</literal> or
- <literal>ShmemInitHash()</literal>.
+ <literal>ShmemRequestStruct()</literal> or
+ <literal>ShmemRequestHash()</literal>.
</para>
ShmemInitStruct() and ShmemInitHash() are still the functions to
allocate named structures. If we are going to keep ShmemInitStruct()
and ShmemInitHash() around for a while, I think it is more accurate to
mention them in this sentence along with the new functions.
Will continue reviewing the patch tomorrow.
--
Best Wishes,
Ashutosh Bapat
Attachments:
[application/octet-stream] 0008_edits.patch.nocibot (2.7K, 2-0008_edits.patch.nocibot)
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-25 18:37 Robert Haas <[email protected]>
parent: Robert Haas <[email protected]>
2 siblings, 1 reply; 75+ messages in thread
From: Robert Haas @ 2026-03-25 18:37 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sat, Mar 21, 2026 at 8:14 PM Heikki Linnakangas <[email protected]> wrote:
> I see your point too, the options will indeed easily start to split
> between constant parts and parts that are set later.
Yeah. That definitely seems worth trying to fix.
> I don't like the idea of just using the string name as the handle. I
> think it will come in handy to have a backend-private descriptor or handle
> to the shared memory region. But perhaps the "name" and "ptr" should be
> regular arguments, for example. If some parameters are passed as
> arguments and some in the struct, that also splits things in an awkward
> way, though.
I agree that having a handle object seems potentially useful. Putting
arguments that are required to be constant in the struct and passing
other things as arguments or in some other way seems like a good idea.
> I wonder if we should set aside a small amount of memory, like 10-20 kB,
> in the fixed shmem segment for small structs like that. Currently, such
> allocations will usually succeed because we leave some wiggle room, but
> that can also be consumed by other things like locks. But we could
> reserve a small amount of memory specifically for small allocations in
> extensions like this.
Yeah, I don't really understand why we let the lock table use up that
space. I mean, I think it would be useful to have a way to let the
lock table expand without a server restart, and I also suspect that we
could come up with a less-silly data structure than the PROCLOCK hash,
but also if the only thing keeping you from running out of lock space
is the wiggle room, maybe you just need to bump up
max_locks_per_transaction. Like, you could easily burn through the
wiggle room, get an error anyway, and then later find that you also
now can't load certain extensions without a server restart.
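The failure mode described above, and the effect of a dedicated reserve, can be shown with a toy accounting model in standalone C. This is only a sketch: ToyBudget and both functions are hypothetical, the byte counts are arbitrary, and no real allocator is involved. "General" space models the wiggle room that the lock table can consume; "reserved" space models the small pool set aside for extension structs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct
{
    size_t      general_left;   /* wiggle room any consumer may use */
    size_t      reserved_left;  /* set aside for small extension structs */
} ToyBudget;

/* Consumers such as the lock table may draw only on the general pool. */
bool
toy_take_general(ToyBudget *b, size_t n)
{
    assert(b != NULL);
    if (n > b->general_left)
        return false;
    b->general_left -= n;
    return true;
}

/* Small structs try the reserved pool first, then fall back to general. */
bool
toy_take_small(ToyBudget *b, size_t n)
{
    assert(b != NULL);
    if (n <= b->reserved_left)
    {
        b->reserved_left -= n;
        return true;
    }
    return toy_take_general(b, n);
}
```

With a budget of {100, 20}, a lock table that has consumed all 100 general bytes no longer prevents a 16-byte extension struct from being allocated, which is the scenario the reserve is meant to protect.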
> Shmem callbacks
> ---------------
>
> I separated the request/init/fn callbacks from the structs. There's now
> a concept of "shmem callbacks", which you register in _PG_init(). For
> example:
>
> static void pgss_shmem_request(void *arg);
> static void pgss_shmem_init(void *arg);
>
> static const ShmemCallbacks pgss_shmem_callbacks = {
>     .request_fn = pgss_shmem_request,
>     .init_fn = pgss_shmem_init,
>     .attach_fn = NULL, /* no special attach actions needed */
> };
What's the advantage of coupling the functions together this way, vs.
just registering each callback individually?
Also, I don't understand what "arg" is supposed to be doing. It
doesn't seem to be getting used for anything.
I wonder if we should think about just adjusting the timing of the
existing callbacks instead of adding new things. I mean, maybe that's
too likely to cause silent breakage, in which case we could also
replace them with new callbacks with slightly different names. But
there's something to be said for a hard cutover -- or at least not
letting both ways coexist for more than a few releases.
> This is similar to the old shmem_request and shmem_startup hooks. It's
> not a one-to-one replacement though, the callbacks are called at
> different stages than the old hooks (to make things convenient with the
> new facility):
>
> * The request_fn callback is called in postmaster startup, at the same
> stage as the old shmem_request callback was. But in EXEC_BACKEND mode,
> it's *also* called in each backend.
>
> * The init_fn callback is only called in postmaster startup, when it's
> time to initialize the area. (Ignoring the special "after startup"
> allocations for now).
Trying to improve the timing of the callbacks makes sense to me as a
concept, without taking a position on these particular choices (and
definitely hoping for some after-startup improvements).
Looking at v7-0018 (any chance you have a public branch where I can
look at this without having to download and apply all the patches one
by one?) it seems like this gets rid of the need for a bunch of
IsUnderPostmaster checks in individual subsystems, and some if
(found)/if (!found) checks, which I definitely like.
> Shmem requests
> --------------
>
> To register a shmem area, you call ShmemRequestStruct() or
> ShmemRequestHash() from the request callback function. For example:
>
> static void
> pgss_shmem_request(void *arg)
> {
>     static ShmemHashDesc pgssSharedHashDesc = {
>         .name = "pg_stat_statements hash",
>         .ptr = &pgss_hash,
>         .hash_info.keysize = sizeof(pgssHashKey),
>         .hash_info.entrysize = sizeof(pgssEntry),
>         .hash_flags = HASH_ELEM | HASH_BLOBS,
>     };
>     static ShmemStructDesc pgssSharedStateDesc = {
>         .name = "pg_stat_statements",
>         .size = sizeof(pgssSharedState),
>         .ptr = (void **) &pgss,
>     };
>
>     pgssSharedHashDesc.init_size = pgss_max;
>     pgssSharedHashDesc.max_size = pgss_max;
>     ShmemRequestHash(&pgssSharedHashDesc);
>     ShmemRequestStruct(&pgssSharedStateDesc);
> }
>
> Initialization happens in the init callback, which is called after all
> the pointers (pgss, pgss_hash) have been set.
I think this is not bad. I suppose it lets you get a complete list of
all of the descriptors someplace. It seems to avoid double-computing
the request size, or any possibility of drift between the requested
size and the allocated size. Having to create the struct and then set
the size afterward in some cases is a tiny bit awkward-looking, but
it's not awful. I do wonder if some coding pattern might creep in
where people create the struct and register it and then try to change
the size, or other fields, afterward. I'm tempted to propose making
the struct non-static and having the registration functions copy it,
so that there can be absolutely no question of getting away with such
antisocial behavior.
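A minimal standalone sketch of the copy-on-registration idea follows. The names ToyStructDesc and toy_register_struct() are hypothetical and are not the patch's ShmemStructDesc or ShmemRequestStruct(). It only demonstrates the property being asked for: once the registry has taken its own copy, later writes to the caller's struct have no effect on what was registered.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DESCS 8

typedef struct
{
    const char *name;
    size_t      size;
} ToyStructDesc;

static ToyStructDesc toy_descs[TOY_MAX_DESCS];
static int  toy_ndescs = 0;

/*
 * Register a descriptor by value.  The struct itself is copied, so the
 * caller's copy may live on the stack; the name string is still shared
 * and must outlive the registration.
 */
const ToyStructDesc *
toy_register_struct(const ToyStructDesc *desc)
{
    assert(desc != NULL && toy_ndescs < TOY_MAX_DESCS);
    toy_descs[toy_ndescs] = *desc;
    return &toy_descs[toy_ndescs++];
}
```

Returning a pointer to the registry's const copy also gives callers something handle-like to hold on to, without letting them modify the registered values through it.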
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-26 10:10 Heikki Linnakangas <[email protected]>
parent: Ashutosh Bapat <[email protected]>
1 sibling, 0 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-03-26 10:10 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 24/03/2026 17:32, Ashutosh Bapat wrote:
> 0002
> void
> InitShmemAllocator(PGShmemHeader *seghdr)
> {
>
> The new ShmemIndex initialization code is cleaner and more
> straightforward. It avoids the recursive nature of ShmemInitHash.
> However with this change it's hard to keep track of all the
> initialization steps and their dependencies. Attached is a patch that
> makes small adjustments to the code to make it more clear.
>
> Use of variable hash_size is actually misleading since it's not the
> size of the hash table but the expected/max number of entries in it.
> Removing it makes code more readable.
Thanks, committed this 0002 patch with those changes, to get that out of
the way.
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-26 18:31 Daniel Gustafsson <[email protected]>
parent: Robert Haas <[email protected]>
2 siblings, 1 reply; 75+ messages in thread
From: Daniel Gustafsson @ 2026-03-26 18:31 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Robert Haas <[email protected]>; Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
> On 22 Mar 2026, at 01:14, Heikki Linnakangas <[email protected]> wrote:
> Attached is a new version with lots of changes again. I've experimented with different ways the interface could look, such as the "adjust" callback we discussed earlier.
I had a look at this version today, mainly 0008 and onwards, and I quite like
the API. It is a bit verbose in places, but the improved readability outweighs
that IMHO.
> * The request_fn callback is called in postmaster startup, at the same stage as the old shmem_request callback was. But in EXEC_BACKEND mode, it's *also* called in each backend.
Should the request_fn be told, via an argument, from where it is called? It
can be figured out, but it's cleaner if all implementations do it in the
same way. I don't have a direct case in mind where it would be needed, but I
was recently digging into SSL passphrase reloading, which has failure cases
precisely because of this, so I am thinking out loud to avoid similar problems
here.
> static void
> pgss_shmem_request(void *arg)
> {
> static ShmemHashDesc pgssSharedHashDesc = {
> .name = "pg_stat_statements hash",
> .ptr = &pgss_hash,
> .hash_info.keysize = sizeof(pgssHashKey),
> .hash_info.entrysize = sizeof(pgssEntry),
> .hash_flags = HASH_ELEM | HASH_BLOBS,
> };
> static ShmemStructDesc pgssSharedStateDesc = {
> .name = "pg_stat_statements",
> .size = sizeof(pgssSharedState),
> .ptr = (void **) &pgss,
> };
>
> pgssSharedHashDesc.init_size = pgss_max;
> pgssSharedHashDesc.max_size = pgss_max;
> ShmemRequestHash(&pgssSharedHashDesc);
> ShmemRequestStruct(&pgssSharedStateDesc);
> }
Robert's suggestion upthread to copy the structure, so that changing any
part of the struct after registration can't cause subtle bugs, seems like a
good improvement.
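To illustrate why copying at registration time closes that hole, here is a
minimal self-contained sketch in plain C. It is not the patch's code:
RegDemoOpts, reg_request_struct() and reg_requested_size() are hypothetical
names invented for this illustration, and a fixed-size array stands in for
whatever registry shmem.c really keeps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the options passed at registration time. */
typedef struct RegDemoOpts
{
	const char *name;
	size_t		size;
} RegDemoOpts;

#define REG_DEMO_MAX_REQUESTS 8

/* Registry-owned copies; callers never get a pointer into this array. */
static RegDemoOpts reg_requests[REG_DEMO_MAX_REQUESTS];
static int	reg_nrequests = 0;

/*
 * Register a request by value.  Returns the slot index, or -1 if the
 * registry is full.  Only the struct is copied; the name string itself
 * must outlive the registry (a string literal in this sketch).
 */
static int
reg_request_struct(const RegDemoOpts *opts)
{
	if (reg_nrequests >= REG_DEMO_MAX_REQUESTS)
		return -1;
	reg_requests[reg_nrequests] = *opts;	/* struct copy */
	return reg_nrequests++;
}

/* Size that was recorded when the given slot was registered. */
static size_t
reg_requested_size(int slot)
{
	return reg_requests[slot].size;
}
```

With this shape, a caller that sets opts.size = 4096 after registering with
size 128 still gets 128 back from reg_requested_size(), because the registry
only ever looks at its own copy.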
> I split this into more incremental patches. The first few are just refactorings that are probably useful on their own.
Reviewing these I wasn't able to spot any issues, but below are a few comments
on mostly 0008 but also a few others:
0008:
====
+ doesn'' immediately allocate or initialize the memory, it merely
s/doesn''/doesn't/
+ registers the space to be allocated later in the startup sequence. When
+ the memory is allocated, it is initialized to zero. To any more complex
+ initialization, set the <function>init_fn()</function> callback, which
A word is missing here, perhaps: s/To any more complex/To perform any more complex/ ?
+ An example of allocating shared memory can be found in
<filename>contrib/pg_stat_statements/pg_stat_statements.c</filename> in
the <productname>PostgreSQL</productname> source tree.
While not the fault of this patch, I wonder if directing readers to a 3000 line
C file which is growing increasingly complicated is all that helpful. Maybe we
should (as a separate piece of work) construct more direct examples/tutorials
for this?
+ shared memory available. The system reserves a somes memory for
s/a somes/some/
+ another backend. The callbacks will be held while holding an internal
+ lock, which prevents concurrent two backends from initializating the
"will be held" reads a bit odd, perhaps "will be called" or "will be executed"?
+ * Nowadays, there is also third way to allocate shared memory called Dynamic
"a third way"
+ * ShmemInitStruct()/ShmemInitHash() is another way of registring shmem areas.
s/registring/registering/
+/*
+ * # of additional entries to reserve in the shmem index table, for allocations
+ * after postmaster startup (not a hard limit)
+ */
+#define SHMEM_INDEX_ADDITIONAL_SIZE (64)
This comment no longer contains the word "estimated", so the "not a hard limit"
portion is harder to understand. Does it mean one can change the define freely
and recompile without crashes due to a wrong number, or can the hash grow
dynamically during runtime? Both interpretations are quite possible.
+ /*
+ * When we call the shmem_request callbacks, we enter the SB_REQUESTING
+ * phase. All ShmemRequestStruct calls happen in this state.
+ */
+ SB_REQUESTING,
Daft question, but what does the "B" stand for?
+ ShmemInitCallback init_fn;
+ void *init_fn_arg;
init_fn_arg seems quite useful but is under-documented; perhaps adding
something to xfunc.sgml would be worthwhile? The same can be said for
request_fn_arg.
+ /* Check that it's not already registered in this process */
+ foreach(lc, requested_shmem_areas)
+ {
+ ShmemStructDesc *existing = (ShmemStructDesc *) lfirst(lc);
+
+ if (strcmp(existing->name, desc->name) == 0)
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" is already registered",
+ desc->name)));
+ }
+
+ requested_shmem_areas = lappend(requested_shmem_areas, desc);
As a side-note, I wish we had a list_append_unique flavour which would invoke a
function pointer instead of just list_member() to avoid boilerplate like this.
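To make that wish concrete, here is a rough standalone sketch of an
"append unless a caller-supplied comparator finds a match" helper. It does not
use pg_list.h; LstList, lst_append_unique() and lst_desc_name_equal() are
made-up names for illustration, and a small fixed-size array stands in for a
real List.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LST_CAPACITY 16

/* Caller-supplied equality test, used in place of a built-in comparison. */
typedef bool (*lst_equal_fn) (const void *a, const void *b);

typedef struct LstList
{
	const void *items[LST_CAPACITY];
	size_t		nitems;
} LstList;

/*
 * Append item unless an existing element compares equal under "eq".
 * Returns 1 if appended, 0 if a duplicate was found, -1 if the list is full.
 */
static int
lst_append_unique(LstList *list, const void *item, lst_equal_fn eq)
{
	for (size_t i = 0; i < list->nitems; i++)
	{
		if (eq(list->items[i], item))
			return 0;
	}
	if (list->nitems >= LST_CAPACITY)
		return -1;
	list->items[list->nitems++] = item;
	return 1;
}

/* Example element type and comparator: descriptors are equal by name. */
typedef struct LstDesc
{
	const char *name;
} LstDesc;

static bool
lst_desc_name_equal(const void *a, const void *b)
{
	return strcmp(((const LstDesc *) a)->name,
				  ((const LstDesc *) b)->name) == 0;
}
```

The caller would then turn a 0 return into the "already registered" error,
which keeps the loop itself out of each call site.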
+ if (found)
+ {
+ /* Already present, just attach to it */
+ if (!attach_allowed)
+ elog(ERROR, "shared memory struct \"%s\" is already initialized", desc->name);
I guess it depends a lot on the caller, but couldn't it be argued that this
case is a lot like !init_allowed and thus a FATAL?
+/*
+ * ShmemGetRequestedSize() --- estimate the total size of all registered shared
+ * memory structures.
+ *
+ * This is called once at postmaster startup, before the shared memory segment
+ * has been created.
It's actually called twice, once for ShmemGUCs as well.</nitpickery>
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, desc->name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ desc->name, desc->size)));
When shmem_startup_state is set to SB_LATE_ATTACH_OR_INIT, would it be
worthwhile to add an errhint to move allocation to init phase instead?
Various
=====
In the 0014 commit message, s/ProgGlobal/ProcGlobal/
- if (IsUnderPostmaster && !process_shared_preload_libraries_in_progress)
+ if (shmem_startup_state == SB_DONE && IsUnderPostmaster)
This hunk in 0014 makes an assertion a few lines down pointless as it checks
the same as the if conditional.
- * have been created by initdb, and CLOGShmemInit must have been
+ * have been created by initdb, and CLOGShmemInit must have been XXX
Stray XXX in 0015.
- ControlFile = palloc_object(ControlFileData);
+ LocalControlFile = palloc_object(ControlFileData);
+ ControlFile = LocalControlFile;
I'm likely missing something obvious but is the LocalControlFile still needed?
--
Daniel Gustafsson
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-27 00:51 Heikki Linnakangas <[email protected]>
parent: Robert Haas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Heikki Linnakangas @ 2026-03-27 00:51 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 25/03/2026 20:37, Robert Haas wrote:
> On Sat, Mar 21, 2026 at 8:14 PM Heikki Linnakangas <[email protected]> wrote:
>> Shmem callbacks
>> ---------------
>>
>> I separated the request/init/fn callbacks from the structs. There's now
>> a concept of "shmem callbacks", which you register in _PG_init(). For
>> example:
>>
>> static void pgss_shmem_request(void *arg);
>> static void pgss_shmem_init(void *arg);
>>
>> static const ShmemCallbacks pgss_shmem_callbacks = {
>> .request_fn = pgss_shmem_request,
>> .init_fn = pgss_shmem_init,
>> .attach_fn = NULL, /* no special attach actions needed */
>> };
>
> What's the advantage of coupling the functions together this way, vs.
> just registering each callback individually?
One reason is to support allocations after postmaster startup. The
RegisterShmemCallbacks() call ties together all the resources requested
by the request_fn callback with the init_fn or attach_fn callbacks
that will later initialize/attach them. The init_fn/attach_fn callbacks
are called only after *all* the resources requested by the request_fn
callback have been initialized, and a lock is held while doing all that.
If the callbacks were registered separately, shmem.c wouldn't know when
to call the init_fn/attach_fn. There's no problem during postmaster or
backend startup, because we run all init_fn or attach_fn callbacks in
the whole system, after requesting all the resources, but after startup,
you must only call the callbacks related to the newly-requested resources.
Aside from that after-startup allocation issue, though, IMO the
ShmemCallbacks struct makes it more clear that the callbacks are meant
to work together on the same resources.
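The ordering argument above can be sketched in a few lines of standalone C.
This is only a toy model under loud assumptions: GrpCallbacks,
grp_request_resource() and grp_run_callbacks() are hypothetical names, a pair
of counters stands in for real allocation, and the lock is not modelled at
all. The point is just that bundling the callbacks lets the framework run
request, then allocate, then init/attach for that one group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*grp_cb) (void *arg);

/* Hypothetical bundle of callbacks that belong to one set of resources. */
typedef struct GrpCallbacks
{
	grp_cb		request_fn;
	grp_cb		init_fn;
	grp_cb		attach_fn;
	void	   *arg;
} GrpCallbacks;

static int	grp_nrequested = 0;	/* resources requested so far */
static int	grp_nallocated = 0;	/* resources "allocated" so far */

/* Called from a request_fn to ask for one resource. */
static void
grp_request_resource(void)
{
	grp_nrequested++;
}

/* Run one bundle: request everything, allocate, then init or attach. */
static void
grp_run_callbacks(const GrpCallbacks *cb, bool first_time)
{
	grp_cb		next;

	/* Step 1: collect every request made by this group. */
	if (cb->request_fn != NULL)
		cb->request_fn(cb->arg);

	/* Step 2: "allocate" all of them (a counter stands in for real work). */
	grp_nallocated = grp_nrequested;

	/* Step 3: only now run init or attach, for this group alone. */
	next = first_time ? cb->init_fn : cb->attach_fn;
	if (next != NULL)
		next(cb->arg);
}

/* Example group: requests two resources, records what init_fn observes. */
static int	grp_seen_at_init = -1;

static void
grp_example_request(void *arg)
{
	(void) arg;
	grp_request_resource();
	grp_request_resource();
}

static void
grp_example_init(void *arg)
{
	(void) arg;
	grp_seen_at_init = grp_nallocated;
}

static const GrpCallbacks grp_example_callbacks = {
	.request_fn = grp_example_request,
	.init_fn = grp_example_init,
	.attach_fn = NULL,
	.arg = NULL,
};
```

Calling grp_run_callbacks(&grp_example_callbacks, true) leaves
grp_seen_at_init at 2: the init callback sees both resources already in
place, which is the guarantee that separately registered callbacks would
make harder to give after startup.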
One way to think of this is that all the resources requested by the
request_fn callback are implicitly part of the same "subsystem", and
need to be initialized/attached to together. We discussed that before,
and I still wonder if we should make that concept of a subsystem more
explicit. If we just renamed ShmemCallbacks to ShmemSubsystem, and gave
each subsystem a name, it'd look like this:
static void pgss_shmem_request(void *arg);
static void pgss_shmem_init(void *arg);

static const ShmemSubsystem pgss_shmem_subsystem = {
    .name = "pg_stat_statements",
    .request_fn = pgss_shmem_request,
    .init_fn = pgss_shmem_init,
    .attach_fn = NULL, /* no special attach actions needed */
};

static void
pgss_shmem_request(void *arg)
{
    ShmemRequestStruct(&pgssSharedStateDesc, &(ShmemRequestStructOpts) {
        /*
         * name is optional in this design, subsystem's name is used if
         * not given
         */
        .name = "pg_stat_statements",
        .size = sizeof(pgssSharedState),
        .ptr = (void **) &pgss,
    });
}

static void
pgss_shmem_init(void *arg)
{
    /* initialize contents of pgss */
    ...
}

void
_PG_init(void)
{
    RegisterShmemSubsystem(&pgss_shmem_subsystem);
}
Thinking how this might work without such a struct, registering the
callbacks separately, here's an alternative design:
static void pgss_shmem_request(void *arg);
static void pgss_shmem_init(void *arg);

static void
pgss_shmem_request(void *arg)
{
    ShmemRequestStruct(&pgssSharedStateDesc, &(ShmemRequestStructOpts) {
        .name = "pg_stat_statements",
        .size = sizeof(pgssSharedState),
        .ptr = (void **) &pgss,
    });
    ShmemRegisterInitCallback(&pgss_shmem_init);
    /* no attach callback needed, but for illustration: */
    ShmemRegisterAttachCallback(&pgss_shmem_attach);
}

static void
pgss_shmem_init(void *arg)
{
    /* initialize contents of pgss */
    ...
}

void
_PG_init(void)
{
    ShmemRegisterRequestCallback(&pgss_shmem_request);
}
In this design, the ShmemRegisterRequestCallback() call still ties
together all the related resources. All the resources requested in the
request-callback are initialized together, and the fact that the
init/attach callbacks are registered within the request callback
associates them with the resources. This feels a little Rube
Goldbergian, with one callback registering more callbacks, but would
also work.
> Also, I don't understand what "arg" is supposed to be doing. It
> doesn't seem to be getting used for anything.
It's an opaque pointer that's passed through to the callbacks. Some
callers might want to pass extra data to the callback. None of the
current callers use it, so maybe it's not needed, but it seemed like
good future-proofing.
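For readers who haven't used this idiom, a minimal standalone sketch of the
pass-through pointer follows. Nothing here is from the patch: ArgDemoConfig,
argdemo_init() and argdemo_invoke() are hypothetical names, and the
"framework" is reduced to a single function that forwards the pointer without
looking at it.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*argdemo_cb) (void *arg);

/* Hypothetical per-caller data that the framework knows nothing about. */
typedef struct ArgDemoConfig
{
	int			nslots;
	int			initialized;
} ArgDemoConfig;

/* Callback: recovers its own type from the opaque pointer. */
static void
argdemo_init(void *arg)
{
	ArgDemoConfig *cfg = (ArgDemoConfig *) arg;

	cfg->initialized = cfg->nslots;
}

/* Framework side: stores nothing, interprets nothing, just forwards. */
static void
argdemo_invoke(argdemo_cb fn, void *arg)
{
	if (fn != NULL)
		fn(arg);
}
```

One use would be sharing a single callback between several instances that
differ only in the data behind the pointer; callers that don't need it can
pass NULL and ignore the parameter.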
>> Shmem requests
>> --------------
>>
>> To register a shmem area, you call ShmemRequestStruct() or
>> ShmemRequestHash() from the request callback function. For example:
>>
>> static void
>> pgss_shmem_request(void *arg)
>> {
>> static ShmemHashDesc pgssSharedHashDesc = {
>> .name = "pg_stat_statements hash",
>> .ptr = &pgss_hash,
>> .hash_info.keysize = sizeof(pgssHashKey),
>> .hash_info.entrysize = sizeof(pgssEntry),
>> .hash_flags = HASH_ELEM | HASH_BLOBS,
>> };
>> static ShmemStructDesc pgssSharedStateDesc = {
>> .name = "pg_stat_statements",
>> .size = sizeof(pgssSharedState),
>> .ptr = (void **) &pgss,
>> };
>>
>> pgssSharedHashDesc.init_size = pgss_max;
>> pgssSharedHashDesc.max_size = pgss_max;
>> ShmemRequestHash(&pgssSharedHashDesc);
>> ShmemRequestStruct(&pgssSharedStateDesc);
>> }
>>
>> Initialization happens in the init callback, which is called after all
>> the pointers (pgss, pgss_hash) have been set.
>
> I think this is not bad. I suppose it lets you get a complete list of
> all of the descriptors someplace. It seems to avoid double-computing
> the request size, or any possibility of drift between the requested
> size and the allocated size. Having to create the struct and then set
> the size afterward in some cases is a tiny bit awkward-looking, but
> it's not awful. I do wonder if some coding pattern might creep in
> where people create the struct and register it and then try to change
> the size, or other fields, afterward. I'm tempted to propose making
> the struct non-static and having the registration functions copy it,
> so that there can be absolutely no question of getting away with such
> antisocial behavior.
Here's another version, where that now looks like this:
static void
pgss_shmem_request(void *arg)
{
    static ShmemHashDesc pgssSharedHashDesc;
    static ShmemStructDesc pgssSharedStateDesc;

    ShmemRequestHash(&pgssSharedHashDesc, &(ShmemRequestHashOpts) {
        .name = "pg_stat_statements hash",
        .ptr = &pgss_hash,
        .init_size = pgss_max,
        .max_size = pgss_max,
        .hash_info.keysize = sizeof(pgssHashKey),
        .hash_info.entrysize = sizeof(pgssEntry),
        .hash_flags = HASH_ELEM | HASH_BLOBS,
    });
    ShmemRequestStruct(&pgssSharedStateDesc, &(ShmemRequestStructOpts) {
        .name = "pg_stat_statements",
        .size = sizeof(pgssSharedState),
        .ptr = (void **) &pgss,
    });
}
Notable differences since the last version:
I separated the backend-private "handle" and the options structs. So
ShmemRequestStruct/Hash now takes two arguments:
The first argument is a pointer to the "descriptor", which is a
backend-private handle for the shared memory area. It's currently not
very interesting for plain structs because you can't really do anything
with it, but in the future the handle could be used to resize the area
for example. Also, one of the later patches in this patch set refactors
SLRUs to be requested in this fashion too. For SLRUs the handle is
already useful; SLRUs have always had a backend-private "SlruCtl" handle
like that.
The second argument is an "options" struct, which is really just
syntactic sugar to pass multiple arguments, some of which can be left
empty. The options are now copied, so the struct can be short-lived.
Alternatively, the arguments could be passed as normal function
arguments like you suggested earlier. But I quite like this style,
especially with hash tables and SLRUs which take more options and
already used structs to pass the options.
There's one annoying problem with that syntax though: pgindent doesn't
like it and gives errors like this:
Failure in contrib/pg_stat_statements/pg_stat_statements.c: Error@508:
Unbalanced parens
Warning@516: Extra )
Error@517: Unbalanced parens
Warning@521: Extra )
That would be nice to fix in pgindent in any case but I haven't looked
into it yet.
Another idea is to use a macro to hide that from pgindent, which would
make the calls a little less verbose anyway:
#define ShmemRequestStruct(desc, ...) \
    ShmemRequestStructWithOpts(desc, \
        &(ShmemRequestStructOpts) { __VA_ARGS__ })
Then the call would be simply:
ShmemRequestStruct(&pgssSharedStateDesc,
    .name = "pg_stat_statements",
    .size = sizeof(pgssSharedState),
    .ptr = (void **) &pgss,
);
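As a sanity check that the macro trick is plain C99, here is a small
standalone sketch of the same pattern. MacDemoOpts, MacDemoDesc,
macdemo_request_with_opts() and macdemo_request() are hypothetical stand-ins,
not the patch's types; the descriptor simply keeps a copy of the options so
the result can be inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct; omitted fields end up zero-initialized. */
typedef struct MacDemoOpts
{
	const char *name;
	size_t		size;
	void	  **ptr;
} MacDemoOpts;

/* Hypothetical handle; here it only remembers the copied options. */
typedef struct MacDemoDesc
{
	MacDemoOpts opts;
} MacDemoDesc;

static void
macdemo_request_with_opts(MacDemoDesc *desc, const MacDemoOpts *opts)
{
	desc->opts = *opts;			/* copy, so the literal can be short-lived */
}

/*
 * Wrap the variadic arguments in a compound literal with designated
 * initializers.  A trailing comma in the call is fine: it becomes a
 * trailing comma inside the braces.
 */
#define macdemo_request(desc, ...) \
	macdemo_request_with_opts(desc, &(MacDemoOpts) { __VA_ARGS__ })
```

A call then reads macdemo_request(&desc, .name = "demo", .size = sizeof(int),
.ptr = &area_ptr, ); and a field that is left out, such as .size, comes
through as zero. Commas inside the arguments are harmless here because
everything after desc is pasted back together via __VA_ARGS__.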
New version attached. I also pushed this to
https://github.com/hlinnaka/postgres/tree/shmem-init-refactor-8.
- Heikki
Attachments:
[text/x-patch] v8-0001-Test-pg_stat_statements-across-crash-restart.patch (2.6K, 2-v8-0001-Test-pg_stat_statements-across-crash-restart.patch)
From 4b906f22bcef3165086eec04e38459dd6a549353 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 27 Mar 2026 01:45:16 +0200
Subject: [PATCH v8 01/16] Test pg_stat_statements across crash restart
Add 'pg_stat_statements' to the crash restart test, to test that
shared memory and LWLock initialization works across crash restart in
a library listed in shared_preload_libraries.
Reviewed-by: Ashutosh Bapat <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/test/recovery/t/013_crash_restart.pl | 33 +++++++++++++++++++++---
1 file changed, 29 insertions(+), 4 deletions(-)
diff --git a/src/test/recovery/t/013_crash_restart.pl b/src/test/recovery/t/013_crash_restart.pl
index 20d648ad6af..56afb1aa6eb 100644
--- a/src/test/recovery/t/013_crash_restart.pl
+++ b/src/test/recovery/t/013_crash_restart.pl
@@ -21,14 +21,32 @@ my $psql_timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
my $node = PostgreSQL::Test::Cluster->new('primary');
$node->init(allows_streaming => 1);
+
+# Enable pg_stat_statements to test restart of shared_preload_libraries.
+$node->append_conf(
+ 'postgresql.conf',
+ qq{shared_preload_libraries = 'pg_stat_statements'
+pg_stat_statements.max = 50000
+compute_query_id = 'regress'
+});
+
$node->start();
# by default PostgreSQL::Test::Cluster doesn't restart after a crash
$node->safe_psql(
- 'postgres',
- q[ALTER SYSTEM SET restart_after_crash = 1;
- ALTER SYSTEM SET log_connections = receipt;
- SELECT pg_reload_conf();]);
+ 'postgres', q[
+ ALTER SYSTEM SET restart_after_crash = 1;
+ ALTER SYSTEM SET log_connections = receipt;
+ SELECT pg_reload_conf();
+ ]);
+
+# Remember the time that pg_stat_statements was reset. We'll use it later to
+# verify that it gets re-initialized after crash.
+my $stats_reset = $node->safe_psql(
+ 'postgres', q[
+ CREATE EXTENSION pg_stat_statements;
+ SELECT stats_reset FROM pg_stat_statements_info;
+ ]);
# Run psql, keeping session alive, so we have an alive backend to kill.
my ($killme_stdin, $killme_stdout, $killme_stderr) = ('', '', '');
@@ -141,6 +159,13 @@ $killme->run();
($monitor_stdin, $monitor_stdout, $monitor_stderr) = ('', '', '');
$monitor->run();
+# Verify that pg_stat_statements, loaded via shared_preload_libraries,
+# was re-initialized at the crash.
+my $stats_reset_after = $node->safe_psql('postgres',
+ q[SELECT stats_reset FROM pg_stat_statements_info]);
+cmp_ok($stats_reset, 'ne', $stats_reset_after,
+ "pg_stat_statements was reset by restart");
+
# Acquire pid of new backend
$killme_stdin .= q[
--
2.47.3
[text/x-patch] v8-0002-refactor-Move-ShmemInitHash-to-separate-file.patch (7.3K, 3-v8-0002-refactor-Move-ShmemInitHash-to-separate-file.patch)
From 5bdc4db83f5fec8a65f101d0c0aa0d8f4ce1a32d Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 13:07:28 +0200
Subject: [PATCH v8 02/16] refactor: Move ShmemInitHash to separate file
In preparation for next commits
---
src/backend/storage/ipc/Makefile | 1 +
src/backend/storage/ipc/meson.build | 1 +
src/backend/storage/ipc/shmem.c | 66 ---------------------
src/backend/storage/ipc/shmem_hash.c | 85 ++++++++++++++++++++++++++++
4 files changed, 87 insertions(+), 66 deletions(-)
create mode 100644 src/backend/storage/ipc/shmem_hash.c
diff --git a/src/backend/storage/ipc/Makefile b/src/backend/storage/ipc/Makefile
index 9a07f6e1d92..f71653bbe48 100644
--- a/src/backend/storage/ipc/Makefile
+++ b/src/backend/storage/ipc/Makefile
@@ -22,6 +22,7 @@ OBJS = \
shm_mq.o \
shm_toc.o \
shmem.o \
+ shmem_hash.o \
signalfuncs.o \
sinval.o \
sinvaladt.o \
diff --git a/src/backend/storage/ipc/meson.build b/src/backend/storage/ipc/meson.build
index 9c1ca954d9d..b8c31e29967 100644
--- a/src/backend/storage/ipc/meson.build
+++ b/src/backend/storage/ipc/meson.build
@@ -14,6 +14,7 @@ backend_sources += files(
'shm_mq.c',
'shm_toc.c',
'shmem.c',
+ 'shmem_hash.c',
'signalfuncs.c',
'sinval.c',
'sinvaladt.c',
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 3cc38450949..c86d691dcfb 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -294,72 +294,6 @@ ShmemAddrIsValid(const void *addr)
return (addr >= ShmemBase) && (addr < ShmemEnd);
}
-/*
- * ShmemInitHash -- Create and initialize, or attach to, a
- * shared memory hash table.
- *
- * We assume caller is doing some kind of synchronization
- * so that two processes don't try to create/initialize the same
- * table at once. (In practice, all creations are done in the postmaster
- * process; child processes should always be attaching to existing tables.)
- *
- * max_size is the estimated maximum number of hashtable entries. This is
- * not a hard limit, but the access efficiency will degrade if it is
- * exceeded substantially (since it's used to compute directory size and
- * the hash table buckets will get overfull).
- *
- * init_size is the number of hashtable entries to preallocate. For a table
- * whose maximum size is certain, this should be equal to max_size; that
- * ensures that no run-time out-of-shared-memory failures can occur.
- *
- * *infoP and hash_flags must specify at least the entry sizes and key
- * comparison semantics (see hash_create()). Flag bits and values specific
- * to shared-memory hash tables are added here, except that callers may
- * choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
- *
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
- */
-HTAB *
-ShmemInitHash(const char *name, /* table string name for shmem index */
- int64 init_size, /* initial table size */
- int64 max_size, /* max size of the table */
- HASHCTL *infoP, /* info about key and bucket size */
- int hash_flags) /* info about infoP */
-{
- bool found;
- void *location;
-
- /*
- * Hash tables allocated in shared memory have a fixed directory; it can't
- * grow or other backends wouldn't be able to find it. So, make sure we
- * make it big enough to start with.
- *
- * The shared memory allocator must be specified too.
- */
- infoP->dsize = infoP->max_dsize = hash_select_dirsize(max_size);
- infoP->alloc = ShmemAllocNoError;
- hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
-
- /* look it up in the shmem index */
- location = ShmemInitStruct(name,
- hash_get_shared_size(infoP, hash_flags),
- &found);
-
- /*
- * if it already exists, attach to it rather than allocate and initialize
- * new space
- */
- if (found)
- hash_flags |= HASH_ATTACH;
-
- /* Pass location of hashtable header to hash_create */
- infoP->hctl = (HASHHDR *) location;
-
- return hash_create(name, init_size, infoP, hash_flags);
-}
-
/*
* ShmemInitStruct -- Create/attach to a structure in shared memory.
*
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
new file mode 100644
index 00000000000..b0c8d5939a0
--- /dev/null
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -0,0 +1,85 @@
+/*-------------------------------------------------------------------------
+ *
+ * shmem_hash.c
+ * XXX
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/storage/ipc/shmem_hash.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "storage/shmem.h"
+
+
+/*
+ * ShmemInitHash -- Create and initialize, or attach to, a
+ * shared memory hash table.
+ *
+ * We assume caller is doing some kind of synchronization
+ * so that two processes don't try to create/initialize the same
+ * table at once. (In practice, all creations are done in the postmaster
+ * process; child processes should always be attaching to existing tables.)
+ *
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ *
+ * init_size is the number of hashtable entries to preallocate. For a table
+ * whose maximum size is certain, this should be equal to max_size; that
+ * ensures that no run-time out-of-shared-memory failures can occur.
+ *
+ * *infoP and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values specific
+ * to shared-memory hash tables are added here, except that callers may
+ * choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
+ *
+ * Note: before Postgres 9.0, this function returned NULL for some failure
+ * cases. Now, it always throws error instead, so callers need not check
+ * for NULL.
+ */
+HTAB *
+ShmemInitHash(const char *name, /* table string name for shmem index */
+ int64 init_size, /* initial table size */
+ int64 max_size, /* max size of the table */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ bool found;
+ void *location;
+
+ /*
+ * Hash tables allocated in shared memory have a fixed directory; it can't
+ * grow or other backends wouldn't be able to find it. So, make sure we
+ * make it big enough to start with.
+ *
+ * The shared memory allocator must be specified too.
+ */
+ infoP->dsize = infoP->max_dsize = hash_select_dirsize(max_size);
+ infoP->alloc = ShmemAllocNoError;
+ hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+
+ /* look it up in the shmem index */
+ location = ShmemInitStruct(name,
+ hash_get_shared_size(infoP, hash_flags),
+ &found);
+
+ /*
+ * if it already exists, attach to it rather than allocate and initialize
+ * new space
+ */
+ if (found)
+ hash_flags |= HASH_ATTACH;
+
+ /* Pass location of hashtable header to hash_create */
+ infoP->hctl = (HASHHDR *) location;
+
+ return hash_create(name, init_size, infoP, hash_flags);
+}
--
2.47.3
[text/x-patch] v8-0003-refactor-predicate.c-inline-SerialInit-to-the-cal.patch (3.6K, 4-v8-0003-refactor-predicate.c-inline-SerialInit-to-the-cal.patch)
From 110197f7e862ee7fd88573c968d4881e1fa3f111 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 19 Mar 2026 17:21:30 +0200
Subject: [PATCH v8 03/16] refactor predicate.c: inline SerialInit to the
caller
The ShmemInit function is very complicated currently. These
refactorings move it in a direction that is more natural with the new
shmem callbacks.
---
src/backend/storage/lmgr/predicate.c | 73 +++++++++++-----------------
1 file changed, 29 insertions(+), 44 deletions(-)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index edabbf4ca31..27682f416e7 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -444,7 +444,6 @@ static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
static int serial_errdetail_for_io_error(const void *opaque_data);
-static void SerialInit(void);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
@@ -809,48 +808,6 @@ SerialPagePrecedesLogicallyUnitTests(void)
}
#endif
-/*
- * Initialize for the tracking of old serializable committed xids.
- */
-static void
-SerialInit(void)
-{
- bool found;
-
- /*
- * Set up SLRU management of the pg_serial data.
- */
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
-#ifdef USE_ASSERT_CHECKING
- SerialPagePrecedesLogicallyUnitTests();
-#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
-
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
-
- Assert(found == IsUnderPostmaster);
- if (!found)
- {
- /*
- * Set control information to reflect empty SLRU.
- */
- LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
- serialControl->headPage = -1;
- serialControl->headXid = InvalidTransactionId;
- serialControl->tailXid = InvalidTransactionId;
- LWLockRelease(SerialControlLock);
- }
-}
-
/*
* GUC check_hook for serializable_buffers
*/
@@ -1358,7 +1315,35 @@ PredicateLockShmemInit(void)
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialInit();
+ SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
+ SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
+ SimpleLruInit(SerialSlruCtl, "serializable",
+ serializable_buffers, 0, "pg_serial",
+ LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
+ SYNC_HANDLER_NONE, false);
+#ifdef USE_ASSERT_CHECKING
+ SerialPagePrecedesLogicallyUnitTests();
+#endif
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
+
+ /*
+ * Create or attach to the SerialControl structure.
+ */
+ serialControl = (SerialControl)
+ ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
+
+ Assert(found == IsUnderPostmaster);
+ if (!found)
+ {
+ /*
+ * Set control information to reflect empty SLRU.
+ */
+ LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
+ serialControl->headPage = -1;
+ serialControl->headXid = InvalidTransactionId;
+ serialControl->tailXid = InvalidTransactionId;
+ LWLockRelease(SerialControlLock);
+ }
}
/*
--
2.47.3
[text/x-patch] v8-0004-refactor-predicate.c-Use-separate-variables-for-d.patch (6.7K, 5-v8-0004-refactor-predicate.c-Use-separate-variables-for-d.patch)
From bf42b34a31fd1b3d79ce7a6e6390c284fa2789d8 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 20 Mar 2026 15:29:53 +0200
Subject: [PATCH v8 04/16] refactor predicate.c: Use separate variables for
different sizes for clarity
The ShmemInit function is very complicated currently. These
refactorings move it in a direction that is more natural with the new
shmem callbacks.
---
src/backend/storage/lmgr/predicate.c | 68 ++++++++++++++--------------
1 file changed, 35 insertions(+), 33 deletions(-)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 27682f416e7..40950ee3a4f 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1113,7 +1113,10 @@ void
PredicateLockShmemInit(void)
{
HASHCTL info;
- int64 max_table_size;
+ int64 max_predicate_lock_targets;
+ int64 max_predicate_locks;
+ int64 max_serializable_xacts;
+ int64 max_rw_conflicts;
Size requestSize;
bool found;
@@ -1125,7 +1128,7 @@ PredicateLockShmemInit(void)
* Compute size of predicate lock target hashtable. Note these
* calculations must agree with PredicateLockShmemSize!
*/
- max_table_size = NPREDICATELOCKTARGETENTS();
+ max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
/*
* Allocate hash table for PREDICATELOCKTARGET structs. This stores
@@ -1136,8 +1139,8 @@ PredicateLockShmemInit(void)
info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
PredicateLockTargetHash = ShmemInitHash("PREDICATELOCKTARGET hash",
- max_table_size,
- max_table_size,
+ max_predicate_lock_targets,
+ max_predicate_lock_targets,
&info,
HASH_ELEM | HASH_BLOBS |
HASH_PARTITION | HASH_FIXED_SIZE);
@@ -1169,11 +1172,11 @@ PredicateLockShmemInit(void)
info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
/* Assume an average of 2 xacts per target */
- max_table_size *= 2;
+ max_predicate_locks = max_predicate_lock_targets * 2;
PredicateLockHash = ShmemInitHash("PREDICATELOCK hash",
- max_table_size,
- max_table_size,
+ max_predicate_locks,
+ max_predicate_locks,
&info,
HASH_ELEM | HASH_FUNCTION |
HASH_PARTITION | HASH_FIXED_SIZE);
@@ -1181,23 +1184,20 @@ PredicateLockShmemInit(void)
/*
* Compute size for serializable transaction hashtable. Note these
* calculations must agree with PredicateLockShmemSize!
- */
- max_table_size = (MaxBackends + max_prepared_xacts);
-
- /*
- * Allocate a list to hold information on transactions participating in
- * predicate locking.
*
* Assume an average of 10 predicate locking transactions per backend.
* This allows aggressive cleanup while detail is present before data must
* be summarized for storage in SLRU and the "dummy" transaction.
*/
- max_table_size *= 10;
+ max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
+ /*
+ * Allocate a list to hold information on transactions participating in
+ * predicate locking.
+ */
requestSize = add_size(PredXactListDataSize,
- (mul_size((Size) max_table_size,
+ (mul_size((Size) max_serializable_xacts,
sizeof(SERIALIZABLEXACT))));
-
PredXact = ShmemInitStruct("PredXactList",
requestSize,
&found);
@@ -1220,7 +1220,7 @@ PredicateLockShmemInit(void)
PredXact->element
= (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
/* Add all elements to available list, clean. */
- for (i = 0; i < max_table_size; i++)
+ for (i = 0; i < max_serializable_xacts; i++)
{
LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
LWTRANCHE_PER_XACT_PREDICATE_LIST);
@@ -1254,8 +1254,8 @@ PredicateLockShmemInit(void)
info.entrysize = sizeof(SERIALIZABLEXID);
SerializableXidHash = ShmemInitHash("SERIALIZABLEXID hash",
- max_table_size,
- max_table_size,
+ max_serializable_xacts,
+ max_serializable_xacts,
&info,
HASH_ELEM | HASH_BLOBS |
HASH_FIXED_SIZE);
@@ -1271,10 +1271,10 @@ PredicateLockShmemInit(void)
* occasional transactions canceled when trying to flag conflicts. That's
* probably OK.
*/
- max_table_size *= 5;
+ max_rw_conflicts = max_serializable_xacts * 5;
requestSize = RWConflictPoolHeaderDataSize +
- mul_size((Size) max_table_size,
+ mul_size((Size) max_rw_conflicts,
RWConflictDataSize);
RWConflictPool = ShmemInitStruct("RWConflictPool",
@@ -1292,7 +1292,7 @@ PredicateLockShmemInit(void)
RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
RWConflictPoolHeaderDataSize);
/* Add all elements to available list, clean. */
- for (i = 0; i < max_table_size; i++)
+ for (i = 0; i < max_rw_conflicts; i++)
{
dlist_push_tail(&RWConflictPool->availableList,
&RWConflictPool->element[i].outLink);
@@ -1353,16 +1353,19 @@ Size
PredicateLockShmemSize(void)
{
Size size = 0;
- long max_table_size;
+ int64 max_predicate_lock_targets;
+ int64 max_predicate_locks;
+ int64 max_serializable_xacts;
+ int64 max_rw_conflicts;
/* predicate lock target hash table */
- max_table_size = NPREDICATELOCKTARGETENTS();
- size = add_size(size, hash_estimate_size(max_table_size,
+ max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
+ size = add_size(size, hash_estimate_size(max_predicate_lock_targets,
sizeof(PREDICATELOCKTARGET)));
/* predicate lock hash table */
- max_table_size *= 2;
- size = add_size(size, hash_estimate_size(max_table_size,
+ max_predicate_locks = max_predicate_lock_targets * 2;
+ size = add_size(size, hash_estimate_size(max_predicate_locks,
sizeof(PREDICATELOCK)));
/*
@@ -1372,20 +1375,19 @@ PredicateLockShmemSize(void)
size = add_size(size, size / 10);
/* transaction list */
- max_table_size = MaxBackends + max_prepared_xacts;
- max_table_size *= 10;
+ max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
size = add_size(size, PredXactListDataSize);
- size = add_size(size, mul_size((Size) max_table_size,
+ size = add_size(size, mul_size((Size) max_serializable_xacts,
sizeof(SERIALIZABLEXACT)));
/* transaction xid table */
- size = add_size(size, hash_estimate_size(max_table_size,
+ size = add_size(size, hash_estimate_size(max_serializable_xacts,
sizeof(SERIALIZABLEXID)));
/* rw-conflict pool */
- max_table_size *= 5;
+ max_rw_conflicts = max_serializable_xacts * 5;
size = add_size(size, RWConflictPoolHeaderDataSize);
- size = add_size(size, mul_size((Size) max_table_size,
+ size = add_size(size, mul_size((Size) max_rw_conflicts,
RWConflictDataSize));
/* Head for list of finished serializable transactions. */
--
2.47.3
[text/x-patch] v8-0005-refactor-predicate.c-Move-all-the-initialization-.patch (8.3K, 6-v8-0005-refactor-predicate.c-Move-all-the-initialization-.patch)
From 98c04053eacc1a0e0695efaf51d03264619e7dea Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 20 Mar 2026 20:27:50 +0200
Subject: [PATCH v8 05/16] refactor predicate.c: Move all the initialization
together
The ShmemInit function is very complicated currently. These
refactorings move it in a direction that is more natural with the new
shmem callbacks.
---
src/backend/storage/lmgr/predicate.c | 164 +++++++++++++--------------
1 file changed, 79 insertions(+), 85 deletions(-)
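Note for readers: the shape this patch gives PredicateLockShmemInit() is "look up every pointer first, return early if we only attached, otherwise initialize everything in one place". A toy sketch of that control flow follows. All names here (ToyPool, toy_pool_shmem_init, and so on) are hypothetical, and a static variable stands in for the shared memory area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyPool
{
	int			navailable;
	int			head_page;
} ToyPool;

static ToyPool toy_shared_area;		/* stand-in for a struct in shared memory */
static ToyPool *toy_pool = NULL;	/* per-process pointer to it */
static int	toy_init_count = 0;	/* how many times contents were initialized */

static void
toy_pool_shmem_init(bool is_under_postmaster)
{
	/* Step 1: every process establishes its local pointers. */
	toy_pool = &toy_shared_area;

	/* Step 2: a process that merely attached is done. */
	if (is_under_postmaster)
		return;

	/* Step 3: only the initial startup initializes the contents. */
	toy_pool->navailable = 16;
	toy_pool->head_page = -1;
	toy_init_count++;
}
```

Calling toy_pool_shmem_init(false) once and then toy_pool_shmem_init(true) leaves the contents written by the first call untouched, which is the property the early return is meant to preserve.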
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 40950ee3a4f..4f80fc73639 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1145,19 +1145,6 @@ PredicateLockShmemInit(void)
HASH_ELEM | HASH_BLOBS |
HASH_PARTITION | HASH_FIXED_SIZE);
- /*
- * Reserve a dummy entry in the hash table; we use it to make sure there's
- * always one entry available when we need to split or combine a page,
- * because running out of space there could mean aborting a
- * non-serializable transaction.
- */
- if (!IsUnderPostmaster)
- {
- (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
- HASH_ENTER, &found);
- Assert(!found);
- }
-
/* Pre-calculate the hash and partition lock of the scratch entry */
ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
@@ -1202,49 +1189,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, both the header and the element */
- memset(PredXact, 0, requestSize);
-
- dlist_init(&PredXact->availableList);
- dlist_init(&PredXact->activeList);
- PredXact->SxactGlobalXmin = InvalidTransactionId;
- PredXact->SxactGlobalXminCount = 0;
- PredXact->WritableSxactCount = 0;
- PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
- PredXact->CanPartialClearThrough = 0;
- PredXact->HavePartialClearedThrough = 0;
- PredXact->element
- = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_serializable_xacts; i++)
- {
- LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
- LWTRANCHE_PER_XACT_PREDICATE_LIST);
- dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
- }
- PredXact->OldCommittedSxact = CreatePredXact();
- SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
- PredXact->OldCommittedSxact->prepareSeqNo = 0;
- PredXact->OldCommittedSxact->commitSeqNo = 0;
- PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
- dlist_init(&PredXact->OldCommittedSxact->outConflicts);
- dlist_init(&PredXact->OldCommittedSxact->inConflicts);
- dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
- dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
- dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
- PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
- PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
- PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
- PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
- PredXact->OldCommittedSxact->pid = 0;
- PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
- }
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
/*
* Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
@@ -1281,23 +1225,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, including the elements */
- memset(RWConflictPool, 0, requestSize);
-
- dlist_init(&RWConflictPool->availableList);
- RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
- RWConflictPoolHeaderDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_rw_conflicts; i++)
- {
- dlist_push_tail(&RWConflictPool->availableList,
- &RWConflictPool->element[i].outLink);
- }
- }
/*
* Create or attach to the header for the list of finished serializable
@@ -1308,8 +1235,6 @@ PredicateLockShmemInit(void)
sizeof(dlist_head),
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- dlist_init(FinishedSerializableTransactions);
/*
* Initialize the SLRU storage for old committed serializable
@@ -1331,19 +1256,88 @@ PredicateLockShmemInit(void)
*/
serialControl = (SerialControl)
ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
-
Assert(found == IsUnderPostmaster);
- if (!found)
+
+ /*
+ * If we just attached to existing shared memory (EXEC_BACKEND), we're all
+ * done. Otherwise, during postmaster startup proceed to initialize the
+ * shared memory.
+ */
+ if (IsUnderPostmaster)
{
- /*
- * Set control information to reflect empty SLRU.
- */
- LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
- serialControl->headPage = -1;
- serialControl->headXid = InvalidTransactionId;
- serialControl->tailXid = InvalidTransactionId;
- LWLockRelease(SerialControlLock);
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+ return;
+ }
+
+ /*
+ * Reserve a dummy entry in the hash table; we use it to make sure there's
+ * always one entry available when we need to split or combine a page,
+ * because running out of space there could mean aborting a
+ * non-serializable transaction.
+ */
+ (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
+ HASH_ENTER, &found);
+ Assert(!found);
+
+ /* Initialize PredXact list */
+ dlist_init(&PredXact->availableList);
+ dlist_init(&PredXact->activeList);
+ PredXact->SxactGlobalXmin = InvalidTransactionId;
+ PredXact->SxactGlobalXminCount = 0;
+ PredXact->WritableSxactCount = 0;
+ PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
+ PredXact->CanPartialClearThrough = 0;
+ PredXact->HavePartialClearedThrough = 0;
+ PredXact->element
+ = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_serializable_xacts; i++)
+ {
+ LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
+ LWTRANCHE_PER_XACT_PREDICATE_LIST);
+ dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
}
+ PredXact->OldCommittedSxact = CreatePredXact();
+ SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
+ PredXact->OldCommittedSxact->prepareSeqNo = 0;
+ PredXact->OldCommittedSxact->commitSeqNo = 0;
+ PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
+ dlist_init(&PredXact->OldCommittedSxact->outConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->inConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
+ dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
+ dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
+ PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
+ PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
+ PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
+ PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
+ PredXact->OldCommittedSxact->pid = 0;
+ PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
+
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+
+ /* Initialize the rw-conflict pool */
+ dlist_init(&RWConflictPool->availableList);
+ RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
+ RWConflictPoolHeaderDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_rw_conflicts; i++)
+ {
+ dlist_push_tail(&RWConflictPool->availableList,
+ &RWConflictPool->element[i].outLink);
+ }
+
+ /* Initialize the list of finished serializable transactions */
+ dlist_init(FinishedSerializableTransactions);
+
+ /* Initialize SerialControl to reflect empty SLRU. */
+ LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
+ serialControl->headPage = -1;
+ serialControl->headXid = InvalidTransactionId;
+ serialControl->tailXid = InvalidTransactionId;
+ LWLockRelease(SerialControlLock);
}
/*
--
2.47.3
[text/x-patch] v8-0006-Introduce-a-new-mechanism-for-registering-shared-.patch (59.3K, 7-v8-0006-Introduce-a-new-mechanism-for-registering-shared-.patch)
From 3701db9df2d08a3995974d7447edb1ca59bff37f Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 27 Mar 2026 02:32:36 +0200
Subject: [PATCH v8 06/16] Introduce a new mechanism for registering shared
memory areas
Each shared memory area is registered with a "descriptor struct" that
contains parameters like name and size of the area. The descriptor
struct makes it easier to add optional fields in the future; the
additional fields can just be left as zeros.
This merges the separate [Subsystem]ShmemSize() and
[Subsystem]ShmemInit() phases at postmaster startup. Each subsystem is
now called into just once, before the shared memory segment has been
allocated, to register the subsystem's shared memory needs. The
registration includes the size, which replaces the
[Subsystem]ShmemSize() calls, and a pointer to an initialization
callback function, which replaces the [Subsystem]ShmemInit()
calls. This is more ergonomic, as you only need to calculate the size
once, when you register the struct.
This replaces ShmemInitStruct() and ShmemInitHash(), which become just
backwards-compatibility wrappers around the new functions. In future
commits, I plan to replace all ShmemInitStruct() and ShmemInitHash()
calls with the new functions, although we'll still need to keep them
around for extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 4 +-
doc/src/sgml/xfunc.sgml | 161 +++--
src/backend/bootstrap/bootstrap.c | 1 +
src/backend/postmaster/launch_backend.c | 4 +
src/backend/postmaster/postmaster.c | 18 +-
src/backend/storage/ipc/ipci.c | 33 +-
src/backend/storage/ipc/shmem.c | 770 ++++++++++++++++++++----
src/backend/storage/ipc/shmem_hash.c | 92 ++-
src/backend/storage/lmgr/proc.c | 3 +
src/backend/tcop/postgres.c | 9 +-
src/backend/utils/hash/dynahash.c | 2 +
src/include/storage/shmem.h | 210 ++++++-
src/test/modules/test_aio/test_aio.c | 1 -
src/tools/pgindent/typedefs.list | 9 +-
14 files changed, 1120 insertions(+), 197 deletions(-)
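Note for readers: the two-phase idea in the commit message (request sizes first, allocate once, then run init callbacks) can be modelled outside PostgreSQL in a few lines. The sketch below is a toy model only. Every toy_* name is hypothetical, calloc() stands in for the shared memory segment, and none of this is the API the patch adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_REQUESTS 8
#define TOY_ALIGN 8

typedef struct ToyRequest
{
	const char *name;
	size_t		size;
	void	  **ptr;			/* where to store the area's address */
	void		(*init_fn) (void);	/* optional, may be NULL */
} ToyRequest;

static ToyRequest toy_requests[TOY_MAX_REQUESTS];
static int	toy_nrequests = 0;
static void *toy_segment = NULL;

static size_t
toy_align(size_t n)
{
	return (n + TOY_ALIGN - 1) & ~(size_t) (TOY_ALIGN - 1);
}

/* Phase 1: record a request; nothing is allocated yet. */
static void
toy_request_struct(const char *name, size_t size, void **ptr,
				   void (*init_fn) (void))
{
	assert(toy_segment == NULL);
	assert(toy_nrequests < TOY_MAX_REQUESTS);
	toy_requests[toy_nrequests++] = (ToyRequest) {name, size, ptr, init_fn};
}

/* Total size is known only after all requests have been recorded. */
static size_t
toy_requested_size(void)
{
	size_t		total = 0;

	for (int i = 0; i < toy_nrequests; i++)
		total += toy_align(toy_requests[i].size);
	return total;
}

/* Phase 2: allocate one zeroed segment, hand out pointers, run init_fn. */
static int
toy_allocate_and_init(void)
{
	size_t		total = toy_requested_size();
	size_t		offset = 0;

	toy_segment = calloc(1, total > 0 ? total : 1);
	if (toy_segment == NULL)
		return -1;
	for (int i = 0; i < toy_nrequests; i++)
	{
		*toy_requests[i].ptr = (char *) toy_segment + offset;
		offset += toy_align(toy_requests[i].size);
	}
	for (int i = 0; i < toy_nrequests; i++)
	{
		if (toy_requests[i].init_fn != NULL)
			toy_requests[i].init_fn();
	}
	return 0;
}

/* Example user of the toy API. */
typedef struct ToyCounter
{
	int			value;		/* relies on the zero-filled segment */
	int			limit;		/* needs explicit initialization */
} ToyCounter;

static ToyCounter *toy_counter;

static void
toy_counter_init(void)
{
	toy_counter->limit = 100;
}
```

A caller would invoke toy_request_struct("counter", sizeof(ToyCounter), (void **) &toy_counter, toy_counter_init) and later toy_allocate_and_init(). The size is stated once, at request time, which is the ergonomic point the commit message makes about replacing the separate ShmemSize() and ShmemInit() phases.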
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9ee1a2bfc6a..2ebec6928d5 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
Anonymous allocations are allocations that have been made
with <literal>ShmemAlloc()</literal> directly, rather than via
- <literal>ShmemInitStruct()</literal> or
- <literal>ShmemInitHash()</literal>.
+ <literal>ShmemRequestStruct()</literal> or
+ <literal>ShmemRequestHash()</literal>.
</para>
<para>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 70e815b8a2c..1e9ff801a7c 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3628,71 +3628,131 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
Add-ins can reserve shared memory on server startup. To do so, the
add-in's shared library must be preloaded by specifying it in
<xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
- The shared library should also register a
- <literal>shmem_request_hook</literal> in its
- <function>_PG_init</function> function. This
- <literal>shmem_request_hook</literal> can reserve shared memory by
- calling:
+ The shared library should register callbacks in its
+ <function>_PG_init</function> function, which then get called at the
+ right stages of the system startup to initialize the shared memory.
+ Here is an example:
<programlisting>
-void RequestAddinShmemSpace(Size size)
-</programlisting>
- Each backend should obtain a pointer to the reserved shared memory by
- calling:
-<programlisting>
-void *ShmemInitStruct(const char *name, Size size, bool *foundPtr)
-</programlisting>
- If this function sets <literal>foundPtr</literal> to
- <literal>false</literal>, the caller should proceed to initialize the
- contents of the reserved shared memory. If <literal>foundPtr</literal>
- is set to <literal>true</literal>, the shared memory was already
- initialized by another backend, and the caller need not initialize
- further.
- </para>
+typedef struct MyShmemData {
+ LWLock lock; /* protects the fields below */
- <para>
- To avoid race conditions, each backend should use the LWLock
- <function>AddinShmemInitLock</function> when initializing its allocation
- of shared memory, as shown here:
-<programlisting>
-static mystruct *ptr = NULL;
-bool found;
+ ... shared memory contents ...
+} MyShmemData;
+
+static MyShmemData *MyShmem; /* pointer to the struct in shared memory */
+
+static void my_shmem_request(void *arg);
+static void my_shmem_init(void *arg);
+
+const ShmemCallbacks my_shmem_callbacks = {
+ .request_fn = my_shmem_request,
+ .init_fn = my_shmem_init,
+};
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+ /*
+ * In order to create our shared memory area, we have to be loaded via
+ * shared_preload_libraries.
+ */
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ /* Register our shared memory needs */
+ RegisterShmemCallbacks(&my_shmem_callbacks);
+}
+
+/* callback to request the shared memory area */
+static void
+my_shmem_request(void *arg)
+{
+ /* A persistent handle to the shared memory area in this backend */
+ static ShmemStructDesc MyShmemDesc;
+
+ ShmemRequestStruct(&MyShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "My shmem area",
+ .size = sizeof(MyShmemData),
+ .ptr = (void **) &MyShmem,
+ });
+}
-LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-ptr = ShmemInitStruct("my struct name", size, &found);
-if (!found)
+/* callback to initialize the contents of the MyShmem area at startup */
+static void
+my_shmem_init(void *arg)
{
- ... initialize contents of shared memory ...
- ptr->locks = GetNamedLWLockTranche("my tranche name");
+ int tranche_id;
+
+ /* Initialize the lock */
+ tranche_id = LWLockNewTrancheId("my tranche name");
+ LWLockInitialize(&MyShmem->lock, tranche_id);
+
+ ... initialize the rest of MyShmem fields ...
}
-LWLockRelease(AddinShmemInitLock);
+
</programlisting>
- <literal>shmem_startup_hook</literal> provides a convenient place for the
- initialization code, but it is not strictly required that all such code
- be placed in this hook. On Windows (and anywhere else where
- <literal>EXEC_BACKEND</literal> is defined), each backend executes the
- registered <literal>shmem_startup_hook</literal> shortly after it
- attaches to shared memory, so add-ins should still acquire
- <function>AddinShmemInitLock</function> within this hook, as shown in the
- example above. On other platforms, only the postmaster process executes
- the <literal>shmem_startup_hook</literal>, and each backend automatically
- inherits the pointers to shared memory.
+ The <function>request_fn</function> callback is called during system
+ startup, before the shared memory has been allocated. It should call
+ <function>ShmemRequestStruct()</function> to register the add-in's
+ shared memory needs. Note that <function>ShmemRequestStruct()</function>
+ doesn't immediately allocate or initialize the memory; it merely
+ registers the space to be allocated later in the startup sequence. When
+ the memory is allocated, it is initialized to zero. For any more
+ complex initialization, set the <function>init_fn()</function> callback,
+ which will be called after the memory has been allocated and initialized
+ to zero, but before any other processes are running, and thus no locking
+ is required.
</para>
-
<para>
- An example of a <literal>shmem_request_hook</literal> and
- <literal>shmem_startup_hook</literal> can be found in
+ On Windows, the <function>attach_fn</function> callback, if any, is
+ additionally called at every backend startup. It can be used to
+ initialize additional per-backend state related to the shared memory
+ area that is inherited via <function>fork()</function> on other systems.
+ </para>
+ <para>
+ An example of allocating shared memory can be found in
<filename>contrib/pg_stat_statements/pg_stat_statements.c</filename> in
the <productname>PostgreSQL</productname> source tree.
</para>
</sect3>
<sect3 id="xfunc-shared-addin-after-startup">
- <title>Requesting Shared Memory After Startup</title>
+ <title>Requesting Shared Memory After Startup with <function>ShmemRequestStruct</function></title>
+
+ <para>
+ <function>ShmemRequestStruct()</function> can also be called after
+ system startup, which is useful to allow small allocations in add-in
+ libraries that are not specified in
+ <xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
+ However, after startup the allocation can fail if there is not enough
+ shared memory available. The system reserves some memory for allocations
+ after startup, but that reservation is small.
+ </para>
+ <para>
+ By default, <function>RegisterShmemCallbacks()</function> fails with an
+ error if called after system startup. To use it after startup, you must
+ set the <literal>SHMEM_ALLOW_AFTER_STARTUP</literal> flag in the
+ descriptor to acknowledge the risk.
+ </para>
+ <para>
+ When <function>RegisterShmemCallbacks()</function> is called after
+ startup, it will immediately call the appropriate callbacks, depending
+ on whether the requested memory areas were already initialized by
+ another backend. The callbacks will be called while holding an internal
+ lock, which prevents two backends from initializing the
+ memory area concurrently.
+ </para>
+ </sect3>
+
+ <sect3 id="xfunc-shared-addin-dynamic">
+ <title>Allocating Dynamic Shared Memory After Startup</title>
<para>
There is another, more flexible method of reserving shared memory that
- can be done after server startup and outside a
- <literal>shmem_request_hook</literal>. To do so, each backend that will
+ can be done after server startup. To do so, each backend that will
use the shared memory should obtain a pointer to it by calling:
<programlisting>
void *GetNamedDSMSegment(const char *name, size_t size,
@@ -3711,10 +3771,7 @@ void *GetNamedDSMSegment(const char *name, size_t size,
</para>
<para>
- Unlike shared memory reserved at server startup, there is no need to
- acquire <function>AddinShmemInitLock</function> or otherwise take action
- to avoid race conditions when reserving shared memory with
- <function>GetNamedDSMSegment</function>. This function ensures that only
+ <function>GetNamedDSMSegment</function> ensures that only
one backend allocates and initializes the segment and that all other
backends receive a pointer to the fully allocated and initialized
segment.
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 68a42de0889..86fe86354f5 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -370,6 +370,7 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
InitializeFastPathLocks();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 434e0643022..75423104be8 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -49,6 +49,7 @@
#include "replication/walreceiver.h"
#include "storage/dsm.h"
#include "storage/io_worker.h"
+#include "storage/ipc.h"
#include "storage/pg_shmem.h"
#include "tcop/backend_startup.h"
#include "utils/memutils.h"
@@ -672,7 +673,10 @@ SubPostmasterMain(int argc, char *argv[])
/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ ShmemCallRequestCallbacks();
+ }
/*
* Run the appropriate Main function
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 3fac46c402b..e81ef248bf1 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -959,7 +959,14 @@ PostmasterMain(int argc, char *argv[])
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Ask all subsystems, including preloaded libraries, to register their
+ * shared memory needs.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might have been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
@@ -3235,7 +3242,14 @@ PostmasterStateMachine(void)
/* re-read control file into local memory */
LocalProcessControlFile(true);
- /* re-create shared memory and semaphores */
+ /*
+ * Re-initialize shared memory and semaphores. Note: We don't call
+ * RegisterShmemStructs() here, we keep the old registrations. In
+ * order to re-register structs in extensions, we'd need to reload
+ * shared preload libraries, and we don't want to do that.
+ */
+ ResetShmemAllocator();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
UpdatePMState(PM_STARTUP);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index d692d419846..493ddd7f12f 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -99,8 +99,9 @@ CalculateShmemSize(void)
* during the actual allocation phase.
*/
size = 100000;
- size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
- sizeof(ShmemIndexEnt)));
+ size = add_size(size, ShmemGetRequestedSize());
+
+ /* legacy subsystems */
size = add_size(size, dsm_estimate_size());
size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
@@ -174,6 +175,13 @@ AttachSharedMemoryStructs(void)
*/
InitializeFastPathLocks();
+ /*
+ * Attach to LWLocks first. They are needed by most other subsystems.
+ */
+ LWLockShmemInit();
+
+ /* Establish pointers to all shared memory areas in this backend */
+ ShmemAttachRequested();
CreateOrAttachShmemStructs();
/*
@@ -218,7 +226,21 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
- /* Initialize subsystems */
+ /* Reserve space for semaphores. */
+ if (!IsUnderPostmaster)
+ PGReserveSemaphores(ProcGlobalSemas());
+
+ /*
+ * Initialize LWLocks first, in case any of the shmem init functions use
+ * LWLocks. (Nothing else can be running during startup, so they don't
+ * need to do any locking yet, but we nevertheless allow it.)
+ */
+ LWLockShmemInit();
+
+ /* Initialize all shmem areas */
+ ShmemInitRequested();
+
+ /* Initialize legacy subsystems */
CreateOrAttachShmemStructs();
/* Initialize dynamic shared memory facilities. */
@@ -249,11 +271,6 @@ CreateSharedMemoryAndSemaphores(void)
static void
CreateOrAttachShmemStructs(void)
{
- /*
- * Set up LWLocks. They are needed by most other subsystems.
- */
- LWLockShmemInit();
-
dsm_shmem_init();
DSMRegistryShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index c86d691dcfb..84099ce78fe 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,48 +19,116 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * NOTES:
- * (a) There are three kinds of shared memory data structures
- * available to POSTGRES: fixed-size structures, queues and hash
- * tables. Fixed-size structures contain things like global variables
- * for a module and should never be allocated after the shared memory
- * initialization phase. Hash tables have a fixed maximum size, but
- * their actual size can vary dynamically. When entries are added
- * to the table, more space is allocated. Queues link data structures
- * that have been allocated either within fixed-size structures or as hash
- * buckets. Each shared data structure has a string name to identify
- * it (assigned in the module that declares it).
- *
- * (b) During initialization, each module looks for its
- * shared data structures in a hash table called the "Shmem Index".
- * If the data structure is not present, the caller can allocate
- * a new one and initialize it. If the data structure is present,
- * the caller "attaches" to the structure by initializing a pointer
- * in the local address space.
- * The shmem index has two purposes: first, it gives us
- * a simple model of how the world looks when a backend process
- * initializes. If something is present in the shmem index,
- * it is initialized. If it is not, it is uninitialized. Second,
- * the shmem index allows us to allocate shared memory on demand
- * instead of trying to preallocate structures and hard-wire the
- * sizes and locations in header files. If you are using a lot
- * of shared memory in a lot of different places (and changing
- * things during development), this is important.
- *
- * (c) In standard Unix-ish environments, individual backends do not
- * need to re-establish their local pointers into shared memory, because
- * they inherit correct values of those variables via fork() from the
- * postmaster. However, this does not work in the EXEC_BACKEND case.
- * In ports using EXEC_BACKEND, new backends have to set up their local
- * pointers using the method described in (b) above.
- *
- * (d) memory allocation model: shared memory can never be
- * freed, once allocated. Each hash table has its own free list,
- * so hash buckets can be reused when an item is deleted. However,
- * if one hash table grows very large and then shrinks, its space
- * cannot be redistributed to other tables. We could build a simple
- * hash bucket garbage collector if need be. Right now, it seems
- * unnecessary.
+ * This module provides facilities to allocate fixed-size structures in shared
+ * memory, for things like variables shared between all backend processes.
+ * Each such structure has a string name to identify it, specified in the
+ * descriptor when it is requested. shmem_hash.c provides a shared hash table
+ * implementation on top of that.
+ *
+ * Shared memory areas should usually not be allocated after postmaster
+ * startup, although we do allow small allocations later for the benefit of
+ * extension modules that are loaded after startup. Despite that allowance,
+ * extensions that need shared memory should be added in
+ * shared_preload_libraries, because the allowance is quite small and there is
+ * no guarantee that any memory is available after startup.
+ *
+ * Nowadays, there is also a third way to allocate shared memory called
+ * Dynamic Shared Memory. See dsm.c for that facility. One big difference
+ * between traditional shared memory handled by shmem.c and dynamic shared
+ * memory is that traditional shared memory areas are mapped to the same
+ * address in all processes, so you can use normal pointers in shared memory
+ * structs. With Dynamic Shared Memory, you must use offsets or DSA pointers
+ * instead.
+ *
+ * Shared memory managed by shmem.c can never be freed, once allocated. Each
+ * hash table has its own free list, so hash buckets can be reused when an
+ * item is deleted. However, if one hash table grows very large and then
+ * shrinks, its space cannot be redistributed to other tables. We could build
+ * a simple hash bucket garbage collector if need be. Right now, it seems
+ * unnecessary.
+ *
+ * Usage
+ * -----
+ *
+ * To allocate shared memory, you need to register a set of callback functions
+ * which handle the lifecycle of the allocation. In the request_fn callback,
+ * fill in a ShmemRequestStructOpts with the name, size, and any other
+ * options, and pass it to ShmemRequestStruct() together with a
+ * ShmemStructDesc handle. Leave any unused fields as zeros.
+ *
+ * typedef struct MyShmemData {
+ * ...
+ * } MyShmemData;
+ *
+ * static MyShmemData *MyShmem;
+ *
+ * static void my_shmem_request(void *arg);
+ * static void my_shmem_init(void *arg);
+ *
+ * const ShmemCallbacks MyShmemCallbacks = {
+ * .request_fn = my_shmem_request,
+ * .init_fn = my_shmem_init,
+ * };
+ *
+ * static void
+ * my_shmem_request(void *arg)
+ * {
+ * static ShmemStructDesc MyShmemDesc;
+ *
+ * ShmemRequestStruct(&MyShmemDesc, &(ShmemRequestStructOpts) {
+ * .name = "My shmem area",
+ * .size = sizeof(MyShmemData),
+ * .ptr = (void **) &MyShmem,
+ * });
+ * }
+ *
+ * In builtin PostgreSQL code, add the callbacks to the list in
+ * src/include/storage/subsystemlist.h. In an add-in module, you can register
+ * the callbacks by calling RegisterShmemCallbacks(&MyShmemCallbacks) in the
+ * extension's _PG_init() function.
+ *
+ * Lifecycle
+ * ---------
+ *
+ * Initializing shared memory happens in multiple phases. In the first phase,
+ * during postmaster startup, all the shmem_request callbacks are called.
+ * Only after all the request callbacks have been called, and all the shmem
+ * areas have been requested by the ShmemRequestStruct() calls, do we know how
+ * much shared memory we need in total. After that, the postmaster allocates
+ * the global shared memory segment, and calls all the init_fn callbacks to
+ * initialize all the requested shmem areas.
+ *
+ * In standard Unix-ish environments, individual backends do not need to
+ * re-establish their local pointers into shared memory, because they inherit
+ * correct values of those variables via fork() from the postmaster. However,
+ * this does not work in the EXEC_BACKEND case. In ports using EXEC_BACKEND,
+ * backend startup also calls the shmem_request callbacks to re-establish the
+ * knowledge about each shared memory area, sets the pointer variables
+ * (*ShmemRequestStructOpts->ptr), and calls the attach_fn callback, if any, for
+ * additional per-backend setup.
+ *
+ * Legacy ShmemInitStruct()/ShmemInitHash() functions
+ * --------------------------------------------------
+ *
+ * ShmemInitStruct()/ShmemInitHash() is another way of registering shmem
+ * areas. It pre-dates the ShmemRequestStruct()/ShmemRequestHash() functions,
+ * and should not be used in new code, but as of this writing it is still
+ * widely used in extensions.
+ *
+ * To allocate a shmem area with ShmemInitStruct(), you need to separately
+ * register the size needed for the area by calling RequestAddinShmemSpace()
+ * from the extension's shmem_request_hook, and allocate the area by calling
+ * ShmemInitStruct() from the extension's shmem_startup_hook. There are no
+ * init/attach callbacks. Instead, the caller of ShmemInitStruct() must check
+ * the return status of ShmemInitStruct() and initialize the struct if it was
+ * not previously initialized.
+ *
+ * Calling ShmemAlloc() directly
+ * -----------------------------
+ *
+ * There's a more low-level way of allocating shared memory too: you can call
+ * ShmemAlloc() directly. It's used to implement the higher level mechanisms,
+ * and should generally not be called directly.
*/
#include "postgres.h"
@@ -79,6 +147,74 @@
#include "utils/builtins.h"
#include "utils/tuplestore.h"
+/*
+ * Registered callbacks.
+ *
+ * During postmaster startup, we accumulate the callbacks from all subsystems
+ * in this list.
+ *
+ * This is in process private memory, although on Unix-like systems, we expect
+ * all the registrations to happen at postmaster startup time and be inherited
+ * by all the child processes via fork().
+ */
+static List *registered_shmem_callbacks;
+
+/*
+ * In the shmem request phase, all the shmem areas requested with
+ * ShmemRequestInternal() are accumulated here.
+ */
+typedef struct
+{
+ ShmemStructDesc *desc;
+ ShmemRequestStructOpts *options;
+ ShmemAreaKind kind;
+} ShmemRequest;
+
+static List *requested_shmem_areas;
+
+/*
+ * Per-process state machine, for sanity checking that we do things in the
+ * right order.
+ *
+ * Postmaster:
+ * INITIAL -> REQUESTING -> INITIALIZING -> DONE
+ *
+ * Backends in EXEC_BACKEND mode:
+ * INITIAL -> REQUESTING -> ATTACHING -> DONE
+ *
+ * Late request:
+ * DONE -> REQUESTING -> LATE_ATTACH_OR_INIT -> DONE
+ */
+static enum
+{
+ /* Initial state */
+ SB_INITIAL,
+
+ /*
+ * When we call the shmem_request callbacks, we enter the SB_REQUESTING
+ * phase. All ShmemRequestStruct calls happen in this state.
+ */
+ SB_REQUESTING,
+
+ /*
+ * Postmaster has finished all shmem requests, and is now initializing the
+ * shared memory segment. We are now calling the init_fn callbacks.
+ */
+ SB_INITIALIZING,
+
+ /*
+ * A postmaster child process is starting up. The attach_fn callbacks are
+ * called in this state.
+ */
+ SB_ATTACHING,
+
+ /* An after-startup allocation or attachment is in progress. */
+ SB_LATE_ATTACH_OR_INIT,
+
+ /* Normal state after shmem initialization / attachment */
+ SB_DONE,
+} shmem_startup_state;
+
/*
* This is the first data structure stored in the shared memory segment, at
* the offset that PGShmemHeader->content_offset points to. Allocations by
@@ -109,25 +245,373 @@ static void *ShmemBase; /* start address of shared memory */
static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
-static HTAB *ShmemIndex = NULL; /* primary index hashtable for shmem */
+
+/*
+ * ShmemIndex is a global directory of shmem areas, itself also stored in the
+ * shared memory.
+ */
+static HTAB *ShmemIndex;
+
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE (48)
+
+/*
+ * # of additional entries to reserve in the shmem index table, for
+ * allocations after postmaster startup. (This is not a hard limit, the hash
+ * table can grow larger than that if there is shared memory available)
+ */
+#define SHMEM_INDEX_ADDITIONAL_SIZE (64)
+
+/* this is a hash bucket in the shmem index table */
+typedef struct
+{
+ char key[SHMEM_INDEX_KEYSIZE]; /* string name */
+ void *location; /* location in shared mem */
+ Size size; /* # bytes requested for the structure */
+ Size allocated_size; /* # bytes actually allocated */
+} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
+static bool AttachOrInit(ShmemRequest *request, bool init_allowed, bool attach_allowed);
+
Datum pg_numa_available(PG_FUNCTION_ARGS);
+/*
+ * ShmemRequestStruct() --- request a named shared memory area
+ *
+ * Subsystems call this to register their shared memory needs. This is
+ * usually done early in postmaster startup, before the shared memory segment
+ * has been created, so that the size can be included in the estimate for
+ * total amount of shared memory needed. We set aside a small amount of
+ * memory for allocations that happen later, for the benefit of non-preloaded
+ * extensions, but that should not be relied upon.
+ *
+ * This does not yet allocate the memory, but merely registers the need for it.
+ * The actual allocation happens later in the postmaster startup sequence.
+ *
+ * This must be called from a shmem_request callback function, registered with
+ * RegisterShmemCallbacks(). This enforces a coding pattern that works the
+ * same in normal Unix systems and with EXEC_BACKEND. In postmaster, the
+ * shmem_request callback is called during startup, but in EXEC_BACKEND mode,
+ * it is also called in each backend at backend startup. By calling the same
+ * function in both cases, we ensure that all the shmem areas are registered
+ * the same way in all processes.
+ *
+ * 'desc' is a backend-private handle for the shared memory area.
+ *
+ * 'options' defines the name and size of the area, and any other optional
+ * features. Leave unused options as zeros. The options are copied to
+ * longer-lived memory, so it doesn't need to live after the
+ * ShmemRequestStruct() call and can point to a local variable in the calling
+ * function. The 'name' must point to a long-lived string though, only the
+ * pointer to it is copied.
+ */
+void
+ShmemRequestStruct(ShmemStructDesc *desc, const ShmemRequestStructOpts *options)
+{
+ ShmemRequestStructOpts *options_copy;
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemRequestStructOpts));
+ memcpy(options_copy, options, sizeof(ShmemRequestStructOpts));
+
+ ShmemRequestInternal(desc, options_copy, SHMEM_KIND_STRUCT);
+}
+
+/*
+ * Internal workhorse of ShmemRequestStruct() and ShmemRequestHash().
+ *
+ * Note: 'desc' and 'options' must live until the init/attach callbacks have
+ * been called. Unlike in the public ShmemRequestStruct() and
+ * ShmemRequestHash() functions, 'options' is *not* copied. This allows
+ * ShmemRequestHash() to pass a pointer to the extended ShmemRequestHashOpts
+ * struct instead.
+ */
+void
+ShmemRequestInternal(ShmemStructDesc *desc, ShmemRequestStructOpts *options,
+ ShmemAreaKind kind)
+{
+ ListCell *lc;
+ ShmemRequest *request;
+
+ if (options->name == NULL)
+ elog(ERROR, "shared memory request is missing 'name' option");
+
+ if (IsUnderPostmaster)
+ {
+ if (options->size <= 0 && options->size != SHMEM_REQUEST_UNKNOWN_SIZE)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+ else
+ {
+ if (options->size == SHMEM_REQUEST_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_REQUEST_UNKNOWN_SIZE cannot be used during startup");
+ if (options->size <= 0)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+
+ if (shmem_startup_state != SB_REQUESTING)
+ elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
+
+ /* Check that it's not already registered in this process */
+ foreach(lc, requested_shmem_areas)
+ {
+ ShmemRequest *existing = (ShmemRequest *) lfirst(lc);
+
+ if (strcmp(existing->options->name, options->name) == 0)
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" is already registered",
+ options->name)));
+ }
+
+ request = palloc(sizeof(ShmemRequest));
+ request->options = options;
+ request->desc = desc;
+ request->kind = kind;
+ requested_shmem_areas = lappend(requested_shmem_areas, request);
+}
+
+/*
+ * ShmemGetRequestedSize() --- estimate the total size of all registered shared
+ * memory structures.
+ *
+ * This is called once at postmaster startup, before the shared memory segment
+ * has been created.
+ */
+size_t
+ShmemGetRequestedSize(void)
+{
+ ListCell *lc;
+ size_t size;
+
+ /* memory needed for the ShmemIndex */
+ size = hash_estimate_size(list_length(requested_shmem_areas) + SHMEM_INDEX_ADDITIONAL_SIZE,
+ sizeof(ShmemIndexEnt));
+
+ /* memory needed for all the requested areas */
+ foreach(lc, requested_shmem_areas)
+ {
+ ShmemRequest *request = (ShmemRequest *) lfirst(lc);
+
+ size = add_size(size, request->options->size);
+ size = add_size(size, request->options->extra_size);
+ }
+
+ return size;
+}
+
+/*
+ * ShmemInitRequested() --- allocate and initialize requested shared memory
+ * structures.
+ *
+ * This is called once at postmaster startup, after the shared memory segment
+ * has been created.
+ */
+void
+ShmemInitRequested(void)
+{
+ ListCell *lc;
+
+ /* Should be called only by the postmaster or a standalone backend. */
+ Assert(!IsUnderPostmaster);
+ Assert(shmem_startup_state == SB_INITIALIZING);
+
+ /*
+ * Initialize all the requested memory areas. There are no concurrent
+ * processes yet, so no need for locking.
+ */
+ foreach(lc, requested_shmem_areas)
+ {
+ ShmemRequest *request = (ShmemRequest *) lfirst(lc);
+
+ AttachOrInit(request, true, false);
+ }
+ list_free_deep(requested_shmem_areas);
+ requested_shmem_areas = NIL;
+
+ /* Call init callbacks */
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
+
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->init_fn_arg);
+ }
+
+ shmem_startup_state = SB_DONE;
+}
+
+/*
+ * Re-establish process private state related to shmem areas.
+ *
+ * This is called at backend startup in EXEC_BACKEND mode, in every backend.
+ */
+#ifdef EXEC_BACKEND
+void
+ShmemAttachRequested(void)
+{
+ ListCell *lc;
+
+ /* Must be initializing a (non-standalone) backend */
+ Assert(IsUnderPostmaster);
+ Assert(ShmemAllocator->index != NULL);
+ Assert(shmem_startup_state == SB_REQUESTING);
+ shmem_startup_state = SB_ATTACHING;
+
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+
+ /*
+ * Attach to all the requested memory areas.
+ */
+ foreach(lc, requested_shmem_areas)
+ {
+ ShmemRequest *request = (ShmemRequest *) lfirst(lc);
+
+ AttachOrInit(request, false, true);
+ }
+ list_free_deep(requested_shmem_areas);
+ requested_shmem_areas = NIL;
+
+ /* Call attach callbacks */
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
+
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->attach_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+
+ shmem_startup_state = SB_DONE;
+}
+#endif
+
+/*
+ * Workhorse to insert or look up a named shmem area in the shared memory
+ * index, and initialize or attach to it.
+ *
+ * If !init_allowed and the entry is not found, throws an error. If
+ * !attach_allowed and the entry is found, throws an error.
+ */
+static bool
+AttachOrInit(ShmemRequest *request, bool init_allowed, bool attach_allowed)
+{
+ /*
+ * If called after postmaster startup, we need to immediately also
+ * initialize or attach to the area.
+ */
+ ShmemStructDesc *desc = request->desc;
+ ShmemIndexEnt *index_entry;
+ bool found;
+
+ desc->name = request->options->name;
+ desc->ptr = NULL;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, request->options->name,
+ init_allowed ? HASH_ENTER_NULL : HASH_FIND, &found);
+ if (found)
+ {
+ /* Already present, just attach to it */
+ if (!attach_allowed)
+ elog(ERROR, "shared memory struct \"%s\" is already initialized", desc->name);
+
+ if (index_entry->size != request->options->size &&
+ request->options->size != SHMEM_REQUEST_UNKNOWN_SIZE)
+ {
+ elog(ERROR, "shared memory struct \"%s\" is already registered with different size",
+ desc->name);
+ }
+ desc->ptr = index_entry->location;
+ desc->size = index_entry->size;
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_attach(desc, request->options);
+ break;
+ }
+ }
+ else if (!init_allowed)
+ {
+ /* attach was requested, but it was not found */
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not find ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+ else if (!index_entry)
+ {
+ /* tried to add it to the hash table, but there was no space */
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+ else
+ {
+ /*
+ * We inserted the entry into the shared memory index. Allocate the
+ * requested amount of shared memory for it, and do basic
+ * initialization.
+ */
+ size_t allocated_size;
+ void *structPtr;
+
+ structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, desc->name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zd bytes requested)",
+ desc->name, request->options->size)));
+ }
+ index_entry->size = request->options->size;
+ index_entry->allocated_size = allocated_size;
+ index_entry->location = structPtr;
+
+ desc->ptr = index_entry->location;
+ desc->size = index_entry->size;
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_init(desc, request->options);
+ break;
+ }
+ }
+
+ return found;
+}
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
* Called at postmaster or stand-alone backend startup, to initialize the
* allocator's data structure in the shared memory segment. In EXEC_BACKEND,
- * this is also called at backend startup, to set up pointers to the shared
- * memory areas.
+ * this is also called at backend startup, to set up pointers to the
+ * already-initialized data structure.
*/
void
InitShmemAllocator(PGShmemHeader *seghdr)
{
Size offset;
+ int64 hash_size;
HASHCTL info;
int hash_flags;
size_t size;
@@ -137,6 +621,16 @@ InitShmemAllocator(PGShmemHeader *seghdr)
#endif
Assert(seghdr != NULL);
+ if (IsUnderPostmaster)
+ {
+ Assert(shmem_startup_state == SB_INITIAL);
+ }
+ else
+ {
+ Assert(shmem_startup_state == SB_REQUESTING);
+ shmem_startup_state = SB_INITIALIZING;
+ }
+
/*
* We assume the pointer and offset are MAXALIGN. Not a hard requirement,
* but it's true today and keeps the math below simpler.
@@ -181,9 +675,11 @@ InitShmemAllocator(PGShmemHeader *seghdr)
* use ShmemInitHash() here because it relies on ShmemIndex being already
* initialized.
*/
+ hash_size = list_length(requested_shmem_areas) + SHMEM_INDEX_ADDITIONAL_SIZE;
+
info.keysize = SHMEM_INDEX_KEYSIZE;
info.entrysize = sizeof(ShmemIndexEnt);
- info.dsize = info.max_dsize = hash_select_dirsize(SHMEM_INDEX_SIZE);
+ info.dsize = info.max_dsize = hash_select_dirsize(hash_size);
info.alloc = ShmemAllocNoError;
hash_flags = HASH_ELEM | HASH_STRINGS | HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
if (!IsUnderPostmaster)
@@ -194,10 +690,26 @@ InitShmemAllocator(PGShmemHeader *seghdr)
else
hash_flags |= HASH_ATTACH;
info.hctl = ShmemAllocator->index;
- ShmemIndex = hash_create("ShmemIndex", SHMEM_INDEX_SIZE, &info, hash_flags);
+ ShmemIndex = hash_create("ShmemIndex", hash_size, &info, hash_flags);
Assert(ShmemIndex != NULL);
}
+/*
+ * Reset state on postmaster crash restart.
+ */
+void
+ResetShmemAllocator(void)
+{
+ Assert(!IsUnderPostmaster);
+ shmem_startup_state = SB_INITIAL;
+ requested_shmem_areas = NIL;
+
+ /*
+ * Note that we don't clear the registered callbacks. We will need to
+ * call them again as we restart
+ */
+}
+
/*
* ShmemAlloc -- allocate max-aligned chunk from shared memory
*
@@ -295,92 +807,128 @@ ShmemAddrIsValid(const void *addr)
}
/*
- * ShmemInitStruct -- Create/attach to a structure in shared memory.
- *
- * This is called during initialization to find or allocate
- * a data structure in shared memory. If no other process
- * has created the structure, this routine allocates space
- * for it. If it exists already, a pointer to the existing
- * structure is returned.
+ * Register callbacks that define a shared memory area (or multiple areas).
*
- * Returns: pointer to the object. *foundPtr is set true if the object was
- * already in the shmem index (hence, already initialized).
+ * The system will call the callbacks at different stages of postmaster or
+ * backend startup, to allocate and initialize the area.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * This is normally called early during postmaster startup, but if the
+ * SHMEM_ALLOW_AFTER_STARTUP flag is set, this can also be used after startup,
+ * although after startup there's no guarantee that there's enough shared
+ * memory available. When called after startup, this immediately calls the
+ * right callbacks depending on whether another backend had already
+ * initialized the area.
*/
-void *
-ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+void
+RegisterShmemCallbacks(const ShmemCallbacks *callbacks)
{
- ShmemIndexEnt *result;
- void *structPtr;
+ if (shmem_startup_state == SB_DONE && IsUnderPostmaster)
+ {
+ /* After-startup initialization */
+ ListCell *lc;
+ bool found = false;
- Assert(ShmemIndex != NULL);
+ if ((callbacks->flags & SHMEM_ALLOW_AFTER_STARTUP) == 0)
+ elog(ERROR, "cannot request shared memory at this time");
- LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ Assert(requested_shmem_areas == NIL);
+ Assert(shmem_startup_state == SB_DONE);
+ shmem_startup_state = SB_REQUESTING;
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->request_fn_arg);
+ shmem_startup_state = SB_LATE_ATTACH_OR_INIT;
- /* look it up in the shmem index */
- result = (ShmemIndexEnt *)
- hash_search(ShmemIndex, name, HASH_ENTER_NULL, foundPtr);
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
- if (!result)
- {
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("could not create ShmemIndex entry for data structure \"%s\"",
- name)));
- }
+ foreach(lc, requested_shmem_areas)
+ {
+ ShmemRequest *request = (ShmemRequest *) lfirst(lc);
+
+ found = AttachOrInit(request, true, true);
+ }
- if (*foundPtr)
- {
/*
- * Structure is in the shmem index so someone else has allocated it
- * already. The size better be the same as the size we are trying to
- * initialize to, or there is a name conflict (or worse).
+ * FIXME: What to do if multiple shmem areas were requested, and some
+ * of them are already initialized but not all? We expect all shmem
+ * areas requested by a single callback to form a coherent unit.
*/
- if (result->size != size)
+ if (found)
{
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errmsg("ShmemIndex entry size is wrong for data structure"
- " \"%s\": expected %zu, actual %zu",
- name, size, result->size)));
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->attach_fn_arg);
}
- structPtr = result->location;
- }
- else
- {
- Size allocated_size;
-
- /* It isn't in the table yet. allocate and initialize it */
- structPtr = ShmemAllocRaw(size, &allocated_size);
- if (structPtr == NULL)
+ else
{
- /* out of memory; remove the failed ShmemIndex entry */
- hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
- " \"%s\" (%zu bytes requested)",
- name, size)));
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->init_fn_arg);
}
- result->size = size;
- result->allocated_size = allocated_size;
- result->location = structPtr;
+
+ LWLockRelease(ShmemIndexLock);
+ shmem_startup_state = SB_DONE;
+ return;
}
- LWLockRelease(ShmemIndexLock);
+ registered_shmem_callbacks = lappend(registered_shmem_callbacks,
+ (void *) callbacks);
+}
+
+/*
+ * Call all shmem request callbacks.
+ */
+void
+ShmemCallRequestCallbacks(void)
+{
+ ListCell *lc;
- Assert(ShmemAddrIsValid(structPtr));
+ Assert(shmem_startup_state == SB_INITIAL);
+ shmem_startup_state = SB_REQUESTING;
- Assert(structPtr == (void *) CACHELINEALIGN(structPtr));
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
- return structPtr;
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->request_fn_arg);
+ }
}
+/*
+ * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ *
+ * This is called during initialization to find or allocate
+ * a data structure in shared memory. If no other process
+ * has created the structure, this routine allocates space
+ * for it. If it exists already, a pointer to the existing
+ * structure is returned.
+ *
+ * Returns: pointer to the object. *foundPtr is set true if the object was
+ * already in the shmem index (hence, already initialized).
+ *
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestStruct() in new code!
+ */
+void *
+ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+{
+ ShmemStructDesc desc;
+ ShmemRequestStructOpts options = {
+ .name = name,
+ .size = size,
+ };
+ ShmemRequest request = {&desc, &options, SHMEM_KIND_STRUCT};
+
+ Assert(shmem_startup_state == SB_DONE ||
+ shmem_startup_state == SB_INITIALIZING ||
+ shmem_startup_state == SB_REQUESTING);
+
+ /* look it up immediately */
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ *foundPtr = AttachOrInit(&request, true, true);
+ LWLockRelease(ShmemIndexLock);
+
+ Assert(desc.ptr != NULL);
+ return desc.ptr;
+}
/*
* Add two Size values, checking for overflow
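
As a rough illustration of the per-process state machine documented in the shmem.c changes above (this is not part of the patch; the type SketchShmemState and the helper sketch_transition_ok() are made-up names), the three documented sequences could be checked with a small transition function. The crash-restart reset to SB_INITIAL done by ResetShmemAllocator() is deliberately not modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/* State names mirror the patch; the type name is hypothetical. */
typedef enum
{
	SB_INITIAL,
	SB_REQUESTING,
	SB_INITIALIZING,
	SB_ATTACHING,
	SB_LATE_ATTACH_OR_INIT,
	SB_DONE
} SketchShmemState;

/*
 * Hypothetical helper: returns true if 'from' -> 'to' appears in one of the
 * documented sequences:
 *   postmaster:    INITIAL -> REQUESTING -> INITIALIZING -> DONE
 *   EXEC_BACKEND:  INITIAL -> REQUESTING -> ATTACHING -> DONE
 *   late request:  DONE -> REQUESTING -> LATE_ATTACH_OR_INIT -> DONE
 */
static bool
sketch_transition_ok(SketchShmemState from, SketchShmemState to)
{
	switch (from)
	{
		case SB_INITIAL:
			return to == SB_REQUESTING;
		case SB_REQUESTING:
			return to == SB_INITIALIZING ||
				to == SB_ATTACHING ||
				to == SB_LATE_ATTACH_OR_INIT;
		case SB_INITIALIZING:
		case SB_ATTACHING:
		case SB_LATE_ATTACH_OR_INIT:
			return to == SB_DONE;
		case SB_DONE:
			/* only a late request may leave the DONE state */
			return to == SB_REQUESTING;
	}
	return false;
}
```

A table like this is coarser than the Asserts in the patch, which also depend on IsUnderPostmaster, but it shows why ShmemRequestStruct() can insist on SB_REQUESTING: every documented path passes through that state before any area is initialized or attached.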
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
index b0c8d5939a0..48bb5d97c1a 100644
--- a/src/backend/storage/ipc/shmem_hash.c
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -1,11 +1,17 @@
/*-------------------------------------------------------------------------
*
* shmem_hash.c
- * XXX
+ * hash table implementation in shared memory
*
* Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
+ * A shared memory hash table implementation on top of the named, fixed-size
+ * shared memory areas managed by shmem.c. Hash tables have a fixed maximum
+ * size, but their actual size can vary dynamically. When entries are added
+ * to the table, more space is allocated. Each shared data structure and hash
+ * has a string name to identify it, specified in its descriptor when it is
+ * requested.
*
* IDENTIFICATION
* src/backend/storage/ipc/shmem_hash.c
@@ -16,6 +22,85 @@
#include "postgres.h"
#include "storage/shmem.h"
+#include "utils/memutils.h"
+
+/*
+ * ShmemRequestHash -- Request a shared memory hash table.
+ *
+ * Similar to ShmemRequestStruct(), but requests a hash table instead of an
+ * opaque area.
+ */
+void
+ShmemRequestHash(ShmemHashDesc *desc, const ShmemRequestHashOpts *options)
+{
+ ShmemRequestHashOpts *options_copy;
+ int64 dirsize;
+
+ Assert(options->name != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemRequestHashOpts));
+ memcpy(options_copy, options, sizeof(ShmemRequestHashOpts));
+
+ /*
+ * Hash tables allocated in shared memory have a fixed directory; it can't
+ * grow or other backends wouldn't be able to find it. So, make sure we
+ * make it big enough to start with.
+ *
+ * The shared memory allocator must be specified too.
+ */
+ dirsize = hash_select_dirsize(options->max_size);
+ options_copy->hash_info.dsize = dirsize;
+ options_copy->hash_info.max_dsize = dirsize;
+ options_copy->hash_info.alloc = ShmemAllocNoError;
+ options_copy->hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+
+ /*
+ * Create a struct descriptor for the fixed-size area holding the hash
+ * table
+ */
+ options_copy->base.name = options->name;
+ options_copy->base.size = hash_get_shared_size(&options_copy->hash_info,
+ options_copy->hash_flags);
+
+ /* Reserve extra space for the buckets */
+ options_copy->base.extra_size =
+ hash_estimate_size(options->max_size, options_copy->hash_info.entrysize) - options_copy->base.size;
+
+ ShmemRequestInternal(&desc->base, &options_copy->base, SHMEM_KIND_HASH);
+}
+
+void
+shmem_hash_init(ShmemStructDesc *base_desc, const ShmemRequestStructOpts *base_options)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) base_desc;
+ ShmemRequestHashOpts *options = (ShmemRequestHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+
+ options->hash_info.hctl = desc->base.ptr;
+ Assert(options->hash_info.hctl != NULL);
+ desc->ptr = hash_create(desc->base.name, options->init_size, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = desc->ptr;
+}
+
+void
+shmem_hash_attach(ShmemStructDesc *base_desc, const ShmemRequestStructOpts *base_options)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) base_desc;
+ ShmemRequestHashOpts *options = (ShmemRequestHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+
+ /* attach to it rather than allocate and initialize new space */
+ hash_flags |= HASH_ATTACH;
+ options->hash_info.hctl = desc->base.ptr;
+ Assert(options->hash_info.hctl != NULL);
+ desc->ptr = hash_create(desc->base.name, options->init_size, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = desc->ptr;
+}
/*
@@ -41,9 +126,8 @@
* to shared-memory hash tables are added here, except that callers may
* choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestHash() in new code!
*/
HTAB *
ShmemInitHash(const char *name, /* table string name for shmem index */
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 5c47cf13473..9b880a6af65 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -121,6 +121,9 @@ FastPathLockShmemSize(void)
size = add_size(size, mul_size(TotalProcs, (fpLockBitsSize + fpRelIdSize)));
+ Assert(TotalProcs > 0);
+ Assert(size > 0);
+
return size;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index b3563113219..278d2f20376 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4164,7 +4164,14 @@ PostgresSingleUserMain(int argc, char *argv[],
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Before computing the total size needed, give all subsystems, including
+ * add-ins, a chance to adjust their requested shmem sizes.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might've been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
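
To make the ordering above concrete (request callbacks first, total size afterwards), here is a minimal sketch of how accumulated requests could be summed with overflow checking, in the spirit of ShmemGetRequestedSize(). It is not taken from the patch: SketchRequest, sketch_add_size() and sketch_total_size() are hypothetical names, the ShmemIndex estimate is left out, and overflow is reported by return value, whereas PostgreSQL's add_size() raises an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one accumulated shmem request. */
typedef struct
{
	const char *name;
	size_t		size;		/* the named area itself */
	size_t		extra_size;	/* reserved space that is not part of the area */
} SketchRequest;

/* Add two sizes; return false instead of wrapping around on overflow. */
static bool
sketch_add_size(size_t a, size_t b, size_t *result)
{
	if (a > SIZE_MAX - b)
		return false;
	*result = a + b;
	return true;
}

/* Sum 'size' and 'extra_size' over all requests, checking every addition. */
static bool
sketch_total_size(const SketchRequest *reqs, size_t nreqs, size_t *total)
{
	size_t		sum = 0;

	for (size_t i = 0; i < nreqs; i++)
	{
		if (!sketch_add_size(sum, reqs[i].size, &sum) ||
			!sketch_add_size(sum, reqs[i].extra_size, &sum))
			return false;
	}
	*total = sum;
	return true;
}
```

The total is only meaningful once every request callback has run, which is why the request phase has to finish before the segment is sized.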
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 11f5778eba1..b40e14abace 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -396,6 +396,8 @@ hash_create(const char *tabname, int64 nelem, const HASHCTL *info, int flags)
}
/* Initialize the hash header, plus a copy of the table name */
+ Assert(tabname != NULL);
+ Assert(CurrentDynaHashCxt != NULL);
hashp = (HTAB *) MemoryContextAlloc(CurrentDynaHashCxt,
sizeof(HTAB) + strlen(tabname) + 1);
MemSet(hashp, 0, sizeof(HTAB));
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 2a9e9becd26..a07fccfb1ba 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -24,17 +24,212 @@
#include "storage/spin.h"
#include "utils/hsearch.h"
+/* Different kinds of shmem areas. */
+typedef enum
+{
+ SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
+ SHMEM_KIND_HASH, /* a hash table */
+} ShmemAreaKind;
+
+/*
+ * ShmemStructDesc describes a named area or struct in shared memory.
+ *
+ * 'name' and 'size' are required. Initialize any optional fields that you
+ * don't use to zeros.
+ *
+ * After registration, the shmem machinery reserves memory for the area, sets
+ * '*ptr' to point to the allocation, and calls the callbacks at the right
+ * moments.
+ */
+typedef struct ShmemStructDesc
+{
+ /* Name and size of the shared memory area. */
+ const char *name;
+
+ void *ptr;
+ size_t size;
+} ShmemStructDesc;
+
+#define SHMEM_REQUEST_UNKNOWN_SIZE (-1)
+
+typedef struct ShmemRequestStructOpts
+{
+ const char *name;
+
+ ssize_t size;
+
+ /*
+ * Extra space to reserve in the shared memory segment, but it's not part
+ * of the struct itself. This is used for shared memory hash tables that
+ * can grow beyond the initial size when more buckets are allocated.
+ */
+ size_t extra_size;
+
+ /*
+ * When the shmem area is initialized or attached to, pointer to it is
+ * stored in *ptr. It usually points to a global variable, used to access
+ * the shared memory area later. *ptr is set before the init_fn or
+ * attach_fn callback is called.
+ */
+ void **ptr;
+} ShmemRequestStructOpts;
+
+typedef struct ShmemHashDesc
+{
+ ShmemStructDesc base;
+
+ /*
+ * When the hash table is initialized or attached to, pointer to its
+ * backend-private handle is stored in *ptr. It usually points to a
+ * global variable, used to access the hash table later.
+ */
+ HTAB *ptr;
+} ShmemHashDesc;
+
+/*
+ * Descriptor for a named shared memory hash table.
+ *
+ * Similar to ShmemStructDesc, but describes a shared memory hash table. Each
+ * hash table is backed by an allocated area, described by 'base_desc', but if
+ * 'max_size' is greater than 'init_size', it can also grow beyond the initial
+ * allocated area by allocating more hash entries from the global unreserved
+ * space.
+ */
+typedef struct ShmemRequestHashOpts
+{
+ ShmemRequestStructOpts base;
+
+ /*
+ * Name of the shared memory area. Required. Must be unique across the
+ * system.
+ */
+ const char *name;
+
+ /*
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ */
+ size_t max_size;
+
+ /*
+ * init_size is the number of hashtable entries to preallocate. For a
+ * table whose maximum size is certain, this should be equal to max_size;
+ * that ensures that no run-time out-of-shared-memory failures can occur.
+ */
+ size_t init_size;
+
+ /*
+ * Hash table options passed to hash_create()
+ *
+ * hash_info and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values
+ * specific to shared-memory hash tables are added implicitly in
+ * ShmemRequestHash(), except that callers may choose to specify
+ * HASH_PARTITION and/or HASH_FIXED_SIZE.
+ */
+ HASHCTL hash_info;
+ int hash_flags;
+
+ /*
+ * When the hash table is initialized or attached to, pointer to its
+ * backend-private handle is stored in *ptr. It usually points to a
+ * global variable, used to access the hash table later.
+ */
+ HTAB **ptr;
+} ShmemRequestHashOpts;
+
+
+typedef void (*ShmemRequestCallback) (void *arg);
+typedef void (*ShmemInitCallback) (void *arg);
+typedef void (*ShmemAttachCallback) (void *arg);
+
+/*
+ * Shared memory is reserved and allocated in stages at postmaster startup,
+ * and in EXEC_BACKEND mode, there's some extra work done to "attach" to them
+ * at backend startup. ShmemCallbacks holds callback functions that are
+ * called at different stages.
+ */
+typedef struct ShmemCallbacks
+{
+ /* SHMEM_* flags */
+ int flags;
+
+ /*
+ * 'request_fn' is called during postmaster startup, before the shared
+ * memory has been allocated. The function should call
+ * ShmemRequestStruct() and ShmemRequestHash() to register the subsystem's
+ * shared memory needs.
+ */
+ ShmemRequestCallback request_fn;
+ void *request_fn_arg;
+
+ /*
+ * Initialization callback function. This is called when the shared
+ * memory area is allocated, usually at postmaster startup.
+ */
+ ShmemInitCallback init_fn;
+ void *init_fn_arg;
+
+ /*
+ * Attachment callback function. In EXEC_BACKEND mode, this is called at
+ * startup of each backend. In !EXEC_BACKEND mode, this is only called if
+ * the shared memory area is registered after postmaster startup (see
+ * SHMEM_ALLOW_AFTER_STARTUP).
+ */
+ ShmemAttachCallback attach_fn;
+ void *attach_fn_arg;
+} ShmemCallbacks;
+
+/*
+ * Allow these shared memory allocations after postmaster startup. Normally,
+ * RegisterShmemCallbacks() errors out if it's called after postmaster startup,
+ * e.g. in an add-in library loaded on demand in a backend. If you set this
+ * flag, RegisterShmemCallbacks() will instead immediately call the callbacks,
+ * to initialize or attach to the requested shared memory areas.
+ *
+ * This is not used by any built-in subsystems, but extensions can find it
+ * useful.
+ */
+#define SHMEM_ALLOW_AFTER_STARTUP 0x00000001
/* shmem.c */
typedef struct PGShmemHeader PGShmemHeader; /* avoid including
* storage/pg_shmem.h here */
+extern void ResetShmemAllocator(void);
extern void InitShmemAllocator(PGShmemHeader *seghdr);
+#ifdef EXEC_BACKEND
+extern void AttachShmemAllocator(PGShmemHeader *seghdr);
+#endif
extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
extern bool ShmemAddrIsValid(const void *addr);
+
+extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
+
+extern void ShmemRequestInternal(ShmemStructDesc *desc, ShmemRequestStructOpts *options,
+ ShmemAreaKind kind);
+
+extern void ShmemRequestStruct(ShmemStructDesc *desc, const ShmemRequestStructOpts *options);
+extern void ShmemRequestHash(ShmemHashDesc *desc, const ShmemRequestHashOpts *options);
+
+extern void ShmemCallRequestCallbacks(void);
+
+/* legacy shmem allocation functions */
extern HTAB *ShmemInitHash(const char *name, int64 init_size, int64 max_size,
HASHCTL *infoP, int hash_flags);
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
+
+extern size_t ShmemGetRequestedSize(void);
+extern void ShmemInitRequested(void);
+#ifdef EXEC_BACKEND
+extern void ShmemAttachRequested(void);
+#endif
+
+extern void shmem_hash_init(ShmemStructDesc *base_desc, const ShmemRequestStructOpts *options);
+extern void shmem_hash_attach(ShmemStructDesc *base_desc, const ShmemRequestStructOpts *options);
+
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
@@ -43,19 +238,4 @@ extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
-/* size constants for the shmem index table */
- /* max size of data structure string name */
-#define SHMEM_INDEX_KEYSIZE (48)
- /* estimated size of the shmem index table (not a hard limit) */
-#define SHMEM_INDEX_SIZE (64)
-
-/* this is a hash bucket in the shmem index table */
-typedef struct
-{
- char key[SHMEM_INDEX_KEYSIZE]; /* string name */
- void *location; /* location in shared mem */
- Size size; /* # bytes requested for the structure */
- Size allocated_size; /* # bytes actually allocated */
-} ShmemIndexEnt;
-
#endif /* SHMEM_H */
diff --git a/src/test/modules/test_aio/test_aio.c b/src/test/modules/test_aio/test_aio.c
index b1aa8af9ec0..d687408af0c 100644
--- a/src/test/modules/test_aio/test_aio.c
+++ b/src/test/modules/test_aio/test_aio.c
@@ -50,7 +50,6 @@ static InjIoErrorState *inj_io_error_state;
static shmem_request_hook_type prev_shmem_request_hook = NULL;
static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
-
static PgAioHandle *last_handle;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 712d84128ca..d8d2548ef2d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2848,9 +2848,16 @@ SharedTypmodTableEntry
Sharedsort
ShellTypeInfo
ShippableCacheEntry
-ShmemAllocatorData
ShippableCacheKey
+ShmemAllocatorData
+ShmemAreaKind
+ShmemCallbacks
ShmemIndexEnt
+ShmemHashDesc
+ShmemRequest
+ShmemRequestHashOpts
+ShmemRequestStructOpts
+ShmemStructDesc
ShutdownForeignScan_function
ShutdownInformation
ShutdownMode
--
2.47.3
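The staged flow that shmem.h describes (request callbacks record needs, the total is sized, one segment is allocated, each registered pointer is set, and only then the init callback runs) can be sketched as a standalone toy model. This is only an illustration under stated assumptions: every `toy_*` and `Toy*` name below is hypothetical, `calloc()` stands in for the shared memory segment, and the 8-byte alignment is an arbitrary choice rather than what the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy model: none of these toy_* names exist in PostgreSQL. */
#define TOY_MAX_REQUESTS 8
#define TOY_ALIGN(sz) (((size_t) (sz) + 7) & ~(size_t) 7)

typedef struct ToyRequest
{
    const char *name;           /* name of the requested area */
    size_t      size;           /* bytes requested */
    void      **ptr;            /* where the area's address gets stored */
} ToyRequest;

static ToyRequest toy_requests[TOY_MAX_REQUESTS];
static int  toy_nrequests = 0;

/* Stage 1: record a need. Nothing is allocated yet. */
static void
toy_request_struct(const char *name, size_t size, void **ptr)
{
    assert(toy_nrequests < TOY_MAX_REQUESTS);
    toy_requests[toy_nrequests].name = name;
    toy_requests[toy_nrequests].size = size;
    toy_requests[toy_nrequests].ptr = ptr;
    toy_nrequests++;
}

/* Stage 2: total size of everything requested so far. */
static size_t
toy_requested_size(void)
{
    size_t      total = 0;

    for (int i = 0; i < toy_nrequests; i++)
        total += TOY_ALIGN(toy_requests[i].size);
    return total;
}

/*
 * Stage 3: allocate one zeroed segment, carve it up, set every registered
 * pointer, and only then call the init callback.  Returns the segment, or
 * NULL if the allocation failed.
 */
static void *
toy_init_requested(void (*init_fn) (void *arg), void *arg)
{
    size_t      total = toy_requested_size();
    char       *segment = calloc(1, total > 0 ? total : 1);
    size_t      offset = 0;

    if (segment == NULL)
        return NULL;
    for (int i = 0; i < toy_nrequests; i++)
    {
        *toy_requests[i].ptr = segment + offset;
        offset += TOY_ALIGN(toy_requests[i].size);
    }
    if (init_fn != NULL)
        init_fn(arg);
    return segment;
}

/* Example subsystem with one small area. */
typedef struct ToyCounter
{
    int         value;
} ToyCounter;

static void *toy_counter_area = NULL;

/* Runs after toy_counter_area has been set, mirroring the *ptr contract. */
static void
toy_counter_init(void *arg)
{
    (void) arg;
    ((ToyCounter *) toy_counter_area)->value = 42;
}
```

A caller would invoke `toy_request_struct("toy counter", sizeof(ToyCounter), &toy_counter_area)` first and `toy_init_requested(toy_counter_init, NULL)` afterwards. The point of the ordering is that the init callback can rely on its pointer being valid, which is the same guarantee the comment on `ShmemRequestStructOpts.ptr` gives.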
[text/x-patch] v8-0007-Add-test-module-to-test-after-startup-shmem-alloc.patch (10.0K, 8-v8-0007-Add-test-module-to-test-after-startup-shmem-alloc.patch)
From 351c752b3657c64cab938bf4f8a1fc3b908fcda9 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 19:04:00 +0200
Subject: [PATCH v8 07/16] Add test module to test after-startup shmem
allocations
None of the existing modules could make use of the lazy shmem
allocation after postmaster startup:
- pg_stat_statements needs to load and dump stats file on startup and
shutdown, which doesn't really work if the library is not loaded into
postmaster
- test_aio registers injection points, which reference the library
itself, which creates a weird initialization loop if you try to do
that directly from _PG_init() in a backend. The initialization
really needs to happen after _PG_init()
- injection_points would be a candidate, but it already knows to use
DSM when it's not loaded from shared_preload_libraries.
---
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_shmem/Makefile | 24 ++++
src/test/modules/test_shmem/meson.build | 33 ++++++
.../test_shmem/t/001_late_shmem_alloc.pl | 43 +++++++
.../modules/test_shmem/test_shmem--1.0.sql | 9 ++
src/test/modules/test_shmem/test_shmem.c | 107 ++++++++++++++++++
.../modules/test_shmem/test_shmem.control | 3 +
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 222 insertions(+)
create mode 100644 src/test/modules/test_shmem/Makefile
create mode 100644 src/test/modules/test_shmem/meson.build
create mode 100644 src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
create mode 100644 src/test/modules/test_shmem/test_shmem--1.0.sql
create mode 100644 src/test/modules/test_shmem/test_shmem.c
create mode 100644 src/test/modules/test_shmem/test_shmem.control
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 28ce3b35eda..62fab9f3c2f 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -47,6 +47,7 @@ SUBDIRS = \
test_resowner \
test_rls_hooks \
test_saslprep \
+ test_shmem \
test_shm_mq \
test_slru \
test_tidstore \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 3ac291656c1..6799ba11e11 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -48,6 +48,7 @@ subdir('test_regex')
subdir('test_resowner')
subdir('test_rls_hooks')
subdir('test_saslprep')
+subdir('test_shmem')
subdir('test_shm_mq')
subdir('test_slru')
subdir('test_tidstore')
diff --git a/src/test/modules/test_shmem/Makefile b/src/test/modules/test_shmem/Makefile
new file mode 100644
index 00000000000..2407f7462fe
--- /dev/null
+++ b/src/test/modules/test_shmem/Makefile
@@ -0,0 +1,24 @@
+# src/test/modules/test_shmem/Makefile
+
+PGFILEDESC = "test_shmem - test code for shmem allocations"
+
+MODULE_big = test_shmem
+OBJS = \
+ $(WIN32RES) \
+ test_shmem.o
+
+EXTENSION = test_shmem
+DATA = test_shmem--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_shmem/meson.build b/src/test/modules/test_shmem/meson.build
new file mode 100644
index 00000000000..fb4bf328b8f
--- /dev/null
+++ b/src/test/modules/test_shmem/meson.build
@@ -0,0 +1,33 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+test_shmem_sources = files(
+ 'test_shmem.c',
+)
+
+if host_system == 'windows'
+ test_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_shmem',
+ '--FILEDESC', 'test_shmem - test code for shmem allocations',])
+endif
+
+test_shmem = shared_module('test_shmem',
+ test_shmem_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_shmem
+
+test_install_data += files(
+ 'test_shmem.control',
+ 'test_shmem--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_shmem',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_late_shmem_alloc.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
new file mode 100644
index 00000000000..84ec841b542
--- /dev/null
+++ b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
@@ -0,0 +1,43 @@
+# Copyright (c) 2025-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+###
+# Test allocating memory after startup, i.e. when the library is not
+# in shared_preload_libraries
+###
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+
+$node->safe_psql("postgres", "CREATE EXTENSION test_shmem;");
+
+# Check that the attach counter is incremented on a new connection
+my $attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+my $attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+cmp_ok($attach_count2, '>', $attach_count1, "attach callback is called in each backend");
+$node->stop;
+
+###
+# Test that loading via shared_preload_libraries also works
+###
+$node->append_conf('postgresql.conf', "shared_preload_libraries = 'test_shmem'");
+$node->start;
+
+# When loaded via shared_preload_libraries, the attach callback is
+# called or not, depending on whether this is an EXEC_BACKEND build.
+$attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+$attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+
+ok($attach_count1 == 0 && $attach_count2 == 0 ||
+ $attach_count2 >= $attach_count1,
+ "loaded via shared_preload_libraries");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_shmem/test_shmem--1.0.sql b/src/test/modules/test_shmem/test_shmem--1.0.sql
new file mode 100644
index 00000000000..2d01fd9256c
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem--1.0.sql
@@ -0,0 +1,9 @@
+/* src/test/modules/test_shmem/test_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_shmem" to load this file. \quit
+
+
+CREATE FUNCTION get_test_shmem_attach_count()
+RETURNS pg_catalog.int4 STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
new file mode 100644
index 00000000000..1d7f31b37c7
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -0,0 +1,107 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_shmem.c
+ * Helpers to test shmem allocation routines
+ *
+ * XXX This module provides interface functions for C functionality to SQL, to
+ * make it possible to test shared memory allocation behavior in a targeted way
+ * from SQL. It'd not generally be safe to export these functions to SQL, but
+ * for a test that's fine.
+ *
+ * Copyright (c) 2020-2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_shmem/test_shmem.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/relation.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+
+
+PG_MODULE_MAGIC;
+
+typedef struct TestShmemData
+{
+ int value;
+ bool initialized;
+ int attach_count;
+} TestShmemData;
+
+static TestShmemData *TestShmem;
+
+static bool attached_or_initialized = false;
+
+static void test_shmem_request(void *arg);
+static void test_shmem_init(void *arg);
+static void test_shmem_attach(void *arg);
+
+static const ShmemCallbacks TestShmemCallbacks = {
+ .flags = SHMEM_ALLOW_AFTER_STARTUP,
+ .request_fn = test_shmem_request,
+ .init_fn = test_shmem_init,
+ .attach_fn = test_shmem_attach,
+};
+
+static void
+test_shmem_request(void *arg)
+{
+ static ShmemStructDesc TestShmemDesc;
+
+ elog(LOG, "test_shmem_request callback called");
+
+ ShmemRequestStruct(&TestShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "test_shmem area",
+ .size = sizeof(TestShmemData),
+ .ptr = (void **) &TestShmem,
+ });
+}
+
+static void
+test_shmem_init(void *arg)
+{
+ elog(LOG, "init callback called");
+ if (TestShmem->initialized)
+ elog(ERROR, "shmem area already initialized");
+ TestShmem->initialized = true;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+static void
+test_shmem_attach(void *arg)
+{
+ elog(LOG, "test_shmem_attach callback called");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ TestShmem->attach_count++;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+void
+_PG_init(void)
+{
+ elog(LOG, "test_shmem module's _PG_init called");
+ RegisterShmemCallbacks(&TestShmemCallbacks);
+}
+
+PG_FUNCTION_INFO_V1(get_test_shmem_attach_count);
+Datum
+get_test_shmem_attach_count(PG_FUNCTION_ARGS)
+{
+ if (!attached_or_initialized)
+ elog(ERROR, "shmem area not attached or initialized in this process");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ PG_RETURN_INT32(TestShmem->attach_count);
+}
diff --git a/src/test/modules/test_shmem/test_shmem.control b/src/test/modules/test_shmem/test_shmem.control
new file mode 100644
index 00000000000..f2f26f4537a
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.control
@@ -0,0 +1,3 @@
+comment = 'Test code for shmem allocations'
+default_version = '1.0'
+module_pathname = '$libdir/test_shmem'
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index d8d2548ef2d..139cb6f9da5 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3129,6 +3129,7 @@ TestDSMRegistryHashEntry
TestDSMRegistryStruct
TestDecodingData
TestDecodingTxnData
+TestShmemData
TestSpec
TestValueType
TextFreq
--
2.47.3
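The behavior the test module checks (the first backend to load the library initializes the area, every later backend attaches to it, and the attach counter grows with each new connection) can be modelled in a few lines of standalone C. This is a sketch under assumptions, not the patch's implementation: the `toy_*` names are hypothetical, a static array stands in for shared memory, and the lookup by name is an assumed mechanism.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy model of "initialize or attach"; not PostgreSQL code. */
#define TOY_INDEX_SLOTS 4

typedef struct ToyArea
{
    bool        in_use;
    char        name[48];
    int         attach_count;   /* stands in for TestShmemData.attach_count */
} ToyArea;

/* Stands in for shared memory: it outlives each simulated backend. */
static ToyArea toy_shared_index[TOY_INDEX_SLOTS];

typedef enum
{
    TOY_INITIALIZED,            /* area was created, init path ran */
    TOY_ATTACHED,               /* area already existed, attach path ran */
    TOY_NO_SPACE                /* no free slot left */
} ToyResult;

/*
 * Late registration: look the area up by name.  If it exists, run the
 * attach step; otherwise claim a free slot and run the init step.  Either
 * way, *local_ptr is set to the area for the calling backend.
 */
static ToyResult
toy_register_late(const char *name, ToyArea **local_ptr)
{
    ToyArea    *free_slot = NULL;

    for (size_t i = 0; i < TOY_INDEX_SLOTS; i++)
    {
        ToyArea    *area = &toy_shared_index[i];

        if (area->in_use && strcmp(area->name, name) == 0)
        {
            area->attach_count++;   /* attach step */
            *local_ptr = area;
            return TOY_ATTACHED;
        }
        if (!area->in_use && free_slot == NULL)
            free_slot = area;
    }

    if (free_slot == NULL)
        return TOY_NO_SPACE;

    free_slot->in_use = true;
    snprintf(free_slot->name, sizeof(free_slot->name), "%s", name);
    free_slot->attach_count = 0;    /* init step */
    *local_ptr = free_slot;
    return TOY_INITIALIZED;
}
```

Calling `toy_register_late()` three times with the same name simulates three backends: the first call takes the init path and the next two take the attach path, so the counter only ever increases, which is what `001_late_shmem_alloc.pl` asserts with its `>` comparison.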
[text/x-patch] v8-0008-Convert-pg_stat_statements-to-use-the-new-interfa.patch (11.2K, 9-v8-0008-Convert-pg_stat_statements-to-use-the-new-interfa.patch)
From ac38579f317287bb0f603c3de08081d6099de66c Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 13 Mar 2026 23:00:00 +0200
Subject: [PATCH v8 08/16] Convert pg_stat_statements to use the new interface
As part of this, embed the LWLock it needs in the shared memory struct
itself, so that we don't need to use RequestNamedLWLockTranche()
anymore. LWLockNewTrancheId+LWLockInitialize is more convenient to use
in extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
.../pg_stat_statements/pg_stat_statements.c | 177 ++++++++----------
1 file changed, 79 insertions(+), 98 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 6cb14824ec3..d9b1b8b4fe9 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -249,7 +249,7 @@ typedef struct pgssEntry
*/
typedef struct pgssSharedState
{
- LWLock *lock; /* protects hashtable search/modification */
+ LWLockPadded lock; /* protects hashtable search/modification */
double cur_median_usage; /* current median usage in hashtable */
Size mean_query_len; /* current mean entry text length */
slock_t mutex; /* protects following fields only: */
@@ -259,14 +259,24 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+/* Links to shared memory state */
+static pgssSharedState *pgss;
+static HTAB *pgss_hash;
+
+static void pgss_shmem_request(void *arg);
+static void pgss_shmem_init(void *arg);
+
+static const ShmemCallbacks pgss_shmem_callbacks = {
+ .request_fn = pgss_shmem_request,
+ .init_fn = pgss_shmem_init,
+};
+
/*---- Local variables ----*/
/* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
static int nesting_level = 0;
/* Saved hook values */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
static planner_hook_type prev_planner_hook = NULL;
static ExecutorStart_hook_type prev_ExecutorStart = NULL;
@@ -275,10 +285,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static ProcessUtility_hook_type prev_ProcessUtility = NULL;
-/* Links to shared memory state */
-static pgssSharedState *pgss = NULL;
-static HTAB *pgss_hash = NULL;
-
/*---- GUC variables ----*/
typedef enum
@@ -331,8 +337,6 @@ PG_FUNCTION_INFO_V1(pg_stat_statements_1_13);
PG_FUNCTION_INFO_V1(pg_stat_statements);
PG_FUNCTION_INFO_V1(pg_stat_statements_info);
-static void pgss_shmem_request(void);
-static void pgss_shmem_startup(void);
static void pgss_shmem_shutdown(int code, Datum arg);
static void pgss_post_parse_analyze(ParseState *pstate, Query *query,
JumbleState *jstate);
@@ -366,7 +370,6 @@ static void pgss_store(const char *query, int64 queryId,
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
bool showtext);
-static Size pgss_memsize(void);
static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
int encoding, bool sticky);
static void entry_dealloc(void);
@@ -471,13 +474,14 @@ _PG_init(void)
MarkGUCPrefixReserved("pg_stat_statements");
+ /*
+ * Register our shared memory needs.
+ */
+ RegisterShmemCallbacks(&pgss_shmem_callbacks);
+
/*
* Install hooks.
*/
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = pgss_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = pgss_shmem_startup;
prev_post_parse_analyze_hook = post_parse_analyze_hook;
post_parse_analyze_hook = pgss_post_parse_analyze;
prev_planner_hook = planner_hook;
@@ -495,30 +499,48 @@ _PG_init(void)
}
/*
- * shmem_request hook: request additional shared resources. We'll allocate or
- * attach to the shared resources in pgss_shmem_startup().
+ * shmem request callback: Request shared memory resources.
+ *
+ * This is called at postmaster startup. Note that the shared memory isn't
+ * allocated here yet; this merely registers our needs.
+ *
+ * In EXEC_BACKEND mode, this is also called in each backend, to re-attach to
+ * the shared memory area that was already initialized.
*/
static void
-pgss_shmem_request(void)
+pgss_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- RequestAddinShmemSpace(pgss_memsize());
- RequestNamedLWLockTranche("pg_stat_statements", 1);
+ static ShmemHashDesc pgssSharedHashDesc;
+ static ShmemStructDesc pgssSharedStateDesc;
+
+ ShmemRequestHash(&pgssSharedHashDesc, &(ShmemRequestHashOpts) {
+ .name = "pg_stat_statements hash",
+ .ptr = &pgss_hash,
+ .init_size = pgss_max,
+ .max_size = pgss_max,
+ .hash_info.keysize = sizeof(pgssHashKey),
+ .hash_info.entrysize = sizeof(pgssEntry),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ });
+ ShmemRequestStruct(&pgssSharedStateDesc, &(ShmemRequestStructOpts) {
+ .name = "pg_stat_statements",
+ .size = sizeof(pgssSharedState),
+ .ptr = (void **) &pgss,
+ });
}
/*
- * shmem_startup hook: allocate or attach to shared memory,
- * then load any pre-existing statistics from file.
- * Also create and load the query-texts file, which is expected to exist
- * (even if empty) while the module is enabled.
+ * shmem init callback: Initialize our shared memory data structures at
+ * postmaster startup.
+ *
+ * Load any pre-existing statistics from file. Also create and load the
+ * query-texts file, which is expected to exist (even if empty) while the
+ * module is enabled.
*/
static void
-pgss_shmem_startup(void)
+pgss_shmem_init(void *arg)
{
- bool found;
- HASHCTL info;
+ int tranche_id;
FILE *file = NULL;
FILE *qfile = NULL;
uint32 header;
@@ -528,59 +550,32 @@ pgss_shmem_startup(void)
int buffer_size;
char *buffer = NULL;
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* reset in case this is a restart within the postmaster */
- pgss = NULL;
- pgss_hash = NULL;
-
/*
- * Create or attach to the shared memory state, including hash table
+ * Initialize the shmem area with no statistics.
*/
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- pgss = ShmemInitStruct("pg_stat_statements",
- sizeof(pgssSharedState),
- &found);
-
- if (!found)
- {
- /* First time through ... */
- pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
- pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
- SpinLockInit(&pgss->mutex);
- pgss->extent = 0;
- pgss->n_writers = 0;
- pgss->gc_count = 0;
- pgss->stats.dealloc = 0;
- pgss->stats.stats_reset = GetCurrentTimestamp();
- }
-
- info.keysize = sizeof(pgssHashKey);
- info.entrysize = sizeof(pgssEntry);
- pgss_hash = ShmemInitHash("pg_stat_statements hash",
- pgss_max, pgss_max,
- &info,
- HASH_ELEM | HASH_BLOBS);
+ tranche_id = LWLockNewTrancheId("pg_stat_statements");
+ LWLockInitialize(&pgss->lock.lock, tranche_id);
+ pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
+ pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ SpinLockInit(&pgss->mutex);
+ pgss->extent = 0;
+ pgss->n_writers = 0;
+ pgss->gc_count = 0;
+ pgss->stats.dealloc = 0;
+ pgss->stats.stats_reset = GetCurrentTimestamp();
- LWLockRelease(AddinShmemInitLock);
+ /* The hash table must be initialized already */
+ Assert(pgss_hash != NULL);
/*
- * If we're in the postmaster (or a standalone backend...), set up a shmem
- * exit hook to dump the statistics to disk.
+ * Set up a shmem exit hook to dump the statistics to disk on postmaster
+ * (or standalone backend) exit.
*/
- if (!IsUnderPostmaster)
- on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
-
- /*
- * Done if some other process already completed our initialization.
- */
- if (found)
- return;
+ on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
/*
+ * Load any pre-existing statistics from file.
+ *
* Note: we don't bother with locks here, because there should be no other
* processes running when this code is reached.
*/
@@ -1338,7 +1333,7 @@ pgss_store(const char *query, int64 queryId,
key.toplevel = (nesting_level == 0);
/* Lookup the hash table entry with shared lock. */
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
@@ -1359,11 +1354,11 @@ pgss_store(const char *query, int64 queryId,
*/
if (jstate)
{
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
norm_query = generate_normalized_query(jstate, query,
query_location,
&query_len);
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
}
/* Append new query text to file with only shared lock held */
@@ -1378,8 +1373,8 @@ pgss_store(const char *query, int64 queryId,
do_gc = need_gc_qtexts();
/* Need exclusive lock to make a new hashtable entry - promote */
- LWLockRelease(pgss->lock);
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockRelease(&pgss->lock.lock);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
/*
* A garbage collection may have occurred while we weren't holding the
@@ -1518,7 +1513,7 @@ pgss_store(const char *query, int64 queryId,
}
done:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
/* We postpone this clean-up until we're out of the lock */
if (norm_query)
@@ -1807,7 +1802,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
* we need to partition the hash table to limit the time spent holding any
* one lock.
*/
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
if (showtext)
{
@@ -2044,7 +2039,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
}
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
free(qbuffer);
}
@@ -2083,20 +2078,6 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
-/*
- * Estimate shared memory space needed.
- */
-static Size
-pgss_memsize(void)
-{
- Size size;
-
- size = MAXALIGN(sizeof(pgssSharedState));
- size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
-
- return size;
-}
-
/*
* Allocate a new hashtable entry.
* caller must hold an exclusive lock on pgss->lock
@@ -2726,7 +2707,7 @@ entry_reset(Oid userid, Oid dbid, int64 queryid, bool minmax_only)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("pg_stat_statements must be loaded via \"shared_preload_libraries\"")));
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
num_entries = hash_get_num_entries(pgss_hash);
stats_reset = GetCurrentTimestamp();
@@ -2820,7 +2801,7 @@ done:
record_gc_qtexts();
release_lock:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
return stats_reset;
}
--
2.47.3
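The switch from `pgss->lock` to `&pgss->lock.lock` follows from embedding the lock in the shared struct instead of pointing at one allocated elsewhere: the padded type is a union of the lock and a padding array, so the lock itself is reached through its `.lock` member. A minimal standalone sketch of that layout follows. The `Toy*` types are hypothetical stand-ins, and the 64-byte pad size is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the pad size is an assumption for illustration. */
#define TOY_CACHE_LINE_SIZE 64

typedef struct ToyLock
{
    int         tranche_id;     /* which named group the lock belongs to */
    int         state;          /* 0 = unlocked in this toy model */
} ToyLock;

/* Padding the lock to a full cache line keeps neighbours off its line. */
typedef union ToyLockPadded
{
    ToyLock     lock;
    char        pad[TOY_CACHE_LINE_SIZE];
} ToyLockPadded;

typedef struct ToySharedState
{
    ToyLockPadded lock;         /* embedded, not a pointer */
    double      cur_median_usage;
} ToySharedState;

static void
toy_lock_initialize(ToyLock *lock, int tranche_id)
{
    lock->tranche_id = tranche_id;
    lock->state = 0;
}

/* Mirrors the shape of the init callback: set up the lock, then the rest. */
static void
toy_shared_state_init(ToySharedState *state, int tranche_id)
{
    /* The first .lock is the padded member, the second is the lock in it. */
    toy_lock_initialize(&state->lock.lock, tranche_id);
    state->cur_median_usage = 0.0;
}
```

With the lock embedded, the struct's own `size` covers it, so no separate named-tranche request is needed at request time; only a tranche id has to be obtained when the area is initialized.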
[text/x-patch] v8-0009-Introduce-registry-of-built-in-subsystems.patch (6.1K, 10-v8-0009-Introduce-registry-of-built-in-subsystems.patch)
From 77dbf3df8c56cc760fa2d4661f35413b39b8ddb8 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 13:42:11 +0200
Subject: [PATCH v8 09/16] Introduce registry of built-in subsystems
To add a new built-in subsystem, add it to subsystemlist.h. That
hooks up its callbacks so that they get called at the right times
during postmaster startup. For now this is unused, but will replace
the current SubsystemShmemSize() and SubsystemShmemInit() calls in
the next commits.
---
src/backend/bootstrap/bootstrap.c | 2 ++
src/backend/postmaster/postmaster.c | 5 +++++
src/backend/storage/ipc/ipci.c | 19 ++++++++++++++++++
src/backend/tcop/postgres.c | 3 +++
src/include/storage/ipc.h | 1 +
src/include/storage/subsystemlist.h | 23 ++++++++++++++++++++++
src/include/storage/subsystems.h | 30 +++++++++++++++++++++++++++++
7 files changed, 83 insertions(+)
create mode 100644 src/include/storage/subsystemlist.h
create mode 100644 src/include/storage/subsystems.h
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 86fe86354f5..69e90fef4f9 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -359,6 +359,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
SetProcessingMode(BootstrapProcessing);
IgnoreSystemIndexes = true;
+ RegisterBuiltinShmemCallbacks();
+
InitializeMaxBackends();
/*
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index e81ef248bf1..ae9e1a94b03 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -929,6 +929,11 @@ PostmasterMain(int argc, char *argv[])
*/
ApplyLauncherRegister();
+ /*
+ * Register the shared memory needs of all core subsystems.
+ */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 493ddd7f12f..a2ef290b5b3 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -50,6 +50,7 @@
#include "storage/procarray.h"
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
#include "utils/wait_event.h"
@@ -253,6 +254,24 @@ CreateSharedMemoryAndSemaphores(void)
shmem_startup_hook();
}
+/*
+ * Early initialization of various subsystems, giving them a chance to
+ * register their shared memory needs before the shared memory segment is
+ * allocated.
+ */
+void
+RegisterBuiltinShmemCallbacks(void)
+{
+ const ShmemCallbacks *builtin_subsystems[] = {
+#define PG_SHMEM_SUBSYSTEM(subsystem_callbacks) &subsystem_callbacks,
+#include "storage/subsystemlist.h"
+#undef PG_SHMEM_SUBSYSTEM
+ };
+
+ for (int i = 0; i < lengthof(builtin_subsystems); i++)
+ RegisterShmemCallbacks(builtin_subsystems[i]);
+}
+
/*
* Initialize various subsystems, setting up their data structures in
* shared memory.
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 278d2f20376..8320e478d08 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4146,6 +4146,9 @@ PostgresSingleUserMain(int argc, char *argv[],
/* read control file (error checking and contains config ) */
LocalProcessControlFile(false);
+ /* Register the shared memory needs of all core subsystems. */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index da32787ab51..b205b00e7a1 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
/* ipci.c */
extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
+extern void RegisterBuiltinShmemCallbacks(void);
extern Size CalculateShmemSize(void);
extern void CreateSharedMemoryAndSemaphores(void);
#ifdef EXEC_BACKEND
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
new file mode 100644
index 00000000000..ed43c90bcc3
--- /dev/null
+++ b/src/include/storage/subsystemlist.h
@@ -0,0 +1,23 @@
+/*---------------------------------------------------------------------------
+ * subsystemlist.h
+ *
+ * List of initialization callbacks of built-in subsystems. This is kept in
+ * its own source file for possible use by automatic tools.
+ * PG_SHMEM_SUBSYSTEM is defined in the callers depending on how the list is
+ * used.
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystemlist.h
+ *---------------------------------------------------------------------------
+ */
+
+/* there is deliberately not an #ifndef SUBSYSTEMLIST_H here */
+
+/*
+ * Note: there are some inter-dependencies between these, so the order of some
+ * of these matter.
+ */
+
+/* TODO: empty for now */
diff --git a/src/include/storage/subsystems.h b/src/include/storage/subsystems.h
new file mode 100644
index 00000000000..38b735bec67
--- /dev/null
+++ b/src/include/storage/subsystems.h
@@ -0,0 +1,30 @@
+/*-------------------------------------------------------------------------
+ *
+ * subsystems.h
+ * Provide extern declarations for all the built-in subsystem callbacks
+ *
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystems.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SUBSYSTEMS_H
+#define SUBSYSTEMS_H
+
+#include "storage/shmem.h"
+
+/*
+ * Extern declarations of all the built-in subsystem callbacks
+ *
+ * The actual list is in subsystemlist.h, so that the same list can be used
+ * for other purposes.
+ */
+#define PG_SHMEM_SUBSYSTEM(callbacks) \
+ extern const ShmemCallbacks callbacks;
+#include "storage/subsystemlist.h"
+#undef PG_SHMEM_SUBSYSTEM
+
+#endif /* SUBSYSTEMS_H */
--
2.47.3
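
The subsystemlist.h / subsystems.h pair in the patch above is an instance of the X-macro pattern: one list, expanded several times with different definitions of PG_SHMEM_SUBSYSTEM. A minimal standalone sketch of that pattern follows. It assumes a list macro (DEMO_SUBSYSTEM_LIST) instead of a re-included header so that it fits in one file, and a toy DemoEntry struct instead of ShmemCallbacks; all Demo*/demo_* names are hypothetical and do not exist in PostgreSQL.

```c
#include <stddef.h>

/* Toy stand-in for ShmemCallbacks; hypothetical, not the PostgreSQL type. */
typedef struct DemoEntry
{
	const char *name;
} DemoEntry;

/*
 * The single source of truth.  The patch keeps its list in a header that is
 * included repeatedly; a list macro is used here only to stay in one file.
 */
#define DEMO_SUBSYSTEM_LIST(X) \
	X(alpha_callbacks) \
	X(beta_callbacks)

/* Expansion 1: extern declarations, the role subsystems.h plays. */
#define DEMO_DECLARE(cb) extern const DemoEntry cb;
DEMO_SUBSYSTEM_LIST(DEMO_DECLARE)
#undef DEMO_DECLARE

/* Definitions that would normally live in each subsystem's own source file. */
const DemoEntry alpha_callbacks = {.name = "alpha"};
const DemoEntry beta_callbacks = {.name = "beta"};

/* Expansion 2: an array of pointers, in list order, for the registration loop. */
#define DEMO_ADDRESS(cb) &cb,
static const DemoEntry *const demo_subsystems[] = {
	DEMO_SUBSYSTEM_LIST(DEMO_ADDRESS)
};
#undef DEMO_ADDRESS

/* Number of entries produced by the list. */
static size_t
demo_subsystem_count(void)
{
	return sizeof(demo_subsystems) / sizeof(demo_subsystems[0]);
}

/* Name of the i'th entry; the caller must pass i < demo_subsystem_count(). */
static const char *
demo_subsystem_name(size_t i)
{
	return demo_subsystems[i]->name;
}
```

Adding a subsystem then means adding one line to the list; the declarations and the array follow automatically, and the array order is the list order, which matters when entries depend on each other.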
[text/x-patch] v8-0010-Convert-injection-points-to-use-the-new-interface.patch (4.6K, 11-v8-0010-Convert-injection-points-to-use-the-new-interface.patch)
From 94cdde8a4211f26bbc5be2639f49d986551aa189 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 19:05:26 +0200
Subject: [PATCH v8 10/16] Convert injection points to use the new interface
---
src/backend/storage/ipc/ipci.c | 3 --
src/backend/utils/misc/injection_point.c | 60 ++++++++++++------------
src/include/storage/subsystemlist.h | 2 +-
src/include/utils/injection_point.h | 3 --
4 files changed, 30 insertions(+), 38 deletions(-)
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index a2ef290b5b3..9d1b87e863e 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -52,7 +52,6 @@
#include "storage/sinvaladt.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
-#include "utils/injection_point.h"
#include "utils/wait_event.h"
/* GUCs */
@@ -139,7 +138,6 @@ CalculateShmemSize(void)
size = add_size(size, AsyncShmemSize());
size = add_size(size, StatsShmemSize());
size = add_size(size, WaitEventCustomShmemSize());
- size = add_size(size, InjectionPointShmemSize());
size = add_size(size, SlotSyncShmemSize());
size = add_size(size, AioShmemSize());
size = add_size(size, WaitLSNShmemSize());
@@ -355,7 +353,6 @@ CreateOrAttachShmemStructs(void)
AsyncShmemInit();
StatsShmemInit();
WaitEventCustomShmemInit();
- InjectionPointShmemInit();
AioShmemInit();
WaitLSNShmemInit();
LogicalDecodingCtlShmemInit();
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index c06b0e9b800..62e5cfcf4f9 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -17,6 +17,7 @@
*/
#include "postgres.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
#ifdef USE_INJECTION_POINTS
@@ -109,6 +110,11 @@ typedef struct InjectionPointCacheEntry
static HTAB *InjectionPointCache = NULL;
+#ifdef USE_INJECTION_POINTS
+static void InjectionPointShmemRequest(void *arg);
+static void InjectionPointShmemInit(void *arg);
+#endif
+
/*
* injection_point_cache_add
*
@@ -226,46 +232,38 @@ injection_point_cache_get(const char *name)
}
#endif /* USE_INJECTION_POINTS */
-/*
- * Return the space for dynamic shared hash table.
- */
-Size
-InjectionPointShmemSize(void)
-{
+const ShmemCallbacks InjectionPointShmemCallbacks = {
#ifdef USE_INJECTION_POINTS
- Size sz = 0;
-
- sz = add_size(sz, sizeof(InjectionPointsCtl));
- return sz;
-#else
- return 0;
+ .request_fn = InjectionPointShmemRequest,
+ .init_fn = InjectionPointShmemInit,
#endif
-}
+};
/*
- * Allocate shmem space for dynamic shared hash.
+ * Reserve space for the dynamic shared hash table
*/
-void
-InjectionPointShmemInit(void)
-{
#ifdef USE_INJECTION_POINTS
- bool found;
+static void
+InjectionPointShmemRequest(void *arg)
+{
+ static ShmemStructDesc InjectionPointShmemDesc;
- ActiveInjectionPoints = ShmemInitStruct("InjectionPoint hash",
- sizeof(InjectionPointsCtl),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
- for (int i = 0; i < MAX_INJECTION_POINTS; i++)
- pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
- }
- else
- Assert(found);
-#endif
+ ShmemRequestStruct(&InjectionPointShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "InjectionPoint hash",
+ .size = sizeof(InjectionPointsCtl),
+ .ptr = (void **) &ActiveInjectionPoints,
+ });
}
+static void
+InjectionPointShmemInit(void *arg)
+{
+ pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
+ for (int i = 0; i < MAX_INJECTION_POINTS; i++)
+ pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
+}
+#endif
+
/*
* Attach a new injection point.
*/
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index ed43c90bcc3..65da6f17c5d 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -20,4 +20,4 @@
* of these matter.
*/
-/* TODO: empty for now */
+PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index 27a2526524f..fabd1455c3c 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -46,9 +46,6 @@ typedef void (*InjectionPointCallback) (const char *name,
const void *private_data,
void *arg);
-extern Size InjectionPointShmemSize(void);
-extern void InjectionPointShmemInit(void);
-
extern void InjectionPointAttach(const char *name,
const char *library,
const char *function,
--
2.47.3
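
The conversion above splits the old Size/Init function pair into a request callback, which only describes what is needed, and an init callback, which runs once the memory exists. A toy model of that flow is sketched below. It is not the implementation from this patch series: malloc-family allocation stands in for the shared memory segment, the zero-filling and the 16-byte alignment are assumptions, and every Demo*/demo_* name is hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* One recorded request: how many bytes, and where to store the address. */
typedef struct DemoRequest
{
	const char *name;
	size_t		size;
	void	  **ptr;
} DemoRequest;

/* Toy counterpart of the request_fn/init_fn pair. */
typedef struct DemoCallbacks
{
	void		(*request_fn) (void *arg);
	void		(*init_fn) (void *arg);
} DemoCallbacks;

#define DEMO_MAX_REQUESTS 8
#define DEMO_ALIGN(sz) (((sz) + 15) & ~(size_t) 15)

static DemoRequest demo_requests[DEMO_MAX_REQUESTS];
static int	demo_nrequests = 0;

/* Called from request_fn: records the need, allocates nothing. */
static void
demo_request_struct(const char *name, size_t size, void **ptr)
{
	if (demo_nrequests >= DEMO_MAX_REQUESTS)
		abort();
	demo_requests[demo_nrequests++] = (DemoRequest) {name, size, ptr};
}

/* Collect requests, allocate one zeroed block, publish pointers, then init. */
static void *
demo_startup(const DemoCallbacks *cb)
{
	size_t		total = 0;
	size_t		offset = 0;
	char	   *segment;

	cb->request_fn(NULL);
	for (int i = 0; i < demo_nrequests; i++)
		total += DEMO_ALIGN(demo_requests[i].size);

	segment = calloc(1, total > 0 ? total : 1);
	if (segment == NULL)
		abort();

	for (int i = 0; i < demo_nrequests; i++)
	{
		*demo_requests[i].ptr = segment + offset;
		offset += DEMO_ALIGN(demo_requests[i].size);
	}

	if (cb->init_fn != NULL)
		cb->init_fn(NULL);
	return segment;
}

/* A hypothetical subsystem using the two callbacks. */
typedef struct DemoState
{
	unsigned	in_use;
	unsigned	limit;
} DemoState;

/* Kept as void * so the request needs no pointer-type cast. */
static void *demo_state_mem = NULL;

static DemoState *
demo_state(void)
{
	return demo_state_mem;
}

static void
demo_state_request(void *arg)
{
	(void) arg;
	demo_request_struct("demo state", sizeof(DemoState), &demo_state_mem);
}

static void
demo_state_init(void *arg)
{
	(void) arg;
	demo_state()->limit = 4;	/* in_use is already zero from calloc */
}

static const DemoCallbacks demo_state_callbacks = {
	.request_fn = demo_state_request,
	.init_fn = demo_state_init,
};
```

Calling demo_startup(&demo_state_callbacks) returns the block and leaves demo_state() pointing at the initialized struct. The point of the split is that the total size is known before anything is allocated, so a subsystem no longer has to keep a separate size function in sync with its init function.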
[text/x-patch] v8-0011-Convert-test_aio-to-use-the-new-mechanism.patch (4.5K, 12-v8-0011-Convert-test_aio-to-use-the-new-mechanism.patch)
From 8d306404b81126a386a7b169c60f09c4972ebf5d Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 19:05:45 +0200
Subject: [PATCH v8 11/16] Convert test_aio to use the new mechanism
I wanted to use this to showcase SHMEM_ALLOW_ALLOC_AFTER_STARTUP, but
unfortunately that didn't work because then we'd call
InjectionPointLoad from _PG_init().
---
src/test/modules/test_aio/test_aio.c | 106 ++++++++++++---------------
1 file changed, 47 insertions(+), 59 deletions(-)
diff --git a/src/test/modules/test_aio/test_aio.c b/src/test/modules/test_aio/test_aio.c
index d687408af0c..2ba5f779b80 100644
--- a/src/test/modules/test_aio/test_aio.c
+++ b/src/test/modules/test_aio/test_aio.c
@@ -25,7 +25,6 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/checksum.h"
-#include "storage/ipc.h"
#include "storage/lwlock.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
@@ -35,6 +34,7 @@
PG_MODULE_MAGIC;
+/* In shared memory */
typedef struct InjIoErrorState
{
bool enabled_short_read;
@@ -46,75 +46,66 @@ typedef struct InjIoErrorState
static InjIoErrorState *inj_io_error_state;
-/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
-
-static PgAioHandle *last_handle;
+static void inj_io_shmem_request(void *arg);
+static void inj_io_shmem_init(void *arg);
+static void inj_io_shmem_attach(void *arg);
+static const ShmemCallbacks inj_io_shmem_callbacks = {
+ .request_fn = inj_io_shmem_request,
+ .init_fn = inj_io_shmem_init,
+ .attach_fn = inj_io_shmem_attach,
+};
+static PgAioHandle *last_handle;
static void
-test_aio_shmem_request(void)
+inj_io_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
+ static ShmemStructDesc inj_io_shmem_desc;
- RequestAddinShmemSpace(sizeof(InjIoErrorState));
+ ShmemRequestStruct(&inj_io_shmem_desc, &(ShmemRequestStructOpts) {
+ .name = "test_aio",
+ .size = sizeof(InjIoErrorState),
+ .ptr = (void **) &inj_io_error_state,
+ });
}
static void
-test_aio_shmem_startup(void)
+inj_io_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_io_error_state = ShmemInitStruct("injection_points",
- sizeof(InjIoErrorState),
- &found);
-
- if (!found)
- {
- /* First time through, initialize */
- inj_io_error_state->enabled_short_read = false;
- inj_io_error_state->enabled_reopen = false;
+ /* First time through, initialize */
+ inj_io_error_state->enabled_short_read = false;
+ inj_io_error_state->enabled_reopen = false;
#ifdef USE_INJECTION_POINTS
- InjectionPointAttach("aio-process-completion-before-shared",
- "test_aio",
- "inj_io_short_read",
- NULL,
- 0);
- InjectionPointLoad("aio-process-completion-before-shared");
-
- InjectionPointAttach("aio-worker-after-reopen",
- "test_aio",
- "inj_io_reopen",
- NULL,
- 0);
- InjectionPointLoad("aio-worker-after-reopen");
-
+ InjectionPointAttach("aio-process-completion-before-shared",
+ "test_aio",
+ "inj_io_short_read",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-process-completion-before-shared");
+
+ InjectionPointAttach("aio-worker-after-reopen",
+ "test_aio",
+ "inj_io_reopen",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-worker-after-reopen");
#endif
- }
- else
- {
- /*
- * Pre-load the injection points now, so we can call them in a
- * critical section.
- */
+}
+
+static void
+inj_io_shmem_attach(void *arg)
+{
+ /*
+ * Pre-load the injection points now, so we can call them in a critical
+ * section.
+ */
#ifdef USE_INJECTION_POINTS
- InjectionPointLoad("aio-process-completion-before-shared");
- InjectionPointLoad("aio-worker-after-reopen");
- elog(LOG, "injection point loaded");
+ InjectionPointLoad("aio-process-completion-before-shared");
+ InjectionPointLoad("aio-worker-after-reopen");
+ elog(LOG, "injection point loaded");
#endif
- }
-
- LWLockRelease(AddinShmemInitLock);
}
void
@@ -123,10 +114,7 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_aio_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_aio_shmem_startup;
+ RegisterShmemCallbacks(&inj_io_shmem_callbacks);
}
--
2.47.3
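
In the test_aio conversion above, the two arms of the old "if (!found) ... else ..." block become init_fn and attach_fn. The sketch below models that dispatch in isolation. It assumes the framework runs init_fn in the process that sets the area up and attach_fn in processes that find it already set up, which is what the removed "found" branch suggests but is not shown in this part of the thread. All Demo*/demo_* names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoAttachCallbacks
{
	void		(*init_fn) (void *arg);
	void		(*attach_fn) (void *arg);
} DemoAttachCallbacks;

/* Run exactly one of the two callbacks, depending on who got there first. */
static void
demo_run_startup_callbacks(const DemoAttachCallbacks *cb,
						   bool already_initialized, void *arg)
{
	if (!already_initialized)
	{
		if (cb->init_fn != NULL)
			cb->init_fn(arg);
	}
	else if (cb->attach_fn != NULL)
		cb->attach_fn(arg);
}

/* Shared part (one copy) versus per-process part (one copy per process). */
typedef struct DemoShared
{
	bool		enabled;
	int			init_calls;
} DemoShared;

typedef struct DemoProcess
{
	DemoShared *shared;
	bool		points_loaded;
} DemoProcess;

static DemoShared demo_shared;

/* First process: set up shared state, then do the process-local part. */
static void
demo_init(void *arg)
{
	DemoProcess *proc = arg;

	proc->shared = &demo_shared;
	proc->shared->enabled = false;
	proc->shared->init_calls++;
	proc->points_loaded = true;
}

/* Later processes: process-local setup only; shared state is left alone. */
static void
demo_attach(void *arg)
{
	DemoProcess *proc = arg;

	proc->shared = &demo_shared;
	proc->points_loaded = true;
}

static const DemoAttachCallbacks demo_attach_callbacks = {
	.init_fn = demo_init,
	.attach_fn = demo_attach,
};
```

Running the dispatcher once with already_initialized = false and once with true leaves init_calls at 1 while both processes end up with their local state prepared, which mirrors how the converted test_aio code attaches the injection points once but loads them in every process.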
[text/x-patch] v8-0012-Use-the-new-mechanism-in-a-few-core-subsystems.patch (44.1K, 13-v8-0012-Use-the-new-mechanism-in-a-few-core-subsystems.patch)
From 6060c473fac7fafe592ca8683a57468dc0aef7be Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 19:06:04 +0200
Subject: [PATCH v8 12/16] Use the new mechanism in a few core subsystems
I chose these subsystems specifically because they have some
complicating properties, making them slightly harder to convert than
most:
- The initialization callbacks of some of these subsystems have
dependencies, i.e. they need to be initialized in the right order.
- The ProcGlobal pointer still needs to be inherited by the
BackendParameters mechanism on EXEC_BACKEND builds, because
ProcGlobal is required by InitProcess() to get a PGPROC entry, and
the PGPROC entry is required to use LWLocks, and usually attaching
to shared memory areas requires the use of LWLocks.
- Similarly, the ProcSignal pointer still needs to be handled by
BackendParameters, because query cancellation connections access it
without calling InitProcess().
I believe converting the rest of the subsystems after this will be
pretty mechanical.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/backend/access/transam/varsup.c | 36 ++---
src/backend/postmaster/launch_backend.c | 2 +
src/backend/storage/ipc/dsm.c | 65 +++++----
src/backend/storage/ipc/dsm_registry.c | 39 +++---
src/backend/storage/ipc/ipci.c | 30 ----
src/backend/storage/ipc/pmsignal.c | 57 ++++----
src/backend/storage/ipc/procarray.c | 119 ++++++++--------
src/backend/storage/ipc/procsignal.c | 66 ++++-----
src/backend/storage/ipc/sinvaladt.c | 40 +++---
src/backend/storage/lmgr/proc.c | 176 ++++++++++++------------
src/backend/utils/activity/wait_event.c | 96 +++++++------
src/include/access/transam.h | 2 -
src/include/storage/dsm.h | 3 -
src/include/storage/dsm_registry.h | 2 -
src/include/storage/pmsignal.h | 2 -
src/include/storage/proc.h | 1 -
src/include/storage/procarray.h | 2 -
src/include/storage/procsignal.h | 3 -
src/include/storage/sinvaladt.h | 2 -
src/include/storage/subsystemlist.h | 19 +++
src/include/utils/wait_event.h | 2 -
21 files changed, 377 insertions(+), 387 deletions(-)
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 1441a051773..1eb72d44155 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -23,6 +23,7 @@
#include "postmaster/autovacuum.h"
#include "storage/pmsignal.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
@@ -30,35 +31,28 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+static void VarsupShmemRequest(void *arg);
+
/* pointer to variables struct in shared memory */
TransamVariablesData *TransamVariables = NULL;
+const ShmemCallbacks VarsupShmemCallbacks = {
+ .request_fn = VarsupShmemRequest,
+};
/*
- * Initialization of shared memory for TransamVariables.
+ * Request shared memory for TransamVariables.
*/
-Size
-VarsupShmemSize(void)
-{
- return sizeof(TransamVariablesData);
-}
-
-void
-VarsupShmemInit(void)
+static void
+VarsupShmemRequest(void *arg)
{
- bool found;
+ static ShmemStructDesc TransamVariablesShmemDesc;
- /* Initialize our shared state struct */
- TransamVariables = ShmemInitStruct("TransamVariables",
- sizeof(TransamVariablesData),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- memset(TransamVariables, 0, sizeof(TransamVariablesData));
- }
- else
- Assert(found);
+ ShmemRequestStruct(&TransamVariablesShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "TransamVariables",
+ .size = sizeof(TransamVariablesData),
+ .ptr = (void **) &TransamVariables,
+ });
}
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 75423104be8..7b81200d3c2 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -663,6 +663,8 @@ SubPostmasterMain(int argc, char *argv[])
*/
LocalProcessControlFile(false);
+ RegisterBuiltinShmemCallbacks();
+
/*
* Reload any libraries that were preloaded by the postmaster. Since we
* exec'd this process, those libraries didn't come along with us; but we
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 6a5b16392f7..9fadb4e6cfd 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -43,6 +43,7 @@
#include "storage/lwlock.h"
#include "storage/pg_shmem.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/freepage.h"
#include "utils/memutils.h"
#include "utils/resowner.h"
@@ -110,6 +111,14 @@ static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
static void *dsm_main_space_begin = NULL;
+static void dsm_main_space_request(void *arg);
+static void dsm_main_space_init(void *arg);
+
+const ShmemCallbacks dsm_shmem_callbacks = {
+ .request_fn = dsm_main_space_request,
+ .init_fn = dsm_main_space_init,
+};
+
/*
* List of dynamic shared memory segments used by this backend.
*
@@ -463,43 +472,45 @@ dsm_set_control_handle(dsm_handle h)
}
#endif
+static ShmemStructDesc dsm_main_space_shmem_desc;
+
/*
- * Reserve some space in the main shared memory segment for DSM segments.
+ * Reserve space in the main shared memory segment for DSM segments.
*/
-size_t
-dsm_estimate_size(void)
+static void
+dsm_main_space_request(void *arg)
{
- return 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+ size_t size = 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+
+ if (size == 0)
+ return;
+
+ ShmemRequestStruct(&dsm_main_space_shmem_desc, &(ShmemRequestStructOpts) {
+ .name = "Preallocated DSM",
+ .size = size,
+ .ptr = &dsm_main_space_begin,
+ });
}
-/*
- * Initialize space in the main shared memory segment for DSM segments.
- */
-void
-dsm_shmem_init(void)
+static void
+dsm_main_space_init(void *arg)
{
- size_t size = dsm_estimate_size();
- bool found;
+ size_t size = dsm_main_space_shmem_desc.size;
+ FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
+ size_t first_page = 0;
+ size_t pages;
if (size == 0)
return;
- dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, &found);
- if (!found)
- {
- FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
- size_t first_page = 0;
- size_t pages;
-
- /* Reserve space for the FreePageManager. */
- while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
- ++first_page;
-
- /* Initialize it and give it all the rest of the space. */
- FreePageManagerInitialize(fpm, dsm_main_space_begin);
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ /* Reserve space for the FreePageManager. */
+ while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
+ ++first_page;
+
+ /* Initialize it and give it all the rest of the space. */
+ FreePageManagerInitialize(fpm, dsm_main_space_begin);
+ pages = (size / FPM_PAGE_SIZE) - first_page;
+ FreePageManagerPut(fpm, first_page, pages);
}
/*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 9bfcd616827..92385a08727 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -45,6 +45,7 @@
#include "storage/dsm_registry.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/tuplestore.h"
@@ -57,6 +58,14 @@ typedef struct DSMRegistryCtxStruct
static DSMRegistryCtxStruct *DSMRegistryCtx;
+static void DSMRegistryShmemRequest(void *arg);
+static void DSMRegistryShmemInit(void *arg);
+
+const ShmemCallbacks DSMRegistryShmemCallbacks = {
+ .request_fn = DSMRegistryShmemRequest,
+ .init_fn = DSMRegistryShmemInit,
+};
+
typedef struct NamedDSMState
{
dsm_handle handle;
@@ -114,27 +123,23 @@ static const dshash_parameters dsh_params = {
static dsa_area *dsm_registry_dsa;
static dshash_table *dsm_registry_table;
-Size
-DSMRegistryShmemSize(void)
+static void
+DSMRegistryShmemRequest(void *arg)
{
- return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+ static ShmemStructDesc DSMRegistryCtxShmemDesc;
+
+ ShmemRequestStruct(&DSMRegistryCtxShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "DSM Registry Data",
+ .size = sizeof(DSMRegistryCtxStruct),
+ .ptr = (void **) &DSMRegistryCtx,
+ });
}
-void
-DSMRegistryShmemInit(void)
+static void
+DSMRegistryShmemInit(void *arg)
{
- bool found;
-
- DSMRegistryCtx = (DSMRegistryCtxStruct *)
- ShmemInitStruct("DSM Registry Data",
- DSMRegistryShmemSize(),
- &found);
-
- if (!found)
- {
- DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
- DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
- }
+ DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+ DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 9d1b87e863e..0a4e9ee6502 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -20,7 +20,6 @@
#include "access/nbtree.h"
#include "access/subtrans.h"
#include "access/syncscan.h"
-#include "access/transam.h"
#include "access/twophase.h"
#include "access/xlogprefetcher.h"
#include "access/xlogrecovery.h"
@@ -41,18 +40,13 @@
#include "storage/aio_subsys.h"
#include "storage/bufmgr.h"
#include "storage/dsm.h"
-#include "storage/dsm_registry.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
#include "storage/pmsignal.h"
#include "storage/predicate.h"
#include "storage/proc.h"
-#include "storage/procarray.h"
-#include "storage/procsignal.h"
-#include "storage/sinvaladt.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
-#include "utils/wait_event.h"
/* GUCs */
int shared_memory_type = DEFAULT_SHARED_MEMORY_TYPE;
@@ -102,14 +96,10 @@ CalculateShmemSize(void)
size = add_size(size, ShmemGetRequestedSize());
/* legacy subsystems */
- size = add_size(size, dsm_estimate_size());
- size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
- size = add_size(size, ProcGlobalShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, VarsupShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
size = add_size(size, CLOGShmemSize());
@@ -119,11 +109,7 @@ CalculateShmemSize(void)
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
size = add_size(size, LWLockShmemSize());
- size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, SharedInvalShmemSize());
- size = add_size(size, PMSignalShmemSize());
- size = add_size(size, ProcSignalShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
size = add_size(size, ReplicationSlotsShmemSize());
@@ -137,7 +123,6 @@ CalculateShmemSize(void)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, StatsShmemSize());
- size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, SlotSyncShmemSize());
size = add_size(size, AioShmemSize());
size = add_size(size, WaitLSNShmemSize());
@@ -288,13 +273,9 @@ RegisterBuiltinShmemCallbacks(void)
static void
CreateOrAttachShmemStructs(void)
{
- dsm_shmem_init();
- DSMRegistryShmemInit();
-
/*
* Set up xlog, clog, and buffers
*/
- VarsupShmemInit();
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
@@ -317,23 +298,13 @@ CreateOrAttachShmemStructs(void)
/*
* Set up process table
*/
- if (!IsUnderPostmaster)
- InitProcGlobal();
- ProcArrayShmemInit();
BackendStatusShmemInit();
TwoPhaseShmemInit();
BackgroundWorkerShmemInit();
- /*
- * Set up shared-inval messaging
- */
- SharedInvalShmemInit();
-
/*
* Set up interprocess signaling mechanisms
*/
- PMSignalShmemInit();
- ProcSignalShmemInit();
CheckpointerShmemInit();
AutoVacuumShmemInit();
ReplicationSlotsShmemInit();
@@ -352,7 +323,6 @@ CreateOrAttachShmemStructs(void)
SyncScanShmemInit();
AsyncShmemInit();
StatsShmemInit();
- WaitEventCustomShmemInit();
AioShmemInit();
WaitLSNShmemInit();
LogicalDecodingCtlShmemInit();
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 4618820b337..d901f8e9947 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -27,6 +27,7 @@
#include "storage/ipc.h"
#include "storage/pmsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
@@ -83,6 +84,14 @@ struct PMSignalData
/* PMSignalState pointer is valid in both postmaster and child processes */
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+static void PMSignalShmemRequest(void *);
+static void PMSignalShmemInit(void *);
+
+const ShmemCallbacks PMSignalShmemCallbacks = {
+ .request_fn = PMSignalShmemRequest,
+ .init_fn = PMSignalShmemInit,
+};
+
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
* postmaster. Postmaster keeps a local copy so that it doesn't need to
@@ -123,39 +132,31 @@ postmaster_death_handler(SIGNAL_ARGS)
static void MarkPostmasterChildInactive(int code, Datum arg);
/*
- * PMSignalShmemSize
- * Compute space needed for pmsignal.c's shared memory
+ * PMSignalShmemRequest - Register pmsignal.c's shared memory needs
*/
-Size
-PMSignalShmemSize(void)
+static void
+PMSignalShmemRequest(void *arg)
{
- Size size;
-
- size = offsetof(PMSignalData, PMChildFlags);
- size = add_size(size, mul_size(MaxLivePostmasterChildren(),
- sizeof(sig_atomic_t)));
-
- return size;
+ static ShmemStructDesc PMSignalShmemDesc;
+ size_t size;
+
+ num_child_flags = MaxLivePostmasterChildren();
+
+ size = add_size(offsetof(PMSignalData, PMChildFlags),
+ mul_size(num_child_flags, sizeof(sig_atomic_t)));
+ ShmemRequestStruct(&PMSignalShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "PMSignalState",
+ .size = size,
+ .ptr = (void **) &PMSignalState,
+ });
}
-/*
- * PMSignalShmemInit - initialize during shared-memory creation
- */
-void
-PMSignalShmemInit(void)
+static void
+PMSignalShmemInit(void *arg)
{
- bool found;
-
- PMSignalState = (PMSignalData *)
- ShmemInitStruct("PMSignalState", PMSignalShmemSize(), &found);
-
- if (!found)
- {
- /* initialize all flags to zeroes */
- MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemSize());
- num_child_flags = MaxLivePostmasterChildren();
- PMSignalState->num_child_flags = num_child_flags;
- }
+ Assert(PMSignalState);
+ Assert(num_child_flags > 0);
+ PMSignalState->num_child_flags = num_child_flags;
}
/*
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index cc207cb56e3..1508c37ef62 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -61,6 +61,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
@@ -103,6 +104,20 @@ typedef struct ProcArrayStruct
int pgprocnos[FLEXIBLE_ARRAY_MEMBER];
} ProcArrayStruct;
+static void ProcArrayShmemRequest(void *arg);
+static void ProcArrayShmemInit(void *arg);
+static void ProcArrayShmemAttach(void *arg);
+
+static ProcArrayStruct *procArray;
+
+const struct ShmemCallbacks ProcArrayShmemCallbacks = {
+ .request_fn = ProcArrayShmemRequest,
+ .init_fn = ProcArrayShmemInit,
+ .attach_fn = ProcArrayShmemAttach,
+};
+
+static ShmemStructDesc ProcArrayShmemDesc;
+
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -269,9 +284,6 @@ typedef enum KAXCompressReason
KAX_STARTUP_PROCESS_IDLE, /* startup process is about to sleep */
} KAXCompressReason;
-
-static ProcArrayStruct *procArray;
-
static PGPROC *allProcs;
/*
@@ -282,8 +294,15 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
+
static TransactionId *KnownAssignedXids;
+
+static ShmemStructDesc KnownAssignedXidsShmemDesc;
+
static bool *KnownAssignedXidsValid;
+
+static ShmemStructDesc KnownAssignedXidsValidShmemDesc;
+
static TransactionId latestObservedXid = InvalidTransactionId;
/*
@@ -374,19 +393,13 @@ static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
static void GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons);
/*
- * Report shared-memory space needed by ProcArrayShmemInit
+ * Register the shared PGPROC array during postmaster startup.
*/
-Size
-ProcArrayShmemSize(void)
+static void
+ProcArrayShmemRequest(void *arg)
{
- Size size;
-
- /* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, pgprocnos);
- size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
-
/*
* During Hot Standby processing we have a data structure called
* KnownAssignedXids, created in shared memory. Local data structures are
@@ -405,64 +418,52 @@ ProcArrayShmemSize(void)
if (EnableHotStandby)
{
- size = add_size(size,
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS));
- size = add_size(size,
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS));
+ ShmemRequestStruct(&KnownAssignedXidsShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "KnownAssignedXids",
+ .size = mul_size(sizeof(TransactionId), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXids,
+ });
+
+ ShmemRequestStruct(&KnownAssignedXidsValidShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "KnownAssignedXidsValid",
+ .size = mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXidsValid,
+ });
}
- return size;
+ /* Register the ProcArray shared structure */
+ ShmemRequestStruct(&ProcArrayShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Proc Array",
+ .size = add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int), PROCARRAY_MAXPROCS)),
+ .ptr = (void **) &procArray,
+ });
}
/*
* Initialize the shared PGPROC array during postmaster startup.
*/
-void
-ProcArrayShmemInit(void)
+static void
+ProcArrayShmemInit(void *arg)
{
- bool found;
-
- /* Create or attach to the ProcArray shared structure */
- procArray = (ProcArrayStruct *)
- ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, pgprocnos),
- mul_size(sizeof(int),
- PROCARRAY_MAXPROCS)),
- &found);
-
- if (!found)
- {
- /*
- * We're the first - initialize.
- */
- procArray->numProcs = 0;
- procArray->maxProcs = PROCARRAY_MAXPROCS;
- procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
- procArray->numKnownAssignedXids = 0;
- procArray->tailKnownAssignedXids = 0;
- procArray->headKnownAssignedXids = 0;
- procArray->lastOverflowedXid = InvalidTransactionId;
- procArray->replication_slot_xmin = InvalidTransactionId;
- procArray->replication_slot_catalog_xmin = InvalidTransactionId;
- TransamVariables->xactCompletionCount = 1;
- }
+ procArray->numProcs = 0;
+ procArray->maxProcs = PROCARRAY_MAXPROCS;
+ procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
+ procArray->numKnownAssignedXids = 0;
+ procArray->tailKnownAssignedXids = 0;
+ procArray->headKnownAssignedXids = 0;
+ procArray->lastOverflowedXid = InvalidTransactionId;
+ procArray->replication_slot_xmin = InvalidTransactionId;
+ procArray->replication_slot_catalog_xmin = InvalidTransactionId;
+ TransamVariables->xactCompletionCount = 1;
allProcs = ProcGlobal->allProcs;
+}
- /* Create or attach to the KnownAssignedXids arrays too, if needed */
- if (EnableHotStandby)
- {
- KnownAssignedXids = (TransactionId *)
- ShmemInitStruct("KnownAssignedXids",
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- KnownAssignedXidsValid = (bool *)
- ShmemInitStruct("KnownAssignedXidsValid",
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- }
+static void
+ProcArrayShmemAttach(void *arg)
+{
+ allProcs = ProcGlobal->allProcs;
}
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7e017c8d53b..9ed24fae8d9 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -32,6 +32,7 @@
#include "storage/shmem.h"
#include "storage/sinval.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -105,7 +106,16 @@ struct ProcSignalHeader
#define BARRIER_CLEAR_BIT(flags, type) \
((flags) &= ~(((uint32) 1) << (uint32) (type)))
+static void ProcSignalShmemRequest(void *arg);
+static void ProcSignalShmemInit(void *arg);
+
+const ShmemCallbacks ProcSignalShmemCallbacks = {
+ .request_fn = ProcSignalShmemRequest,
+ .init_fn = ProcSignalShmemInit,
+};
+
NON_EXEC_STATIC ProcSignalHeader *ProcSignal = NULL;
+
static ProcSignalSlot *MyProcSignalSlot = NULL;
static bool CheckProcSignal(ProcSignalReason reason);
@@ -113,51 +123,41 @@ static void CleanupProcSignalState(int status, Datum arg);
static void ResetProcSignalBarrierBits(uint32 flags);
/*
- * ProcSignalShmemSize
- * Compute space needed for ProcSignal's shared memory
+ * ProcSignalShmemRequest
+ * Register ProcSignal's shared memory needs at postmaster startup
*/
-Size
-ProcSignalShmemSize(void)
+static void
+ProcSignalShmemRequest(void *arg)
{
+ static ShmemStructDesc ProcSignalShmemDesc;
Size size;
size = mul_size(NumProcSignalSlots, sizeof(ProcSignalSlot));
size = add_size(size, offsetof(ProcSignalHeader, psh_slot));
- return size;
+
+ ShmemRequestStruct(&ProcSignalShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "ProcSignal",
+ .size = size,
+ .ptr = (void **) &ProcSignal,
+ });
}
-/*
- * ProcSignalShmemInit
- * Allocate and initialize ProcSignal's shared memory
- */
-void
-ProcSignalShmemInit(void)
+static void
+ProcSignalShmemInit(void *arg)
{
- Size size = ProcSignalShmemSize();
- bool found;
+ pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
- ProcSignal = (ProcSignalHeader *)
- ShmemInitStruct("ProcSignal", size, &found);
-
- /* If we're first, initialize. */
- if (!found)
+ for (int i = 0; i < NumProcSignalSlots; ++i)
{
- int i;
-
- pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
+ ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
- for (i = 0; i < NumProcSignalSlots; ++i)
- {
- ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
-
- SpinLockInit(&slot->pss_mutex);
- pg_atomic_init_u32(&slot->pss_pid, 0);
- slot->pss_cancel_key_len = 0;
- MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
- pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
- pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
- ConditionVariableInit(&slot->pss_barrierCV);
- }
+ SpinLockInit(&slot->pss_mutex);
+ pg_atomic_init_u32(&slot->pss_pid, 0);
+ slot->pss_cancel_key_len = 0;
+ MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
+ pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
+ pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
+ ConditionVariableInit(&slot->pss_barrierCV);
}
}
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index a7a7cc4f0a9..11138af0a23 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -25,6 +25,7 @@
#include "storage/shmem.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
/*
* Conceptually, the shared cache invalidation messages are stored in an
@@ -205,6 +206,14 @@ typedef struct SISeg
static SISeg *shmInvalBuffer; /* pointer to the shared inval buffer */
+static void SharedInvalShmemRequest(void *arg);
+static void SharedInvalShmemInit(void *arg);
+
+const ShmemCallbacks SharedInvalShmemCallbacks = {
+ .request_fn = SharedInvalShmemRequest,
+ .init_fn = SharedInvalShmemInit,
+};
+
static LocalTransactionId nextLocalTransactionId;
@@ -212,37 +221,32 @@ static void CleanupInvalidationState(int status, Datum arg);
/*
- * SharedInvalShmemSize --- return shared-memory space needed
+ * SharedInvalShmemRequest
+ * Register shared memory needs for the SI message buffer
*/
-Size
-SharedInvalShmemSize(void)
+static void
+SharedInvalShmemRequest(void *arg)
{
+ static ShmemStructDesc SharedInvalShmemDesc;
Size size;
size = offsetof(SISeg, procState);
size = add_size(size, mul_size(sizeof(ProcState), NumProcStateSlots)); /* procState */
size = add_size(size, mul_size(sizeof(int), NumProcStateSlots)); /* pgprocnos */
- return size;
+ ShmemRequestStruct(&SharedInvalShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "shmInvalBuffer",
+ .size = size,
+ .ptr = (void **) &shmInvalBuffer,
+ });
}
-/*
- * SharedInvalShmemInit
- * Create and initialize the SI message buffer
- */
-void
-SharedInvalShmemInit(void)
+static void
+SharedInvalShmemInit(void *arg)
{
int i;
- bool found;
-
- /* Allocate space in shared memory */
- shmInvalBuffer = (SISeg *)
- ShmemInitStruct("shmInvalBuffer", SharedInvalShmemSize(), &found);
- if (found)
- return;
- /* Clear message counters, save size of procState array, init spinlock */
+ /* Clear message counters, init spinlock */
shmInvalBuffer->minMsgNum = 0;
shmInvalBuffer->maxMsgNum = 0;
shmInvalBuffer->nextThreshold = CLEANUP_MIN;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 9b880a6af65..e7501fee4c7 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -52,6 +52,7 @@
#include "storage/procsignal.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
@@ -70,9 +71,25 @@ PGPROC *MyProc = NULL;
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
+static void *tmpAllProcs;
+static void *tmpFastPathLockArray;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
PGPROC *PreparedXactProcs = NULL;
+static void ProcGlobalShmemRequest(void *arg);
+static void ProcGlobalShmemInit(void *arg);
+
+const ShmemCallbacks ProcGlobalShmemCallbacks = {
+ .request_fn = ProcGlobalShmemRequest,
+ .init_fn = ProcGlobalShmemInit,
+};
+
+static ShmemStructDesc ProcGlobalShmemDesc;
+static ShmemStructDesc ProcGlobalAllProcsShmemDesc;
+static ShmemStructDesc FastPathLockArrayShmemDesc;
+
+static uint32 TotalProcs;
+
/* Is a deadlock check pending? */
static volatile sig_atomic_t got_deadlock_timeout;
@@ -82,24 +99,6 @@ static void AuxiliaryProcKill(int code, Datum arg);
static DeadLockState CheckDeadLock(void);
-/*
- * Report shared-memory space needed by PGPROC.
- */
-static Size
-PGProcShmemSize(void)
-{
- Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
-
- size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
-
- return size;
-}
-
/*
* Report shared-memory space needed by Fast-Path locks.
*/
@@ -107,8 +106,6 @@ static Size
FastPathLockShmemSize(void)
{
Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
Size fpLockBitsSize,
fpRelIdSize;
@@ -127,25 +124,6 @@ FastPathLockShmemSize(void)
return size;
}
-/*
- * Report shared-memory space needed by InitProcGlobal.
- */
-Size
-ProcGlobalShmemSize(void)
-{
- Size size = 0;
-
- /* ProcGlobal */
- size = add_size(size, sizeof(PROC_HDR));
- size = add_size(size, sizeof(slock_t));
-
- size = add_size(size, PGSemaphoreShmemSize(ProcGlobalSemas()));
- size = add_size(size, PGProcShmemSize());
- size = add_size(size, FastPathLockShmemSize());
-
- return size;
-}
-
/*
* Report number of semaphores needed by InitProcGlobal.
*/
@@ -160,7 +138,63 @@ ProcGlobalSemas(void)
}
/*
- * InitProcGlobal -
+ * ProcGlobalShmemRequest
+ * Register shared memory needs for the PGPROC array, the fast-path lock array, and the Proc Header.
+ *
+ * This is called during postmaster or standalone backend startup, and also
+ * during backend startup in EXEC_BACKEND mode.
+ */
+static void
+ProcGlobalShmemRequest(void *arg)
+{
+ Size size;
+
+ /*
+ * Reserve all the PGPROC structures we'll need. There are six separate
+ * consumers: (1) normal backends, (2) autovacuum workers and special
+ * workers, (3) background workers, (4) walsenders, (5) auxiliary
+ * processes, and (6) prepared transactions. (For largely-historical
+ * reasons, we combine autovacuum and special workers into one category
+ * with a single freelist.) Each PGPROC structure is dedicated to exactly
+ * one of these purposes, and they do not move between groups.
+ */
+ TotalProcs =
+ add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
+
+ size = 0;
+ size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
+ ShmemRequestStruct(&ProcGlobalAllProcsShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "PGPROC structures",
+ .size = size,
+ .ptr = (void **) &tmpAllProcs,
+ });
+
+ ShmemRequestStruct(&FastPathLockArrayShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Fast-Path Lock Array",
+ .size = IsUnderPostmaster ? SHMEM_REQUEST_UNKNOWN_SIZE : FastPathLockShmemSize(),
+ .ptr = (void **) &tmpFastPathLockArray,
+ });
+
+ ShmemRequestStruct(&ProcGlobalShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Proc Header",
+ .size = sizeof(PROC_HDR),
+
+ /*
+ * ProcGlobal is registered here in .ptr as usual, but it needs to be
+ * propagated specially in EXEC_BACKEND mode, because ProcGlobal needs
+ * to be accessed early at backend startup, before
+ * ShmemAttachRequested() has been called.
+ */
+ .ptr = (void **) &ProcGlobal,
+ });
+}
+
+
+/*
+ * ProcGlobalShmemInit -
* Initialize the global process table during postmaster or standalone
* backend startup.
*
@@ -179,36 +213,23 @@ ProcGlobalSemas(void)
* Another reason for creating semaphores here is that the semaphore
* implementation typically requires us to create semaphores in the
* postmaster, not in backends.
- *
- * Note: this is NOT called by individual backends under a postmaster,
- * not even in the EXEC_BACKEND case. The ProcGlobal and AuxiliaryProcs
- * pointers must be propagated specially for EXEC_BACKEND operation.
*/
-void
-InitProcGlobal(void)
+static void
+ProcGlobalShmemInit(void *arg)
{
+ char *ptr;
+ size_t requestSize;
PGPROC *procs;
int i,
j;
- bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
/* Used for setup of per-backend fast-path slots. */
char *fpPtr,
*fpEndPtr PG_USED_FOR_ASSERTS_ONLY;
Size fpLockBitsSize,
fpRelIdSize;
- Size requestSize;
- char *ptr;
-
- /* Create the ProcGlobal shared structure */
- ProcGlobal = (PROC_HDR *)
- ShmemInitStruct("Proc Header", sizeof(PROC_HDR), &found);
- Assert(!found);
- /*
- * Initialize the data structures.
- */
+ Assert(ProcGlobal);
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
SpinLockInit(&ProcGlobal->freeProcsLock);
dlist_init(&ProcGlobal->freeProcs);
@@ -221,23 +242,12 @@ InitProcGlobal(void)
pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PROC_NUMBER);
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
- /*
- * Create and initialize all the PGPROC structures we'll need. There are
- * six separate consumers: (1) normal backends, (2) autovacuum workers and
- * special workers, (3) background workers, (4) walsenders, (5) auxiliary
- * processes, and (6) prepared transactions. (For largely-historical
- * reasons, we combine autovacuum and special workers into one category
- * with a single freelist.) Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
- */
- requestSize = PGProcShmemSize();
-
- ptr = ShmemInitStruct("PGPROC structures",
- requestSize,
- &found);
-
+ Assert(tmpAllProcs);
+ ptr = tmpAllProcs;
+ requestSize = ProcGlobalAllProcsShmemDesc.size;
MemSet(ptr, 0, requestSize);
+ /* Carve out the allProcs array from the shared memory area */
procs = (PGPROC *) ptr;
ptr = ptr + TotalProcs * sizeof(PGPROC);
@@ -246,7 +256,7 @@ InitProcGlobal(void)
ProcGlobal->allProcCount = MaxBackends + NUM_AUXILIARY_PROCS;
/*
- * Allocate arrays mirroring PGPROC fields in a dense manner. See
+ * Carve out arrays mirroring PGPROC fields in a dense manner. See
* PROC_HDR.
*
* XXX: It might make sense to increase padding for these arrays, given
@@ -261,31 +271,25 @@ InitProcGlobal(void)
ProcGlobal->statusFlags = (uint8 *) ptr;
ptr = ptr + (TotalProcs * sizeof(*ProcGlobal->statusFlags));
- /* make sure wer didn't overflow */
+ /* make sure we didn't overflow */
Assert((ptr > (char *) procs) && (ptr <= (char *) procs + requestSize));
/*
- * Allocate arrays for fast-path locks. Those are variable-length, so
+ * Initialize arrays for fast-path locks. Those are variable-length, so
* can't be included in PGPROC directly. We allocate a separate piece of
* shared memory and then divide that between backends.
*/
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- requestSize = FastPathLockShmemSize();
-
- fpPtr = ShmemInitStruct("Fast-Path Lock Array",
- requestSize,
- &found);
-
- MemSet(fpPtr, 0, requestSize);
+ Assert(tmpFastPathLockArray);
+ fpPtr = tmpFastPathLockArray;
+ requestSize = FastPathLockArrayShmemDesc.size;
+ memset(fpPtr, 0, requestSize);
/* For asserts checking we did not overflow. */
fpEndPtr = fpPtr + requestSize;
- /* Reserve space for semaphores. */
- PGReserveSemaphores(ProcGlobalSemas());
-
for (i = 0; i < TotalProcs; i++)
{
PGPROC *proc = &procs[i];
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index e5a2289f0b0..56b455d0023 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -25,6 +25,7 @@
#include "storage/lmgr.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "storage/spin.h"
#include "utils/wait_event.h"
@@ -97,61 +98,58 @@ static WaitEventCustomCounterData *WaitEventCustomCounter;
static uint32 WaitEventCustomNew(uint32 classId, const char *wait_event_name);
static const char *GetWaitEventCustomIdentifier(uint32 wait_event_info);
+static void WaitEventCustomShmemRequest(void *arg);
+static void WaitEventCustomShmemInit(void *arg);
+
+const ShmemCallbacks WaitEventCustomShmemCallbacks = {
+ .request_fn = WaitEventCustomShmemRequest,
+ .init_fn = WaitEventCustomShmemInit,
+};
+
/*
- * Return the space for dynamic shared hash tables and dynamic allocation counter.
+ * Register shmem space for the dynamic shared hash tables and the dynamic allocation counter.
*/
-Size
-WaitEventCustomShmemSize(void)
+static void
+WaitEventCustomShmemRequest(void *arg)
{
- Size sz;
-
- sz = MAXALIGN(sizeof(WaitEventCustomCounterData));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- sizeof(WaitEventCustomEntryByInfo)));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- sizeof(WaitEventCustomEntryByName)));
- return sz;
+ static ShmemStructDesc WaitEventCustomCounterShmemDesc;
+ static ShmemHashDesc WaitEventCustomHashByInfoDesc;
+ static ShmemHashDesc WaitEventCustomHashByNameDesc;
+
+ ShmemRequestStruct(&WaitEventCustomCounterShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "WaitEventCustomCounterData",
+ .size = sizeof(WaitEventCustomCounterData),
+ .ptr = (void **) &WaitEventCustomCounter,
+ });
+ ShmemRequestHash(&WaitEventCustomHashByInfoDesc, &(ShmemRequestHashOpts) {
+ .name = "WaitEventCustom hash by wait event information",
+ .ptr = &WaitEventCustomHashByInfo,
+
+ .init_size = WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
+ .max_size = WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
+ .hash_info.keysize = sizeof(uint32),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByInfo),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ });
+ ShmemRequestHash(&WaitEventCustomHashByNameDesc, &(ShmemRequestHashOpts) {
+ .name = "WaitEventCustom hash by name",
+ .ptr = &WaitEventCustomHashByName,
+
+ .init_size = WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
+ .max_size = WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
+ /* key is a NUL-terminated string */
+ .hash_info.keysize = sizeof(char[NAMEDATALEN]),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByName),
+ .hash_flags = HASH_ELEM | HASH_STRINGS,
+ });
}
-/*
- * Allocate shmem space for dynamic shared hash and dynamic allocation counter.
- */
-void
-WaitEventCustomShmemInit(void)
+static void
+WaitEventCustomShmemInit(void *arg)
{
- bool found;
- HASHCTL info;
-
- WaitEventCustomCounter = (WaitEventCustomCounterData *)
- ShmemInitStruct("WaitEventCustomCounterData",
- sizeof(WaitEventCustomCounterData), &found);
-
- if (!found)
- {
- /* initialize the allocation counter and its spinlock. */
- WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
- SpinLockInit(&WaitEventCustomCounter->mutex);
- }
-
- /* initialize or attach the hash tables to store custom wait events */
- info.keysize = sizeof(uint32);
- info.entrysize = sizeof(WaitEventCustomEntryByInfo);
- WaitEventCustomHashByInfo =
- ShmemInitHash("WaitEventCustom hash by wait event information",
- WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
- WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- /* key is a NULL-terminated string */
- info.keysize = sizeof(char[NAMEDATALEN]);
- info.entrysize = sizeof(WaitEventCustomEntryByName);
- WaitEventCustomHashByName =
- ShmemInitHash("WaitEventCustom hash by name",
- WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
- WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- &info,
- HASH_ELEM | HASH_STRINGS);
+ /* initialize the allocation counter and its spinlock. */
+ WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
+ SpinLockInit(&WaitEventCustomCounter->mutex);
}
/*
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6fa91bfcdc0..55a4ab26b34 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -345,8 +345,6 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
-extern Size VarsupShmemSize(void);
-extern void VarsupShmemInit(void);
extern FullTransactionId GetNewTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
diff --git a/src/include/storage/dsm.h b/src/include/storage/dsm.h
index 407657df3ff..1bde71b4406 100644
--- a/src/include/storage/dsm.h
+++ b/src/include/storage/dsm.h
@@ -26,9 +26,6 @@ extern void dsm_postmaster_startup(PGShmemHeader *);
extern void dsm_backend_shutdown(void);
extern void dsm_detach_all(void);
-extern size_t dsm_estimate_size(void);
-extern void dsm_shmem_init(void);
-
#ifdef EXEC_BACKEND
extern void dsm_set_control_handle(dsm_handle h);
#endif
diff --git a/src/include/storage/dsm_registry.h b/src/include/storage/dsm_registry.h
index 506fae2c9ca..a2269c89f01 100644
--- a/src/include/storage/dsm_registry.h
+++ b/src/include/storage/dsm_registry.h
@@ -22,7 +22,5 @@ extern dsa_area *GetNamedDSA(const char *name, bool *found);
extern dshash_table *GetNamedDSHash(const char *name,
const dshash_parameters *params,
bool *found);
-extern Size DSMRegistryShmemSize(void);
-extern void DSMRegistryShmemInit(void);
#endif /* DSM_REGISTRY_H */
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index 206fb78f8a5..001e6eea61c 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -66,8 +66,6 @@ extern PGDLLIMPORT volatile PMSignalData *PMSignalState;
/*
* prototypes for functions in pmsignal.c
*/
-extern Size PMSignalShmemSize(void);
-extern void PMSignalShmemInit(void);
extern void SendPostmasterSignal(PMSignalReason reason);
extern bool CheckPostmasterSignal(PMSignalReason reason);
extern void SetQuitSignalReason(QuitSignalReason reason);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 1dad125706e..3729f202fff 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -551,7 +551,6 @@ extern PGDLLIMPORT PGPROC *AuxiliaryProcs;
* Function Prototypes
*/
extern int ProcGlobalSemas(void);
-extern Size ProcGlobalShmemSize(void);
extern void InitProcGlobal(void);
extern void InitProcess(void);
extern void InitProcessPhase2(void);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index abdf021e66e..d718a5b542f 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -19,8 +19,6 @@
#include "utils/snapshot.h"
-extern Size ProcArrayShmemSize(void);
-extern void ProcArrayShmemInit(void);
extern void ProcArrayAdd(PGPROC *proc);
extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 348fba53a93..031897015f4 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -63,9 +63,6 @@ typedef enum
/*
* prototypes for functions in procsignal.c
*/
-extern Size ProcSignalShmemSize(void);
-extern void ProcSignalShmemInit(void);
-
extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
ProcNumber procNumber);
diff --git a/src/include/storage/sinvaladt.h b/src/include/storage/sinvaladt.h
index 122dbcdf19f..208ea9d051e 100644
--- a/src/include/storage/sinvaladt.h
+++ b/src/include/storage/sinvaladt.h
@@ -27,8 +27,6 @@
/*
* prototypes for functions in sinvaladt.c
*/
-extern Size SharedInvalShmemSize(void);
-extern void SharedInvalShmemInit(void);
extern void SharedInvalBackendInit(bool sendOnly);
extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index 65da6f17c5d..5c11b2b3499 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -20,4 +20,23 @@
* of these matter.
*/
+PG_SHMEM_SUBSYSTEM(dsm_shmem_callbacks)
+PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
+
+/* xlog, clog, and buffers */
+PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+
+/* process table */
+PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+
+/* shared-inval messaging */
+PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
+
+/* interprocess signaling mechanisms */
+PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+
+/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 34c27cc3dc3..86ee348220d 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -42,8 +42,6 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
extern uint32 WaitEventExtensionNew(const char *wait_event_name);
extern uint32 WaitEventInjectionPointNew(const char *wait_event_name);
-extern void WaitEventCustomShmemInit(void);
-extern Size WaitEventCustomShmemSize(void);
extern char **GetWaitEventCustomNames(uint32 classId, int *nwaitevents);
/* ----------
--
2.47.3
[text/x-patch] v8-0013-Convert-SLRUs-to-use-the-new-interface.patch (85.5K, 14-v8-0013-Convert-SLRUs-to-use-the-new-interface.patch)
From 95f4a4367d8e306f16870a10ecc53b8b6ec341b1 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 26 Mar 2026 20:39:46 +0200
Subject: [PATCH v8 13/16] Convert SLRUs to use the new interface
I replaced the old SimpleLruInit() function without a backwards
compatibility wrapper, because few extensions define their own SLRUs.
---
src/backend/access/transam/clog.c | 55 ++--
src/backend/access/transam/commit_ts.c | 92 +++---
src/backend/access/transam/multixact.c | 140 +++++----
src/backend/access/transam/slru.c | 364 ++++++++++++-----------
src/backend/access/transam/subtrans.c | 57 ++--
src/backend/commands/async.c | 117 ++++----
src/backend/storage/ipc/ipci.c | 16 -
src/backend/storage/ipc/shmem.c | 7 +
src/backend/storage/lmgr/predicate.c | 299 +++++++++----------
src/backend/utils/activity/pgstat_slru.c | 1 +
src/include/access/clog.h | 2 -
src/include/access/commit_ts.h | 2 -
src/include/access/multixact.h | 2 -
src/include/access/slru.h | 104 ++++---
src/include/access/subtrans.h | 2 -
src/include/commands/async.h | 3 -
src/include/storage/predicate.h | 5 -
src/include/storage/shmem.h | 1 +
src/include/storage/subsystemlist.h | 8 +
src/test/modules/test_slru/test_slru.c | 110 +++----
src/tools/pgindent/typedefs.list | 4 +-
21 files changed, 716 insertions(+), 675 deletions(-)
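Note on the pattern used throughout this series: each converted subsystem follows a two-phase shape. A request callback only records a name, a size and the address of a backend-local pointer, and a separate init callback runs once the memory exists. The toy program below is a minimal sketch of that shape, not the actual shmem.c implementation. ToyRequest, ToyCallbacks, toy_request_struct() and toy_startup() are hypothetical names invented for illustration, and plain calloc() stands in for the shared memory segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; NOT the PostgreSQL definitions. */
typedef struct ToyRequest
{
	const char *name;
	size_t		size;
	void	  **ptr;			/* filled in once memory has been allocated */
} ToyRequest;

typedef struct ToyCallbacks
{
	void		(*request_fn) (void *arg);
	void		(*init_fn) (void *arg);
} ToyCallbacks;

#define TOY_MAX_REQUESTS 8
/* Round a size up to a multiple of sizeof(max_align_t) (conservative). */
#define TOY_ALIGN(sz) \
	(((sz) + sizeof(max_align_t) - 1) / sizeof(max_align_t) * sizeof(max_align_t))

static ToyRequest toy_requests[TOY_MAX_REQUESTS];
static int	toy_nrequests = 0;

/* Phase 1 helper: record a request; nothing is allocated here. */
static void
toy_request_struct(const char *name, size_t size, void **ptr)
{
	assert(toy_nrequests < TOY_MAX_REQUESTS);
	toy_requests[toy_nrequests++] =
		(ToyRequest) {.name = name, .size = size, .ptr = ptr};
}

/* An example subsystem with one shared struct. */
typedef struct ToyCounter
{
	unsigned	next_id;
} ToyCounter;

static ToyCounter *toy_counter = NULL;

static void
toy_counter_request(void *arg)
{
	(void) arg;
	/* Same pointer-cast idiom as the .ptr fields in the patch. */
	toy_request_struct("ToyCounter", sizeof(ToyCounter),
					   (void **) &toy_counter);
}

static void
toy_counter_init(void *arg)
{
	(void) arg;
	/* The pointer was set by toy_startup() before this is called. */
	toy_counter->next_id = 1;
}

static const ToyCallbacks toy_counter_callbacks = {
	.request_fn = toy_counter_request,
	.init_fn = toy_counter_init,
};

/*
 * Driver: collect every request, allocate one zeroed block, hand out
 * pointers into it, then run the init callbacks.  Returns the block so the
 * caller can free() it.
 */
static void *
toy_startup(const ToyCallbacks *const *subsystems, int n)
{
	size_t		total = 0;
	size_t		off = 0;
	char	   *base;

	for (int i = 0; i < n; i++)
		subsystems[i]->request_fn(NULL);

	for (int i = 0; i < toy_nrequests; i++)
		total += TOY_ALIGN(toy_requests[i].size);

	base = calloc(1, total > 0 ? total : 1);
	assert(base != NULL);

	for (int i = 0; i < toy_nrequests; i++)
	{
		*toy_requests[i].ptr = base + off;
		off += TOY_ALIGN(toy_requests[i].size);
	}

	for (int i = 0; i < n; i++)
		if (subsystems[i]->init_fn != NULL)
			subsystems[i]->init_fn(NULL);

	return base;
}
```

Calling toy_startup() with a one-element array holding &toy_counter_callbacks leaves toy_counter pointing at the start of the allocated block, with next_id already set by the init callback. Keeping allocation out of the callbacks is what lets the driver total up all requests before reserving any memory.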
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index c654e0929b3..2e9a5cfa19e 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -43,6 +43,7 @@
#include "pg_trace.h"
#include "pgstat.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/wait_event.h"
@@ -106,13 +107,21 @@ TransactionIdToPage(TransactionId xid)
/*
* Link to shared-memory data structures for CLOG control
*/
-static SlruCtlData XactCtlData;
+static void CLOGShmemRequest(void *arg);
+static void CLOGShmemInit(void *arg);
+static bool CLOGPagePrecedes(int64 page1, int64 page2);
+static int clog_errdetail_for_io_error(const void *opaque_data);
-#define XactCtl (&XactCtlData)
+const ShmemCallbacks CLOGShmemCallbacks = {
+ .request_fn = CLOGShmemRequest,
+ .init_fn = CLOGShmemInit,
+};
+
+static SlruDesc XactSlruDesc;
+
+#define XactCtl (&XactSlruDesc)
-static bool CLOGPagePrecedes(int64 page1, int64 page2);
-static int clog_errdetail_for_io_error(const void *opaque_data);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXact,
Oid oldestXactDb);
static void TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
@@ -775,16 +784,10 @@ CLOGShmemBuffers(void)
}
/*
- * Initialization of shared memory for CLOG
+ * Register shared memory for CLOG
*/
-Size
-CLOGShmemSize(void)
-{
- return SimpleLruShmemSize(CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE);
-}
-
-void
-CLOGShmemInit(void)
+static void
+CLOGShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (transaction_buffers == 0)
@@ -806,12 +809,26 @@ CLOGShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(transaction_buffers != 0);
+ SimpleLruRequest(&XactSlruDesc, &(SlruRequestOpts) {
+ .name = "transaction",
+ .Dir = "pg_xact",
+ .long_segment_names = false,
+
+ .nslots = CLOGShmemBuffers(),
+ .nlsns = CLOG_LSNS_PER_PAGE,
+
+ .sync_handler = SYNC_HANDLER_CLOG,
+ .PagePrecedes = CLOGPagePrecedes,
+ .errdetail_for_io_error = clog_errdetail_for_io_error,
- XactCtl->PagePrecedes = CLOGPagePrecedes;
- XactCtl->errdetail_for_io_error = clog_errdetail_for_io_error;
- SimpleLruInit(XactCtl, "transaction", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
- "pg_xact", LWTRANCHE_XACT_BUFFER,
- LWTRANCHE_XACT_SLRU, SYNC_HANDLER_CLOG, false);
+ .buffer_tranche_id = LWTRANCHE_XACT_BUFFER,
+ .bank_tranche_id = LWTRANCHE_XACT_SLRU,
+ });
+}
+
+static void
+CLOGShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(XactCtl, CLOG_XACTS_PER_PAGE);
}
@@ -827,7 +844,7 @@ check_transaction_buffers(int *newval, void **extra, GucSource source)
/*
* This func must be called ONCE on system install. It creates
* the initial CLOG segment. (The CLOG directory is assumed to
- * have been created by initdb, and CLOGShmemInit must have been
+ * have been created by initdb, and CLOGShmemInit must have been XXX
* called already.)
*/
void
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 36219dd13cc..d6e67cc805d 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -30,6 +30,7 @@
#include "funcapi.h"
#include "miscadmin.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/timestamp.h"
@@ -80,9 +81,19 @@ TransactionIdToCTsPage(TransactionId xid)
/*
* Link to shared-memory data structures for CommitTs control
*/
-static SlruCtlData CommitTsCtlData;
+static void CommitTsShmemRequest(void *arg);
+static void CommitTsShmemInit(void *arg);
+static bool CommitTsPagePrecedes(int64 page1, int64 page2);
+static int commit_ts_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks CommitTsShmemCallbacks = {
+ .request_fn = CommitTsShmemRequest,
+ .init_fn = CommitTsShmemInit,
+};
+
+static SlruDesc CommitTsSlruDesc;
-#define CommitTsCtl (&CommitTsCtlData)
+#define CommitTsCtl (&CommitTsSlruDesc)
/*
* We keep a cache of the last value set in shared memory.
@@ -104,6 +115,9 @@ typedef struct CommitTimestampShared
static CommitTimestampShared *commitTsShared;
+
+static ShmemStructDesc CommitTsShmemDesc;
/* GUC variable */
bool track_commit_timestamp;
@@ -114,8 +128,6 @@ static void SetXidCommitTsInPage(TransactionId xid, int nsubxids,
static void TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
ReplOriginId nodeid, int slotno);
static void error_commit_ts_disabled(void);
-static bool CommitTsPagePrecedes(int64 page1, int64 page2);
-static int commit_ts_errdetail_for_io_error(const void *opaque_data);
static void ActivateCommitTs(void);
static void DeactivateCommitTs(void);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXid);
@@ -512,24 +524,12 @@ CommitTsShmemBuffers(void)
}
/*
- * Shared memory sizing for CommitTs
+ * Register CommitTs shared memory needs at system startup (postmaster start
+ * or standalone backend)
*/
-Size
-CommitTsShmemSize(void)
-{
- return SimpleLruShmemSize(CommitTsShmemBuffers(), 0) +
- sizeof(CommitTimestampShared);
-}
-
-/*
- * Initialize CommitTs at system startup (postmaster start or standalone
- * backend)
- */
-void
-CommitTsShmemInit(void)
+static void
+CommitTsShmemRequest(void *arg)
{
- bool found;
-
/* If auto-tuning is requested, now is the time to do it */
if (commit_timestamp_buffers == 0)
{
@@ -550,31 +550,37 @@ CommitTsShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(commit_timestamp_buffers != 0);
+ SimpleLruRequest(&CommitTsSlruDesc, &(SlruRequestOpts) {
+ .name = "commit_timestamp",
+ .Dir = "pg_commit_ts",
+ .long_segment_names = false,
+
+ .nslots = CommitTsShmemBuffers(),
+
+ .PagePrecedes = CommitTsPagePrecedes,
+ .errdetail_for_io_error = commit_ts_errdetail_for_io_error,
+
+ .sync_handler = SYNC_HANDLER_COMMIT_TS,
+ .buffer_tranche_id = LWTRANCHE_COMMITTS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_COMMITTS_SLRU,
+ });
+
+ ShmemRequestStruct(&CommitTsShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "CommitTs shared",
+ .size = sizeof(CommitTimestampShared),
+ .ptr = (void **) &commitTsShared,
+ });
+}
- CommitTsCtl->PagePrecedes = CommitTsPagePrecedes;
- CommitTsCtl->errdetail_for_io_error = commit_ts_errdetail_for_io_error;
- SimpleLruInit(CommitTsCtl, "commit_timestamp", CommitTsShmemBuffers(), 0,
- "pg_commit_ts", LWTRANCHE_COMMITTS_BUFFER,
- LWTRANCHE_COMMITTS_SLRU,
- SYNC_HANDLER_COMMIT_TS,
- false);
- SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
-
- commitTsShared = ShmemInitStruct("CommitTs shared",
- sizeof(CommitTimestampShared),
- &found);
-
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+static void
+CommitTsShmemInit(void *arg)
+{
+ commitTsShared->xidLastCommit = InvalidTransactionId;
+ TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
+ commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
+ commitTsShared->commitTsActive = false;
- commitTsShared->xidLastCommit = InvalidTransactionId;
- TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
- commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
- commitTsShared->commitTsActive = false;
- }
- else
- Assert(found);
+ SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
}
/*
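A note on the calling convention used by CommitTsShmemRequest() above, since the multixact.c and slru.c changes rely on it too. The request functions take a pointer to a compound literal with designated initializers, so any field the caller leaves out is zero, and the callee keeps a copy of the options rather than the caller's pointer. Below is a minimal standalone sketch of that convention. It is not PostgreSQL code: every Demo* and demo_* name is hypothetical, and it folds the "tranche ID of zero means assign a new one" step into the request function, whereas the patch does that later, in shmem_slru_init().

```c
/*
 * Standalone sketch of the "request with an options struct" convention.
 * All Demo* and demo_* names are hypothetical, not part of the patch.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct DemoRequestOpts
{
	const char *name;				/* required */
	const char *dir;				/* required */
	int			nslots;				/* required, must be > 0 */
	int			nlsns;				/* optional, zero when omitted */
	bool		long_segment_names; /* optional, false when omitted */
	int			buffer_tranche_id;	/* optional, 0 means "assign one" */
} DemoRequestOpts;

typedef struct DemoDesc
{
	DemoRequestOpts options;		/* private copy of the caller's options */
} DemoDesc;

/* Stand-in for an ID allocator; the starting value is arbitrary. */
static int	demo_next_tranche_id = 100;

/*
 * Validate the options and copy them into the descriptor. The caller's
 * compound literal only lives as long as the caller's enclosing block,
 * so the pointer itself must not be retained.
 */
static void
demo_request(DemoDesc *desc, const DemoRequestOpts *options)
{
	assert(options->name != NULL);
	assert(options->nslots > 0);

	memcpy(&desc->options, options, sizeof(DemoRequestOpts));

	/* Zero means the caller did not choose an ID, so allocate one. */
	if (desc->options.buffer_tranche_id == 0)
		desc->options.buffer_tranche_id = demo_next_tranche_id++;
}

static DemoDesc demo_desc;

/* A caller names only the fields it cares about; the rest are zero. */
static void
demo_register(void)
{
	demo_request(&demo_desc, &(DemoRequestOpts) {
		.name = "demo",
		.dir = "pg_demo",
		.nslots = 16,
	});
}
```

The benefit over the old positional SimpleLruInit() argument list is that call sites label each value, and a new optional field can be added without touching existing callers, provided zero is a sensible default for it.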
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 9f8d542c098..f9cb0ec8a3d 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -83,6 +83,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
#include "utils/lsyscache.h"
@@ -113,11 +114,16 @@ PreviousMultiXactId(MultiXactId multi)
/*
* Links to shared-memory data structures for MultiXact control
*/
-static SlruCtlData MultiXactOffsetCtlData;
-static SlruCtlData MultiXactMemberCtlData;
+static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
+static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
+static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
+static int MultiXactMemberIoErrorDetail(const void *opaque_data);
+
+static SlruDesc MultiXactOffsetSlruDesc;
+static SlruDesc MultiXactMemberSlruDesc;
-#define MultiXactOffsetCtl (&MultiXactOffsetCtlData)
-#define MultiXactMemberCtl (&MultiXactMemberCtlData)
+#define MultiXactOffsetCtl (&MultiXactOffsetSlruDesc)
+#define MultiXactMemberCtl (&MultiXactMemberSlruDesc)
/*
* MultiXact state shared across all backends. All this state is protected
@@ -220,6 +226,15 @@ static MultiXactStateData *MultiXactState;
static MultiXactId *OldestMemberMXactId;
static MultiXactId *OldestVisibleMXactId;
+static void MultiXactShmemRequest(void *arg);
+static void MultiXactShmemInit(void *arg);
+static void MultiXactShmemAttach(void *arg);
+
+const ShmemCallbacks MultiXactShmemCallbacks = {
+ .request_fn = MultiXactShmemRequest,
+ .init_fn = MultiXactShmemInit,
+ .attach_fn = MultiXactShmemAttach,
+};
static inline MultiXactId *
MyOldestMemberMXactIdSlot(void)
@@ -321,10 +336,6 @@ typedef struct MultiXactMemberSlruReadContext
MultiXactOffset offset;
} MultiXactMemberSlruReadContext;
-static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
-static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
-static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
-static int MultiXactMemberIoErrorDetail(const void *opaque_data);
static void ExtendMultiXactOffset(MultiXactId multi);
static void ExtendMultiXactMember(MultiXactOffset offset, int nmembers);
static void SetOldestOffset(void);
@@ -1747,80 +1758,83 @@ multixact_twophase_postabort(FullTransactionId fxid, uint16 info,
multixact_twophase_postcommit(fxid, info, recdata, len);
}
+
/*
- * Initialization of shared memory for MultiXact.
- *
- * MultiXactSharedStateShmemSize() calculates the size of the MultiXactState
- * struct, and the two per-backend MultiXactId arrays. They are carved out of
- * the same allocation. MultiXactShmemSize() additionally includes the memory
- * needed for the two SLRU areas.
+ * Register shared memory needs for MultiXact.
*/
-static Size
-MultiXactSharedStateShmemSize(void)
+static void
+MultiXactShmemRequest(void *arg)
{
+ static ShmemStructDesc MultiXactShmemDesc;
Size size;
+ /*
+ * Calculate the size of the MultiXactState struct, and the two
+ * per-backend MultiXactId arrays. They are carved out of the same
+ * allocation.
+ */
size = offsetof(MultiXactStateData, perBackendXactIds);
size = add_size(size,
mul_size(sizeof(MultiXactId), NumMemberSlots));
size = add_size(size,
mul_size(sizeof(MultiXactId), NumVisibleSlots));
- return size;
-}
+ ShmemRequestStruct(&MultiXactShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Shared MultiXact State",
+ .size = size,
+ .ptr = (void **) &MultiXactState,
+ });
-Size
-MultiXactShmemSize(void)
-{
- Size size;
+ SimpleLruRequest(&MultiXactOffsetSlruDesc, &(SlruRequestOpts) {
+ .name = "multixact_offset",
+ .Dir = "pg_multixact/offsets",
+ .long_segment_names = false,
- size = MultiXactSharedStateShmemSize();
- size = add_size(size, SimpleLruShmemSize(multixact_offset_buffers, 0));
- size = add_size(size, SimpleLruShmemSize(multixact_member_buffers, 0));
+ .nslots = multixact_offset_buffers,
- return size;
-}
+ .sync_handler = SYNC_HANDLER_MULTIXACT_OFFSET,
+ .PagePrecedes = MultiXactOffsetPagePrecedes,
+ .errdetail_for_io_error = MultiXactOffsetIoErrorDetail,
-void
-MultiXactShmemInit(void)
-{
- bool found;
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTOFFSET_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTOFFSET_SLRU,
+ });
- debug_elog2(DEBUG2, "Shared Memory Init for MultiXact");
+ SimpleLruRequest(&MultiXactMemberSlruDesc, &(SlruRequestOpts) {
+ .name = "multixact_member",
+ .Dir = "pg_multixact/members",
+ .long_segment_names = true,
- MultiXactOffsetCtl->PagePrecedes = MultiXactOffsetPagePrecedes;
- MultiXactMemberCtl->PagePrecedes = MultiXactMemberPagePrecedes;
- MultiXactOffsetCtl->errdetail_for_io_error = MultiXactOffsetIoErrorDetail;
- MultiXactMemberCtl->errdetail_for_io_error = MultiXactMemberIoErrorDetail;
+ .nslots = multixact_member_buffers,
- SimpleLruInit(MultiXactOffsetCtl,
- "multixact_offset", multixact_offset_buffers, 0,
- "pg_multixact/offsets", LWTRANCHE_MULTIXACTOFFSET_BUFFER,
- LWTRANCHE_MULTIXACTOFFSET_SLRU,
- SYNC_HANDLER_MULTIXACT_OFFSET,
- false);
- SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
- SimpleLruInit(MultiXactMemberCtl,
- "multixact_member", multixact_member_buffers, 0,
- "pg_multixact/members", LWTRANCHE_MULTIXACTMEMBER_BUFFER,
- LWTRANCHE_MULTIXACTMEMBER_SLRU,
- SYNC_HANDLER_MULTIXACT_MEMBER,
- true);
- /* doesn't call SimpleLruTruncate() or meet criteria for unit tests */
-
- /* Initialize our shared state struct */
- MultiXactState = ShmemInitStruct("Shared MultiXact State",
- MultiXactSharedStateShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+ .sync_handler = SYNC_HANDLER_MULTIXACT_MEMBER,
+ .PagePrecedes = MultiXactMemberPagePrecedes,
+ .errdetail_for_io_error = MultiXactMemberIoErrorDetail,
- /* Make sure we zero out the per-backend state */
- MemSet(MultiXactState, 0, MultiXactSharedStateShmemSize());
- }
- else
- Assert(found);
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTMEMBER_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTMEMBER_SLRU,
+ });
+ /*
+ * The members SLRU doesn't call SimpleLruTruncate() or meet the criteria
+ * for SlruPagePrecedesUnitTests().
+ */
+}
+
+static void
+MultiXactShmemInit(void *arg)
+{
+ /*
+ * Set up array pointers.
+ */
+ OldestMemberMXactId = MultiXactState->perBackendXactIds;
+ OldestVisibleMXactId = OldestMemberMXactId + NumMemberSlots;
+
+ SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
+}
+
+static void
+MultiXactShmemAttach(void *arg)
+{
/*
* Set up array pointers.
*/
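A note on the multixact.c hunk above: the shared state struct and the two per-backend arrays are requested as a single allocation, and both the init and the attach callbacks then derive the process-local array pointers from it. The following standalone sketch shows that sizing and pointer arithmetic with a flexible array member. It rests on assumptions: the Demo* and demo_* names are hypothetical, the test allocates ordinary heap memory in place of shared memory, and the overflow checks that add_size() and mul_size() perform in the patch are left out for brevity.

```c
/*
 * Standalone sketch: one allocation holding a header plus two arrays.
 * All Demo* and demo_* names are hypothetical, not part of the patch.
 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t DemoId;

typedef struct DemoState
{
	DemoId		next_id;			/* fixed-size header field */
	DemoId		per_backend_ids[];	/* both arrays are carved from here */
} DemoState;

/* Process-local pointers into the single allocation. */
static DemoState *demo_state;
static DemoId *demo_member_ids;
static DemoId *demo_visible_ids;

/* Size of the header plus both arrays (no overflow checking here). */
static size_t
demo_state_size(size_t nmember, size_t nvisible)
{
	size_t		size = offsetof(DemoState, per_backend_ids);

	size += sizeof(DemoId) * nmember;
	size += sizeof(DemoId) * nvisible;
	return size;
}

/*
 * Derive the array pointers. Because the pointers are process-local,
 * every process that maps the area has to run this, which is why the
 * same step appears in both an init and an attach callback.
 */
static void
demo_set_pointers(DemoState *state, size_t nmember)
{
	demo_state = state;
	demo_member_ids = state->per_backend_ids;
	demo_visible_ids = demo_member_ids + nmember;
}
```

The second array starts immediately after the first, so the only value a process needs in order to find it is the length of the first array.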
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index a2bb8fa8033..c3fe00d4039 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -71,6 +71,7 @@
#include "storage/fd.h"
#include "storage/shmem.h"
#include "utils/guc.h"
+#include "utils/memutils.h"
#include "utils/wait_event.h"
/*
@@ -89,9 +90,9 @@
* dir/123456 for [2^20, 2^24-1]
*/
static inline int
-SlruFileName(SlruCtl ctl, char *path, int64 segno)
+SlruFileName(SlruDesc *ctl, char *path, int64 segno)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
{
/*
* We could use 16 characters here but the disadvantage would be that
@@ -101,7 +102,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* that in the future we can't decrease SLRU_PAGES_PER_SEGMENT easily.
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFFFFFFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->Dir, segno);
+ return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->options.Dir, segno);
}
else
{
@@ -110,7 +111,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* integers are allowed. See SlruCorrectSegmentFilenameLength()
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
+ return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->options.Dir,
(unsigned int) segno);
}
}
@@ -176,19 +177,19 @@ static SlruErrorCause slru_errcause;
static int slru_errno;
-static void SimpleLruZeroLSNs(SlruCtl ctl, int slotno);
-static void SimpleLruWaitIO(SlruCtl ctl, int slotno);
-static void SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata);
-static bool SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno);
-static bool SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno,
+static void SimpleLruZeroLSNs(SlruDesc *ctl, int slotno);
+static void SimpleLruWaitIO(SlruDesc *ctl, int slotno);
+static void SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata);
+static bool SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno);
+static bool SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno,
SlruWriteAll fdata);
-static void SlruReportIOError(SlruCtl ctl, int64 pageno,
+static void SlruReportIOError(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-static int SlruSelectLRUPage(SlruCtl ctl, int64 pageno);
+static int SlruSelectLRUPage(SlruDesc *ctl, int64 pageno);
-static bool SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename,
+static bool SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-static void SlruInternalDeleteSegment(SlruCtl ctl, int64 segno);
+static void SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno);
static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
@@ -196,7 +197,7 @@ static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
* Initialization of shared memory
*/
-Size
+static Size
SimpleLruShmemSize(int nslots, int nlsns)
{
int nbanks = nslots / SLRU_BANK_SIZE;
@@ -238,120 +239,134 @@ SimpleLruAutotuneBuffers(int divisor, int max)
}
/*
- * Initialize, or attach to, a simple LRU cache in shared memory.
- *
- * ctl: address of local (unshared) control structure.
- * name: name of SLRU. (This is user-visible, pick with care!)
- * nslots: number of page slots to use.
- * nlsns: number of LSN groups per page (set to zero if not relevant).
- * subdir: PGDATA-relative subdirectory that will contain the files.
- * buffer_tranche_id: tranche ID to use for the SLRU's per-buffer LWLocks.
- * bank_tranche_id: tranche ID to use for the bank LWLocks.
- * sync_handler: which set of functions to use to handle sync requests
- * long_segment_names: use short or long segment names
+ * Register a simple LRU cache in shared memory.
*/
void
-SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id, int bank_tranche_id,
- SyncRequestHandler sync_handler, bool long_segment_names)
+SimpleLruRequest(SlruDesc *desc, const SlruRequestOpts *options)
{
+ SlruRequestOpts *options_copy;
+
+ Assert(options->name != NULL);
+ Assert(options->nslots > 0);
+ Assert(options->PagePrecedes != NULL);
+ Assert(options->errdetail_for_io_error != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(SlruRequestOpts));
+ memcpy(options_copy, options, sizeof(SlruRequestOpts));
+
+ options_copy->base.name = options->name;
+ options_copy->base.size = SimpleLruShmemSize(options_copy->nslots, options_copy->nlsns);
+
+ ShmemRequestInternal(&desc->base, &options_copy->base, SHMEM_KIND_SLRU);
+}
+
+/* Initialize locks and shared memory area */
+void
+shmem_slru_init(ShmemStructDesc *base_desc, const ShmemRequestStructOpts *base_options)
+{
+ const SlruRequestOpts *options = (SlruRequestOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) base_desc;
+ char namebuf[NAMEDATALEN];
SlruShared shared;
- bool found;
+ int nslots = options->nslots;
int nbanks = nslots / SLRU_BANK_SIZE;
+ int nlsns = options->nlsns;
+ char *ptr;
+ Size offset;
+
+ shared = desc->shared = (SlruShared) desc->base.ptr;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruRequestOpts));
+
+ /* assign new tranche IDs, if not given */
+ if (desc->options.buffer_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s buffer", desc->options.name);
+ desc->options.buffer_tranche_id = LWLockNewTrancheId(namebuf);
+ }
+ if (desc->options.bank_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s bank", desc->options.name);
+ desc->options.bank_tranche_id = LWLockNewTrancheId(namebuf);
+ }
Assert(nslots <= SLRU_MAX_ALLOWED_BUFFERS);
- Assert(ctl->PagePrecedes != NULL);
- Assert(ctl->errdetail_for_io_error != NULL);
+ memset(shared, 0, sizeof(SlruSharedData));
- shared = (SlruShared) ShmemInitStruct(name,
- SimpleLruShmemSize(nslots, nlsns),
- &found);
+ shared->num_slots = nslots;
+ shared->lsn_groups_per_page = nlsns;
- if (!IsUnderPostmaster)
- {
- /* Initialize locks and shared memory area */
- char *ptr;
- Size offset;
-
- Assert(!found);
-
- memset(shared, 0, sizeof(SlruSharedData));
-
- shared->num_slots = nslots;
- shared->lsn_groups_per_page = nlsns;
-
- pg_atomic_init_u64(&shared->latest_page_number, 0);
-
- shared->slru_stats_idx = pgstat_get_slru_index(name);
-
- ptr = (char *) shared;
- offset = MAXALIGN(sizeof(SlruSharedData));
- shared->page_buffer = (char **) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(char *));
- shared->page_status = (SlruPageStatus *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
- shared->page_dirty = (bool *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(bool));
- shared->page_number = (int64 *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int64));
- shared->page_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int));
-
- /* Initialize LWLocks */
- shared->buffer_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(LWLockPadded));
- shared->bank_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
- shared->bank_cur_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(int));
-
- if (nlsns > 0)
- {
- shared->group_lsn = (XLogRecPtr *) (ptr + offset);
- offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
- }
+ pg_atomic_init_u64(&shared->latest_page_number, 0);
- ptr += BUFFERALIGN(offset);
- for (int slotno = 0; slotno < nslots; slotno++)
- {
- LWLockInitialize(&shared->buffer_locks[slotno].lock,
- buffer_tranche_id);
+ shared->slru_stats_idx = pgstat_get_slru_index(desc->options.name);
- shared->page_buffer[slotno] = ptr;
- shared->page_status[slotno] = SLRU_PAGE_EMPTY;
- shared->page_dirty[slotno] = false;
- shared->page_lru_count[slotno] = 0;
- ptr += BLCKSZ;
- }
+ ptr = (char *) shared;
+ offset = MAXALIGN(sizeof(SlruSharedData));
+ shared->page_buffer = (char **) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(char *));
+ shared->page_status = (SlruPageStatus *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
+ shared->page_dirty = (bool *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(bool));
+ shared->page_number = (int64 *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int64));
+ shared->page_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int));
- /* Initialize the slot banks. */
- for (int bankno = 0; bankno < nbanks; bankno++)
- {
- LWLockInitialize(&shared->bank_locks[bankno].lock, bank_tranche_id);
- shared->bank_cur_lru_count[bankno] = 0;
- }
+ /* Initialize LWLocks */
+ shared->buffer_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(LWLockPadded));
+ shared->bank_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
+ shared->bank_cur_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(int));
- /* Should fit to estimated shmem size */
- Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+ if (nlsns > 0)
+ {
+ shared->group_lsn = (XLogRecPtr *) (ptr + offset);
+ offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
}
- else
+
+ ptr += BUFFERALIGN(offset);
+ for (int slotno = 0; slotno < nslots; slotno++)
{
- Assert(found);
- Assert(shared->num_slots == nslots);
+ LWLockInitialize(&shared->buffer_locks[slotno].lock,
+ desc->options.buffer_tranche_id);
+
+ shared->page_buffer[slotno] = ptr;
+ shared->page_status[slotno] = SLRU_PAGE_EMPTY;
+ shared->page_dirty[slotno] = false;
+ shared->page_lru_count[slotno] = 0;
+ ptr += BLCKSZ;
}
- /*
- * Initialize the unshared control struct, including directory path. We
- * assume caller set PagePrecedes.
- */
- ctl->shared = shared;
- ctl->sync_handler = sync_handler;
- ctl->long_segment_names = long_segment_names;
- ctl->nbanks = nbanks;
- strlcpy(ctl->Dir, subdir, sizeof(ctl->Dir));
+ /* Initialize the slot banks. */
+ for (int bankno = 0; bankno < nbanks; bankno++)
+ {
+ LWLockInitialize(&shared->bank_locks[bankno].lock, desc->options.bank_tranche_id);
+ shared->bank_cur_lru_count[bankno] = 0;
+ }
+
+ /* Should fit to estimated shmem size */
+ Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+}
+
+void
+shmem_slru_attach(ShmemStructDesc *base_desc, const ShmemRequestStructOpts *base_options)
+{
+ const SlruRequestOpts *options = (SlruRequestOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) base_desc;
+ int nslots = options->nslots;
+ int nbanks = nslots / SLRU_BANK_SIZE;
+
+ desc->shared = (SlruShared) desc->base.ptr;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruRequestOpts));
}
+
/*
* Helper function for GUC check_hook to check whether slru buffers are in
* multiples of SLRU_BANK_SIZE.
@@ -377,7 +392,7 @@ check_slru_buffers(const char *name, int *newval)
* Bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
int slotno;
@@ -430,7 +445,7 @@ SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
* This assumes that InvalidXLogRecPtr is bitwise-all-0.
*/
static void
-SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
+SimpleLruZeroLSNs(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
@@ -446,7 +461,7 @@ SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
* SLRU bank lock is acquired and released here.
*/
void
-SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno)
{
int slotno;
LWLock *lock;
@@ -472,7 +487,7 @@ SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SimpleLruWaitIO(SlruCtl ctl, int slotno)
+SimpleLruWaitIO(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
int bankno = SlotGetBankNumber(slotno);
@@ -530,7 +545,7 @@ SimpleLruWaitIO(SlruCtl ctl, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data)
{
SlruShared shared = ctl->shared;
@@ -634,7 +649,7 @@ SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
* It is unspecified whether the lock will be shared or exclusive.
*/
int
-SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
SlruShared shared = ctl->shared;
LWLock *banklock = SimpleLruGetBankLock(ctl, pageno);
@@ -681,7 +696,7 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
+SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 pageno = shared->page_number[slotno];
@@ -761,7 +776,7 @@ SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
* fdata is always passed a NULL here.
*/
void
-SimpleLruWritePage(SlruCtl ctl, int slotno)
+SimpleLruWritePage(SlruDesc *ctl, int slotno)
{
Assert(ctl->shared->page_status[slotno] != SLRU_PAGE_EMPTY);
@@ -775,7 +790,7 @@ SimpleLruWritePage(SlruCtl ctl, int slotno)
* large enough to contain the given page.
*/
bool
-SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
+SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -833,7 +848,7 @@ SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
* read/write operations. We could cache one virtual file pointer ...
*/
static bool
-SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
+SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -905,7 +920,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
* SimpleLruWriteAll.
*/
static bool
-SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
+SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -1037,11 +1052,11 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
pgstat_report_wait_end();
/* Queue up a sync request for the checkpointer. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
if (!RegisterSyncRequest(&tag, SYNC_REQUEST, false))
{
/* No space to enqueue sync request. Do it synchronously. */
@@ -1077,7 +1092,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
* SlruPhysicalWritePage. Call this after cleaning up shared-memory state.
*/
static void
-SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SlruReportIOError(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -1092,14 +1107,14 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open file \"%s\": %m", path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_SEEK_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not seek in file \"%s\" to offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_READ_FAILED:
if (errno)
@@ -1107,12 +1122,12 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("could not read from file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("could not read from file \"%s\" at offset %d: read too few bytes",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_WRITE_FAILED:
if (errno)
@@ -1120,26 +1135,26 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("Could not write to file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("Could not write to file \"%s\" at offset %d: wrote too few bytes.",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_FSYNC_FAILED:
ereport(data_sync_elevel(ERROR),
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_CLOSE_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not close file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
default:
/* can't get here, we trust */
@@ -1199,7 +1214,7 @@ SlruRecentlyUsed(SlruShared shared, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
static int
-SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
+SlruSelectLRUPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
@@ -1291,8 +1306,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_valid_delta ||
(this_delta == best_valid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_valid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_valid_page_number)))
{
bestvalidslot = slotno;
best_valid_delta = this_delta;
@@ -1303,8 +1318,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_invalid_delta ||
(this_delta == best_invalid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_invalid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_invalid_page_number)))
{
bestinvalidslot = slotno;
best_invalid_delta = this_delta;
@@ -1352,7 +1367,7 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
* entries are on disk.
*/
void
-SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
+SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied)
{
SlruShared shared = ctl->shared;
SlruWriteAllData fdata;
@@ -1422,8 +1437,8 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
SlruReportIOError(ctl, pageno, NULL);
/* Ensure that directory entries for new files are on disk. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
- fsync_fname(ctl->Dir, true);
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
+ fsync_fname(ctl->options.Dir, true);
}
/*
@@ -1438,7 +1453,7 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
* after it has accrued freshly-written data.
*/
void
-SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage)
+SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage)
{
SlruShared shared = ctl->shared;
int prevbank;
@@ -1460,12 +1475,12 @@ restart:
* bugs elsewhere in SLRU handling, so we don't care if we read a slightly
* outdated value; therefore we don't add a memory barrier.
*/
- if (ctl->PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
- cutoffPage))
+ if (ctl->options.PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
+ cutoffPage))
{
ereport(LOG,
(errmsg("could not truncate directory \"%s\": apparent wraparound",
- ctl->Dir)));
+ ctl->options.Dir)));
return;
}
@@ -1488,7 +1503,7 @@ restart:
if (shared->page_status[slotno] == SLRU_PAGE_EMPTY)
continue;
- if (!ctl->PagePrecedes(shared->page_number[slotno], cutoffPage))
+ if (!ctl->options.PagePrecedes(shared->page_number[slotno], cutoffPage))
continue;
/*
@@ -1533,16 +1548,16 @@ restart:
* they either can't yet contain anything, or have already been cleaned out.
*/
static void
-SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
+SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno)
{
char path[MAXPGPATH];
/* Forget any fsync requests queued for this segment. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true);
}
@@ -1556,7 +1571,7 @@ SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
* Delete an individual SLRU segment, identified by the segment number.
*/
void
-SlruDeleteSegment(SlruCtl ctl, int64 segno)
+SlruDeleteSegment(SlruDesc *ctl, int64 segno)
{
SlruShared shared = ctl->shared;
int prevbank = SlotGetBankNumber(0);
@@ -1633,19 +1648,19 @@ restart:
* first>=cutoff && last>=cutoff: no; every page of this segment is too young
*/
static bool
-SlruMayDeleteSegment(SlruCtl ctl, int64 segpage, int64 cutoffPage)
+SlruMayDeleteSegment(SlruDesc *ctl, int64 segpage, int64 cutoffPage)
{
int64 seg_last_page = segpage + SLRU_PAGES_PER_SEGMENT - 1;
Assert(segpage % SLRU_PAGES_PER_SEGMENT == 0);
- return (ctl->PagePrecedes(segpage, cutoffPage) &&
- ctl->PagePrecedes(seg_last_page, cutoffPage));
+ return (ctl->options.PagePrecedes(segpage, cutoffPage) &&
+ ctl->options.PagePrecedes(seg_last_page, cutoffPage));
}
#ifdef USE_ASSERT_CHECKING
static void
-SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
+SlruPagePrecedesTestOffset(SlruDesc *ctl, int per_page, uint32 offset)
{
TransactionId lhs,
rhs;
@@ -1654,6 +1669,9 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
TransactionId newestXact,
oldestXact;
+ /* This must be called after the SLRU has been initialized */
+ Assert(ctl->options.PagePrecedes);
+
/*
* Compare an XID pair having undefined order (see RFC 1982), a pair at
* "opposite ends" of the XID space. TransactionIdPrecedes() treats each
@@ -1670,19 +1688,19 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
Assert(!TransactionIdPrecedes(rhs, lhs + 1));
Assert(!TransactionIdFollowsOrEquals(lhs, rhs));
Assert(!TransactionIdFollowsOrEquals(rhs, lhs));
- Assert(!ctl->PagePrecedes(lhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes(lhs / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
|| (1U << 31) % per_page != 0); /* See CommitTsPagePrecedes() */
- Assert(ctl->PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
+ Assert(ctl->options.PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
|| (1U << 31) % per_page != 0);
- Assert(ctl->PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
/*
* GetNewTransactionId() has assigned the last XID it can safely use, and
@@ -1727,7 +1745,7 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
* do not apply to them.)
*/
void
-SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
+SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page)
{
/* Test first, middle and last entries of a page. */
SlruPagePrecedesTestOffset(ctl, per_page, 0);
@@ -1742,7 +1760,7 @@ SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
* one containing the page passed as "data".
*/
bool
-SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1758,7 +1776,7 @@ SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes segments prior to the one passed in as "data".
*/
static bool
-SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1774,7 +1792,7 @@ SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes all segments.
*/
bool
-SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
+SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
SlruInternalDeleteSegment(ctl, segpage / SLRU_PAGES_PER_SEGMENT);
@@ -1788,9 +1806,9 @@ SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
* SLRU segment.
*/
static inline bool
-SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
+SlruCorrectSegmentFilenameLength(SlruDesc *ctl, size_t len)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
return (len == 15); /* see SlruFileName() */
else
@@ -1821,7 +1839,7 @@ SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
* Note that no locking is applied.
*/
bool
-SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
+SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data)
{
bool retval = false;
DIR *cldir;
@@ -1829,8 +1847,8 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
int64 segno;
int64 segpage;
- cldir = AllocateDir(ctl->Dir);
- while ((clde = ReadDir(cldir, ctl->Dir)) != NULL)
+ cldir = AllocateDir(ctl->options.Dir);
+ while ((clde = ReadDir(cldir, ctl->options.Dir)) != NULL)
{
size_t len;
@@ -1843,7 +1861,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
segpage = segno * SLRU_PAGES_PER_SEGMENT;
elog(DEBUG2, "SlruScanDirectory invoking callback on %s/%s",
- ctl->Dir, clde->d_name);
+ ctl->options.Dir, clde->d_name);
retval = callback(ctl, clde->d_name, segpage, data);
if (retval)
break;
@@ -1861,7 +1879,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
* performs the fsync.
*/
int
-SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path)
+SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path)
{
int fd;
int save_errno;
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index c6ce71fc703..469ac02a1a3 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -33,6 +33,7 @@
#include "access/transam.h"
#include "miscadmin.h"
#include "pg_trace.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/snapmgr.h"
@@ -66,16 +67,22 @@ TransactionIdToPage(TransactionId xid)
#define TransactionIdToEntry(xid) ((xid) % (TransactionId) SUBTRANS_XACTS_PER_PAGE)
+static void SUBTRANSShmemRequest(void *arg);
+static void SUBTRANSShmemInit(void *arg);
+static bool SubTransPagePrecedes(int64 page1, int64 page2);
+static int subtrans_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks SUBTRANSShmemCallbacks = {
+ .request_fn = SUBTRANSShmemRequest,
+ .init_fn = SUBTRANSShmemInit,
+};
+
/*
* Link to shared-memory data structures for SUBTRANS control
*/
-static SlruCtlData SubTransCtlData;
-
-#define SubTransCtl (&SubTransCtlData)
+static SlruDesc SubTransSlruDesc;
-
-static bool SubTransPagePrecedes(int64 page1, int64 page2);
-static int subtrans_errdetail_for_io_error(const void *opaque_data);
+#define SubTransCtl (&SubTransSlruDesc)
/*
@@ -207,17 +214,13 @@ SUBTRANSShmemBuffers(void)
return Min(Max(16, subtransaction_buffers), SLRU_MAX_ALLOWED_BUFFERS);
}
+
+
/*
- * Initialization of shared memory for SUBTRANS
+ * Register shared memory for SUBTRANS
*/
-Size
-SUBTRANSShmemSize(void)
-{
- return SimpleLruShmemSize(SUBTRANSShmemBuffers(), 0);
-}
-
-void
-SUBTRANSShmemInit(void)
+static void
+SUBTRANSShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (subtransaction_buffers == 0)
@@ -240,11 +243,25 @@ SUBTRANSShmemInit(void)
}
Assert(subtransaction_buffers != 0);
- SubTransCtl->PagePrecedes = SubTransPagePrecedes;
- SubTransCtl->errdetail_for_io_error = subtrans_errdetail_for_io_error;
- SimpleLruInit(SubTransCtl, "subtransaction", SUBTRANSShmemBuffers(), 0,
- "pg_subtrans", LWTRANCHE_SUBTRANS_BUFFER,
- LWTRANCHE_SUBTRANS_SLRU, SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(&SubTransSlruDesc, &(SlruRequestOpts) {
+ .name = "subtransaction",
+ .Dir = "pg_subtrans",
+ .long_segment_names = false,
+
+ .nslots = SUBTRANSShmemBuffers(),
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SubTransPagePrecedes,
+ .errdetail_for_io_error = subtrans_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SUBTRANS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SUBTRANS_SLRU,
+ });
+}
+
+static void
+SUBTRANSShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(SubTransCtl, SUBTRANS_XACTS_PER_PAGE);
}
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index 5c9a56c3d40..6f4639efea4 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -179,6 +179,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/dsa.h"
@@ -345,6 +346,15 @@ typedef struct AsyncQueueControl
static AsyncQueueControl *asyncQueueControl;
+static void AsyncShmemRequest(void *arg);
+static void AsyncShmemInit(void *arg);
+
+const ShmemCallbacks AsyncShmemCallbacks = {
+ .request_fn = AsyncShmemRequest,
+ .init_fn = AsyncShmemInit,
+};
+
+
#define QUEUE_HEAD (asyncQueueControl->head)
#define QUEUE_TAIL (asyncQueueControl->tail)
#define QUEUE_STOP_PAGE (asyncQueueControl->stopPage)
@@ -359,9 +369,13 @@ static AsyncQueueControl *asyncQueueControl;
/*
* The SLRU buffer area through which we access the notification queue
*/
-static SlruCtlData NotifyCtlData;
+static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
+static int asyncQueueErrdetailForIoError(const void *opaque_data);
+
+static SlruDesc NotifySlruDesc;
-#define NotifyCtl (&NotifyCtlData)
+
+#define NotifyCtl (&NotifySlruDesc)
#define QUEUE_PAGESIZE BLCKSZ
#define QUEUE_FULL_WARN_INTERVAL 5000 /* warn at most once every 5s */
@@ -570,9 +584,7 @@ bool Trace_notify = false;
int max_notify_queue_pages = 1048576;
/* local function prototypes */
-static int asyncQueueErrdetailForIoError(const void *opaque_data);
static inline int64 asyncQueuePageDiff(int64 p, int64 q);
-static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
static inline void GlobalChannelKeyInit(GlobalChannelKey *key, Oid dboid,
const char *channel);
static dshash_hash globalChannelTableHash(const void *key, size_t size,
@@ -780,78 +792,65 @@ initPendingListenActions(void)
}
/*
- * Report space needed for our shared memory area
+ * Register our shared memory needs
*/
-Size
-AsyncShmemSize(void)
+static void
+AsyncShmemRequest(void *arg)
{
+ static ShmemStructDesc AsyncQueueControlShmemDesc;
Size size;
- /* This had better match AsyncShmemInit */
size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
size = add_size(size, offsetof(AsyncQueueControl, backend));
- size = add_size(size, SimpleLruShmemSize(notify_buffers, 0));
+ ShmemRequestStruct(&AsyncQueueControlShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Async Queue Control",
+ .size = size,
+ .ptr = (void **) &asyncQueueControl,
+ });
- return size;
-}
+ SimpleLruRequest(&NotifySlruDesc, &(SlruRequestOpts) {
+ .name = "notify",
+ .Dir = "pg_notify",
-/*
- * Initialize our shared memory area
- */
-void
-AsyncShmemInit(void)
-{
- bool found;
- Size size;
+ /* long segment names are used in order to avoid wraparound */
+ .long_segment_names = true,
- /*
- * Create or attach to the AsyncQueueControl structure.
- */
- size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
- size = add_size(size, offsetof(AsyncQueueControl, backend));
+ .nslots = notify_buffers,
- asyncQueueControl = (AsyncQueueControl *)
- ShmemInitStruct("Async Queue Control", size, &found);
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = asyncQueuePagePrecedes,
+ .errdetail_for_io_error = asyncQueueErrdetailForIoError,
- if (!found)
+ .buffer_tranche_id = LWTRANCHE_NOTIFY_BUFFER,
+ .bank_tranche_id = LWTRANCHE_NOTIFY_SLRU,
+ });
+}
+
+static void
+AsyncShmemInit(void *arg)
+{
+ SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
+ SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
+ QUEUE_STOP_PAGE = 0;
+ QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
+ asyncQueueControl->lastQueueFillWarn = 0;
+ asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
+ asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
+ for (int i = 0; i < MaxBackends; i++)
{
- /* First time through, so initialize it */
- SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
- SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
- QUEUE_STOP_PAGE = 0;
- QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
- asyncQueueControl->lastQueueFillWarn = 0;
- asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
- asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
- for (int i = 0; i < MaxBackends; i++)
- {
- QUEUE_BACKEND_PID(i) = InvalidPid;
- QUEUE_BACKEND_DBOID(i) = InvalidOid;
- QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
- SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
- QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
- QUEUE_BACKEND_IS_ADVANCING(i) = false;
- }
+ QUEUE_BACKEND_PID(i) = InvalidPid;
+ QUEUE_BACKEND_DBOID(i) = InvalidOid;
+ QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
+ SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
+ QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
+ QUEUE_BACKEND_IS_ADVANCING(i) = false;
}
/*
- * Set up SLRU management of the pg_notify data. Note that long segment
- * names are used in order to avoid wraparound.
+ * During start or reboot, clean out the pg_notify directory.
*/
- NotifyCtl->PagePrecedes = asyncQueuePagePrecedes;
- NotifyCtl->errdetail_for_io_error = asyncQueueErrdetailForIoError;
- SimpleLruInit(NotifyCtl, "notify", notify_buffers, 0,
- "pg_notify", LWTRANCHE_NOTIFY_BUFFER, LWTRANCHE_NOTIFY_SLRU,
- SYNC_HANDLER_NONE, true);
-
- if (!found)
- {
- /*
- * During start or reboot, clean out the pg_notify directory.
- */
- (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
- }
+ (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0a4e9ee6502..32255ead2bf 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -98,16 +98,11 @@ CalculateShmemSize(void)
/* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
- size = add_size(size, PredicateLockShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, CLOGShmemSize());
- size = add_size(size, CommitTsShmemSize());
- size = add_size(size, SUBTRANSShmemSize());
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, MultiXactShmemSize());
size = add_size(size, LWLockShmemSize());
size = add_size(size, BackendStatusShmemSize());
size = add_size(size, CheckpointerShmemSize());
@@ -121,7 +116,6 @@ CalculateShmemSize(void)
size = add_size(size, ApplyLauncherShmemSize());
size = add_size(size, BTreeShmemSize());
size = add_size(size, SyncScanShmemSize());
- size = add_size(size, AsyncShmemSize());
size = add_size(size, StatsShmemSize());
size = add_size(size, SlotSyncShmemSize());
size = add_size(size, AioShmemSize());
@@ -279,10 +273,6 @@ CreateOrAttachShmemStructs(void)
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
- CLOGShmemInit();
- CommitTsShmemInit();
- SUBTRANSShmemInit();
- MultiXactShmemInit();
BufferManagerShmemInit();
/*
@@ -290,11 +280,6 @@ CreateOrAttachShmemStructs(void)
*/
LockManagerShmemInit();
- /*
- * Set up predicate lock manager
- */
- PredicateLockShmemInit();
-
/*
* Set up process table
*/
@@ -321,7 +306,6 @@ CreateOrAttachShmemStructs(void)
*/
BTreeShmemInit();
SyncScanShmemInit();
- AsyncShmemInit();
StatsShmemInit();
AioShmemInit();
WaitLSNShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 84099ce78fe..595f1e582d5 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -135,6 +135,7 @@
#include <unistd.h>
+#include "access/slru.h"
#include "common/int.h"
#include "fmgr.h"
#include "funcapi.h"
@@ -539,6 +540,9 @@ AttachOrInit(ShmemRequest *request, bool init_allowed, bool attach_allowed)
case SHMEM_KIND_HASH:
shmem_hash_attach(desc, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_attach(desc, request->options);
+ break;
}
}
else if (!init_allowed)
@@ -593,6 +597,9 @@ AttachOrInit(ShmemRequest *request, bool init_allowed, bool attach_allowed)
case SHMEM_KIND_HASH:
shmem_hash_init(desc, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_init(desc, request->options);
+ break;
}
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 4f80fc73639..6475b50a2ab 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -152,10 +152,6 @@
/*
* INTERFACE ROUTINES
*
- * housekeeping for setting up shared memory predicate lock structures
- * PredicateLockShmemInit(void)
- * PredicateLockShmemSize(void)
- *
* predicate lock reporting
* GetPredicateLockStatusData(void)
* PageIsPredicateLocked(Relation relation, BlockNumber blkno)
@@ -211,6 +207,8 @@
#include "storage/predicate_internals.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -322,9 +320,12 @@
/*
* The SLRU buffer area through which we access the old xids.
*/
-static SlruCtlData SerialSlruCtlData;
+static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
+static int serial_errdetail_for_io_error(const void *opaque_data);
-#define SerialSlruCtl (&SerialSlruCtlData)
+static SlruDesc SerialSlruDesc;
+
+#define SerialSlruCtl (&SerialSlruDesc)
#define SERIAL_PAGESIZE BLCKSZ
#define SERIAL_ENTRYSIZE sizeof(SerCommitSeqNo)
@@ -384,6 +385,17 @@ int max_predicate_locks_per_page; /* in guc_tables.c */
*/
static PredXactList PredXact;
+static void PredicateLockShmemRequest(void *arg);
+static void PredicateLockShmemInit(void *arg);
+static void PredicateLockShmemAttach(void *arg);
+
+const ShmemCallbacks PredicateLockShmemCallbacks = {
+ .request_fn = PredicateLockShmemRequest,
+ .init_fn = PredicateLockShmemInit,
+ .attach_fn = PredicateLockShmemAttach,
+};
+
+
/*
* This provides a pool of RWConflict data elements to use in conflict lists
* between transactions.
@@ -431,6 +443,16 @@ static bool MyXactDidWrite = false;
*/
static SERIALIZABLEXACT *SavedSerializableXact = InvalidSerializableXact;
+static ShmemStructDesc PredXactListShmemDesc;
+
+static int64 max_serializable_xacts;
+
+static ShmemStructDesc RWConflictPoolShmemDesc;
+
+static ShmemStructDesc FinishedSerializableShmemDesc;
+
+static ShmemStructDesc SerialControlShmemDesc;
+
/* local functions */
static SERIALIZABLEXACT *CreatePredXact(void);
@@ -442,13 +464,18 @@ static void SetPossibleUnsafeConflict(SERIALIZABLEXACT *roXact, SERIALIZABLEXACT
static void ReleaseRWConflict(RWConflict conflict);
static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
-static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
-static int serial_errdetail_for_io_error(const void *opaque_data);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
static uint32 predicatelock_hash(const void *key, Size keysize);
+
+static ShmemHashDesc SerializableXidHashDesc;
+
+static ShmemHashDesc PredicateLockTargetHashDesc;
+
+static ShmemHashDesc PredicateLockHashDesc;
+
static void SummarizeOldestCommittedSxact(void);
static Snapshot GetSafeSnapshot(Snapshot origSnapshot);
static Snapshot GetSerializableTransactionSnapshotInt(Snapshot snapshot,
@@ -1100,73 +1127,61 @@ CheckPointPredicate(void)
/*------------------------------------------------------------------------*/
/*
- * PredicateLockShmemInit -- Initialize the predicate locking data structures.
- *
- * This is called from CreateSharedMemoryAndSemaphores(), which see for
- * more comments. In the normal postmaster case, the shared hash tables
- * are created here. Backends inherit the pointers
- * to the shared tables via fork(). In the EXEC_BACKEND case, each
- * backend re-executes this code to obtain pointers to the already existing
- * shared hash tables.
+ * PredicateLockShmemRequest -- Register the predicate locking data structures.
*/
-void
-PredicateLockShmemInit(void)
+static void
+PredicateLockShmemRequest(void *arg)
{
- HASHCTL info;
int64 max_predicate_lock_targets;
int64 max_predicate_locks;
- int64 max_serializable_xacts;
int64 max_rw_conflicts;
- Size requestSize;
- bool found;
-
-#ifndef EXEC_BACKEND
- Assert(!IsUnderPostmaster);
-#endif
/*
- * Compute size of predicate lock target hashtable. Note these
- * calculations must agree with PredicateLockShmemSize!
+ * Register the hash tables and other structs used for predicate locking.
+ * The shared memory itself is set up later; PredicateLockShmemInit() and
+ * PredicateLockShmemAttach() do the remaining initialization. Start by
+ * computing the size of the predicate lock target hashtable.
*/
max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
/*
- * Allocate hash table for PREDICATELOCKTARGET structs. This stores
+ * Register hash table for PREDICATELOCKTARGET structs. This stores
* per-predicate-lock-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTARGETTAG);
- info.entrysize = sizeof(PREDICATELOCKTARGET);
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
+ ShmemRequestHash(&PredicateLockTargetHashDesc, &(ShmemRequestHashOpts) {
+ .name = "PREDICATELOCKTARGET hash",
- PredicateLockTargetHash = ShmemInitHash("PREDICATELOCKTARGET hash",
- max_predicate_lock_targets,
- max_predicate_lock_targets,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ .init_size = max_predicate_lock_targets,
+ .max_size = max_predicate_lock_targets,
- /* Pre-calculate the hash and partition lock of the scratch entry */
- ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
- ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
+ .ptr = &PredicateLockTargetHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTARGETTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCKTARGET),
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ });
/*
* Allocate hash table for PREDICATELOCK structs. This stores per
* xact-lock-of-a-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTAG);
- info.entrysize = sizeof(PREDICATELOCK);
- info.hash = predicatelock_hash;
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
/* Assume an average of 2 xacts per target */
max_predicate_locks = max_predicate_lock_targets * 2;
- PredicateLockHash = ShmemInitHash("PREDICATELOCK hash",
- max_predicate_locks,
- max_predicate_locks,
- &info,
- HASH_ELEM | HASH_FUNCTION |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ ShmemRequestHash(&PredicateLockHashDesc, &(ShmemRequestHashOpts) {
+ .name = "PREDICATELOCK hash",
+
+ .init_size = max_predicate_locks,
+ .max_size = max_predicate_locks,
+
+ .ptr = &PredicateLockHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCK),
+ .hash_info.hash = predicatelock_hash,
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION | HASH_FIXED_SIZE,
+ });
/*
* Compute size for serializable transaction hashtable. Note these
@@ -1179,30 +1194,32 @@ PredicateLockShmemInit(void)
max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
/*
- * Allocate a list to hold information on transactions participating in
+ * Register a list to hold information on transactions participating in
* predicate locking.
*/
- requestSize = add_size(PredXactListDataSize,
- (mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT))));
- PredXact = ShmemInitStruct("PredXactList",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(&PredXactListShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "PredXactList",
+ .size = add_size(PredXactListDataSize,
+ (mul_size((Size) max_serializable_xacts,
+ sizeof(SERIALIZABLEXACT)))),
+ .ptr = (void **) &PredXact,
+ });
/*
- * Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
+ * Register hash table for SERIALIZABLEXID structs. This stores per-xid
* information for serializable transactions which have accessed data.
*/
- info.keysize = sizeof(SERIALIZABLEXIDTAG);
- info.entrysize = sizeof(SERIALIZABLEXID);
+ ShmemRequestHash(&SerializableXidHashDesc, &(ShmemRequestHashOpts) {
+ .name = "SERIALIZABLEXID hash",
+
+ .init_size = max_serializable_xacts,
+ .max_size = max_serializable_xacts,
- SerializableXidHash = ShmemInitHash("SERIALIZABLEXID hash",
- max_serializable_xacts,
- max_serializable_xacts,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_FIXED_SIZE);
+ .ptr = &SerializableXidHash,
+ .hash_info.keysize = sizeof(SERIALIZABLEXIDTAG),
+ .hash_info.entrysize = sizeof(SERIALIZABLEXID),
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_FIXED_SIZE,
+ });
/*
* Allocate space for tracking rw-conflicts in lists attached to the
@@ -1217,58 +1234,53 @@ PredicateLockShmemInit(void)
*/
max_rw_conflicts = max_serializable_xacts * 5;
- requestSize = RWConflictPoolHeaderDataSize +
- mul_size((Size) max_rw_conflicts,
- RWConflictDataSize);
+ ShmemRequestStruct(&RWConflictPoolShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "RWConflictPool",
+ .size = RWConflictPoolHeaderDataSize + mul_size((Size) max_rw_conflicts,
+ RWConflictDataSize),
+ .ptr = (void **) &RWConflictPool,
+ });
- RWConflictPool = ShmemInitStruct("RWConflictPool",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
-
- /*
- * Create or attach to the header for the list of finished serializable
- * transactions.
- */
- FinishedSerializableTransactions = (dlist_head *)
- ShmemInitStruct("FinishedSerializableTransactions",
- sizeof(dlist_head),
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(&FinishedSerializableShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "FinishedSerializableTransactions",
+ .size = sizeof(dlist_head),
+ .ptr = (void **) &FinishedSerializableTransactions,
+ });
/*
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(&SerialSlruDesc, &(SlruRequestOpts) {
+ .name = "serializable",
+ .Dir = "pg_serial",
+ .long_segment_names = false,
+
+ .nslots = serializable_buffers,
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SerialPagePrecedesLogically,
+ .errdetail_for_io_error = serial_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SERIAL_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SERIAL_SLRU,
+ });
#ifdef USE_ASSERT_CHECKING
SerialPagePrecedesLogicallyUnitTests();
#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(&SerialControlShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "SerialControlData",
+ .size = sizeof(SerialControlData),
+ .ptr = (void **) &serialControl,
+ });
+}
- /*
- * If we just attached to existing shared memory (EXEC_BACKEND), we're all
- * done. Otherwise, during postmaster startup proceed to initialize the
- * shared memory.
- */
- if (IsUnderPostmaster)
- {
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
- return;
- }
+static void
+PredicateLockShmemInit(void *arg)
+{
+ int64 max_rw_conflicts;
+ bool found;
/*
* Reserve a dummy entry in the hash table; we use it to make sure there's
@@ -1280,7 +1292,6 @@ PredicateLockShmemInit(void)
HASH_ENTER, &found);
Assert(!found);
- /* Initialize PredXact list */
dlist_init(&PredXact->availableList);
dlist_init(&PredXact->activeList);
PredXact->SxactGlobalXmin = InvalidTransactionId;
@@ -1322,6 +1333,9 @@ PredicateLockShmemInit(void)
dlist_init(&RWConflictPool->availableList);
RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
RWConflictPoolHeaderDataSize);
+
+ max_rw_conflicts = max_serializable_xacts * 5;
+
/* Add all elements to available list, clean. */
for (int i = 0; i < max_rw_conflicts; i++)
{
@@ -1338,63 +1352,28 @@ PredicateLockShmemInit(void)
serialControl->headXid = InvalidTransactionId;
serialControl->tailXid = InvalidTransactionId;
LWLockRelease(SerialControlLock);
-}
-
-/*
- * Estimate shared-memory space used for predicate lock table
- */
-Size
-PredicateLockShmemSize(void)
-{
- Size size = 0;
- int64 max_predicate_lock_targets;
- int64 max_predicate_locks;
- int64 max_serializable_xacts;
- int64 max_rw_conflicts;
-
- /* predicate lock target hash table */
- max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
- size = add_size(size, hash_estimate_size(max_predicate_lock_targets,
- sizeof(PREDICATELOCKTARGET)));
-
- /* predicate lock hash table */
- max_predicate_locks = max_predicate_lock_targets * 2;
- size = add_size(size, hash_estimate_size(max_predicate_locks,
- sizeof(PREDICATELOCK)));
- /*
- * Since NPREDICATELOCKTARGETENTS is only an estimate, add 10% safety
- * margin.
- */
- size = add_size(size, size / 10);
-
- /* transaction list */
- max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
- size = add_size(size, PredXactListDataSize);
- size = add_size(size, mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT)));
-
- /* transaction xid table */
- size = add_size(size, hash_estimate_size(max_serializable_xacts,
- sizeof(SERIALIZABLEXID)));
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- /* rw-conflict pool */
- max_rw_conflicts = max_serializable_xacts * 5;
- size = add_size(size, RWConflictPoolHeaderDataSize);
- size = add_size(size, mul_size((Size) max_rw_conflicts,
- RWConflictDataSize));
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
- /* Head for list of finished serializable transactions. */
- size = add_size(size, sizeof(dlist_head));
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
+}
- /* Shared memory structures for SLRU tracking of old committed xids. */
- size = add_size(size, sizeof(SerialControlData));
- size = add_size(size, SimpleLruShmemSize(serializable_buffers, 0));
+static void
+PredicateLockShmemAttach(void *arg)
+{
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- return size;
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
}
-
/*
* Compute the hash code associated with a PREDICATELOCKTAG.
*
diff --git a/src/backend/utils/activity/pgstat_slru.c b/src/backend/utils/activity/pgstat_slru.c
index 2190f388eae..f4dfe8697d7 100644
--- a/src/backend/utils/activity/pgstat_slru.c
+++ b/src/backend/utils/activity/pgstat_slru.c
@@ -119,6 +119,7 @@ pgstat_get_slru_index(const char *name)
{
int i;
+ Assert(name);
for (i = 0; i < SLRU_NUM_ELEMENTS; i++)
{
if (strcmp(slru_names[i], name) == 0)
diff --git a/src/include/access/clog.h b/src/include/access/clog.h
index a1cfed5f43c..7894998c763 100644
--- a/src/include/access/clog.h
+++ b/src/include/access/clog.h
@@ -40,8 +40,6 @@ extern void TransactionIdSetTreeStatus(TransactionId xid, int nsubxids,
TransactionId *subxids, XidStatus status, XLogRecPtr lsn);
extern XidStatus TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn);
-extern Size CLOGShmemSize(void);
-extern void CLOGShmemInit(void);
extern void BootStrapCLOG(void);
extern void StartupCLOG(void);
extern void TrimCLOG(void);
diff --git a/src/include/access/commit_ts.h b/src/include/access/commit_ts.h
index 49ee21cd5d2..825ccda90ed 100644
--- a/src/include/access/commit_ts.h
+++ b/src/include/access/commit_ts.h
@@ -27,8 +27,6 @@ extern bool TransactionIdGetCommitTsData(TransactionId xid,
extern TransactionId GetLatestCommitTsData(TimestampTz *ts,
ReplOriginId *nodeid);
-extern Size CommitTsShmemSize(void);
-extern void CommitTsShmemInit(void);
extern void BootStrapCommitTs(void);
extern void StartupCommitTs(void);
extern void CommitTsParameterChange(bool newvalue, bool oldvalue);
diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h
index 2ae8b571dcc..6be5299ab68 100644
--- a/src/include/access/multixact.h
+++ b/src/include/access/multixact.h
@@ -121,8 +121,6 @@ extern void AtEOXact_MultiXact(void);
extern void AtPrepare_MultiXact(void);
extern void PostPrepare_MultiXact(FullTransactionId fxid);
-extern Size MultiXactShmemSize(void);
-extern void MultiXactShmemInit(void);
extern void BootStrapMultiXact(void);
extern void StartupMultiXact(void);
extern void TrimMultiXact(void);
diff --git a/src/include/access/slru.h b/src/include/access/slru.h
index f966d0d9fe7..d4c669aa7a2 100644
--- a/src/include/access/slru.h
+++ b/src/include/access/slru.h
@@ -16,6 +16,7 @@
#include "access/transam.h"
#include "access/xlogdefs.h"
#include "storage/lwlock.h"
+#include "storage/shmem.h"
#include "storage/sync.h"
/*
@@ -106,23 +107,20 @@ typedef struct SlruSharedData
typedef SlruSharedData *SlruShared;
-/*
- * SlruCtlData is an unshared structure that points to the active information
- * in shared memory.
- */
-typedef struct SlruCtlData
+typedef struct SlruRequestOpts
{
- SlruShared shared;
-
- /* Number of banks in this SLRU. */
- uint16 nbanks;
+ ShmemRequestStructOpts base;
/*
- * If true, use long segment file names. Otherwise, use short file names.
- *
- * For details about the file name format, see SlruFileName().
+ * name of SLRU. (This is user-visible, pick with care!)
*/
- bool long_segment_names;
+ const char *name;
+
+ /* number of page slots to use. */
+ int nslots;
+
+ /* number of LSN groups per page (set to zero if not relevant). */
+ int nlsns;
/*
* Which sync handler function to use when handing sync requests over to
@@ -130,6 +128,19 @@ typedef struct SlruCtlData
*/
SyncRequestHandler sync_handler;
+ /*
+ * PGDATA-relative subdirectory that will contain the files.
+ */
+ const char *Dir;
+
+ /*
+ * If true, use long segment file names. Otherwise, use short file names.
+ *
+ * For details about the file name format, see SlruFileName().
+ */
+ bool long_segment_names;
+
+
/*
* Decide whether a page is "older" for truncation and as a hint for
* evicting pages in LRU order. Return true if every entry of the first
@@ -153,13 +164,28 @@ typedef struct SlruCtlData
int (*errdetail_for_io_error) (const void *opaque_data);
/*
- * Dir is set during SimpleLruInit and does not change thereafter. Since
- * it's always the same, it doesn't need to be in shared memory.
+ * Tranche IDs to use for the SLRU's per-buffer and per-bank LWLocks. If
+ * these are left as zeros, new tranches will be assigned dynamically.
*/
- char Dir[64];
-} SlruCtlData;
+ int buffer_tranche_id;
+ int bank_tranche_id;
+} SlruRequestOpts;
-typedef SlruCtlData *SlruCtl;
+/*
+ * SlruDesc is an unshared structure that points to the active information
+ * in shared memory.
+ */
+typedef struct SlruDesc
+{
+ ShmemStructDesc base;
+
+ SlruRequestOpts options;
+
+ SlruShared shared;
+
+ /* Number of banks in this SLRU. */
+ uint16 nbanks;
+} SlruDesc;
/*
* Get the SLRU bank lock for given SlruCtl and the pageno.
@@ -168,48 +194,48 @@ typedef SlruCtlData *SlruCtl;
* respective bank.
*/
static inline LWLock *
-SimpleLruGetBankLock(SlruCtl ctl, int64 pageno)
+SimpleLruGetBankLock(SlruDesc *ctl, int64 pageno)
{
int bankno;
+ Assert(ctl->nbanks != 0);
bankno = pageno % ctl->nbanks;
return &(ctl->shared->bank_locks[bankno].lock);
}
-extern Size SimpleLruShmemSize(int nslots, int nlsns);
+extern void SimpleLruRequest(SlruDesc *desc, const SlruRequestOpts *options);
extern int SimpleLruAutotuneBuffers(int divisor, int max);
-extern void SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id,
- int bank_tranche_id, SyncRequestHandler sync_handler,
- bool long_segment_names);
-extern int SimpleLruZeroPage(SlruCtl ctl, int64 pageno);
-extern void SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno);
-extern int SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+extern int SimpleLruZeroPage(SlruDesc *ctl, int64 pageno);
+extern void SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno);
+extern int SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data);
-extern int SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno,
+extern int SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-extern void SimpleLruWritePage(SlruCtl ctl, int slotno);
-extern void SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied);
+extern void SimpleLruWritePage(SlruDesc *ctl, int slotno);
+extern void SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied);
#ifdef USE_ASSERT_CHECKING
-extern void SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page);
+extern void SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page);
#else
#define SlruPagePrecedesUnitTests(ctl, per_page) do {} while (0)
#endif
-extern void SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage);
-extern bool SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno);
+extern void SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage);
+extern bool SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno);
-typedef bool (*SlruScanCallback) (SlruCtl ctl, char *filename, int64 segpage,
+typedef bool (*SlruScanCallback) (SlruDesc *ctl, char *filename, int64 segpage,
void *data);
-extern bool SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data);
-extern void SlruDeleteSegment(SlruCtl ctl, int64 segno);
+extern bool SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data);
+extern void SlruDeleteSegment(SlruDesc *ctl, int64 segno);
-extern int SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path);
+extern int SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path);
/* SlruScanDirectory public callbacks */
-extern bool SlruScanDirCbReportPresence(SlruCtl ctl, char *filename,
+extern bool SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-extern bool SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage,
+extern bool SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage,
void *data);
extern bool check_slru_buffers(const char *name, int *newval);
+extern void shmem_slru_init(ShmemStructDesc *base_desc, const ShmemRequestStructOpts *options);
+extern void shmem_slru_attach(ShmemStructDesc *base_desc, const ShmemRequestStructOpts *options);
+
#endif /* SLRU_H */
diff --git a/src/include/access/subtrans.h b/src/include/access/subtrans.h
index 11b7355dbdf..d986cd9e802 100644
--- a/src/include/access/subtrans.h
+++ b/src/include/access/subtrans.h
@@ -15,8 +15,6 @@ extern void SubTransSetParent(TransactionId xid, TransactionId parent);
extern TransactionId SubTransGetParent(TransactionId xid);
extern TransactionId SubTransGetTopmostTransaction(TransactionId xid);
-extern Size SUBTRANSShmemSize(void);
-extern void SUBTRANSShmemInit(void);
extern void BootStrapSUBTRANS(void);
extern void StartupSUBTRANS(TransactionId oldestActiveXID);
extern void CheckPointSUBTRANS(void);
diff --git a/src/include/commands/async.h b/src/include/commands/async.h
index 3baae7cb8dc..202e4aa5e74 100644
--- a/src/include/commands/async.h
+++ b/src/include/commands/async.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT bool Trace_notify;
extern PGDLLIMPORT int max_notify_queue_pages;
extern PGDLLIMPORT volatile sig_atomic_t notifyInterruptPending;
-extern Size AsyncShmemSize(void);
-extern void AsyncShmemInit(void);
-
extern void NotifyMyFrontEnd(const char *channel,
const char *payload,
int32 srcPid);
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index a5ac55b8f7e..443bffb58fd 100644
--- a/src/include/storage/predicate.h
+++ b/src/include/storage/predicate.h
@@ -41,11 +41,6 @@ typedef void *SerializableXactHandle;
/*
* function prototypes
*/
-
-/* housekeeping for shared memory predicate lock structures */
-extern void PredicateLockShmemInit(void);
-extern Size PredicateLockShmemSize(void);
-
extern void CheckPointPredicate(void);
/* predicate lock reporting */
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index a07fccfb1ba..f59460ae10f 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -29,6 +29,7 @@ typedef enum
{
SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
SHMEM_KIND_HASH, /* a hash table */
+ SHMEM_KIND_SLRU, /* SLRU buffers and control structures */
} ShmemAreaKind;
/*
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index 5c11b2b3499..63d1d60ae36 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -25,6 +25,13 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+
+/* predicate lock manager */
+PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
@@ -38,5 +45,6 @@ PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
diff --git a/src/test/modules/test_slru/test_slru.c b/src/test/modules/test_slru/test_slru.c
index e4bd2af0bf5..c017838a694 100644
--- a/src/test/modules/test_slru/test_slru.c
+++ b/src/test/modules/test_slru/test_slru.c
@@ -40,14 +40,22 @@ PG_FUNCTION_INFO_V1(test_slru_delete_all);
/* Number of SLRU page slots */
#define NUM_TEST_BUFFERS 16
-static SlruCtlData TestSlruCtlData;
-#define TestSlruCtl (&TestSlruCtlData)
+static void test_slru_shmem_request(void *arg);
+static bool test_slru_page_precedes_logically(int64 page1, int64 page2);
+static int test_slru_errdetail_for_io_error(const void *opaque_data);
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static const char *TestSlruDir = "pg_test_slru";
+
+static SlruDesc TestSlruDesc;
+
+static const ShmemCallbacks test_slru_shmem_callbacks = {
+ .request_fn = test_slru_shmem_request
+};
+
+#define TestSlruCtl (&TestSlruDesc)
static bool
-test_slru_scan_cb(SlruCtl ctl, char *filename, int64 segpage, void *data)
+test_slru_scan_cb(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
elog(NOTICE, "Calling test_slru_scan_cb()");
return SlruScanDirCbDeleteAll(ctl, filename, segpage, data);
@@ -190,20 +198,6 @@ test_slru_delete_all(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
-/*
- * Module load callbacks and initialization.
- */
-
-static void
-test_slru_shmem_request(void)
-{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- /* reserve shared memory for the test SLRU */
- RequestAddinShmemSpace(SimpleLruShmemSize(NUM_TEST_BUFFERS, 0));
-}
-
static bool
test_slru_page_precedes_logically(int64 page1, int64 page2)
{
@@ -218,48 +212,6 @@ test_slru_errdetail_for_io_error(const void *opaque_data)
return errdetail("Could not access test_slru entry %u.", xid);
}
-static void
-test_slru_shmem_startup(void)
-{
- /*
- * Short segments names are well tested elsewhere so in this test we are
- * focusing on long names.
- */
- const bool long_segment_names = true;
- const char slru_dir_name[] = "pg_test_slru";
- int test_tranche_id = -1;
- int test_buffer_tranche_id = -1;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /*
- * Create the SLRU directory if it does not exist yet, from the root of
- * the data directory.
- */
- (void) MakePGDirectory(slru_dir_name);
-
- /*
- * Initialize the SLRU facility. In EXEC_BACKEND builds, the
- * shmem_startup_hook is called in the postmaster and in each backend, but
- * we only need to generate the LWLock tranches once. Note that these
- * tranche ID variables are not used by SimpleLruInit() when
- * IsUnderPostmaster is true.
- */
- if (!IsUnderPostmaster)
- {
- test_tranche_id = LWLockNewTrancheId("test_slru_tranche");
- test_buffer_tranche_id = LWLockNewTrancheId("test_buffer_tranche");
- }
-
- TestSlruCtl->PagePrecedes = test_slru_page_precedes_logically;
- TestSlruCtl->errdetail_for_io_error = test_slru_errdetail_for_io_error;
- SimpleLruInit(TestSlruCtl, "TestSLRU",
- NUM_TEST_BUFFERS, 0, slru_dir_name,
- test_buffer_tranche_id, test_tranche_id, SYNC_HANDLER_NONE,
- long_segment_names);
-}
-
void
_PG_init(void)
{
@@ -269,9 +221,37 @@ _PG_init(void)
errdetail("\"%s\" must be loaded with \"shared_preload_libraries\".",
"test_slru")));
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_slru_shmem_request;
+ /*
+ * Create the SLRU directory if it does not exist yet, from the root of
+ * the data directory.
+ */
+ (void) MakePGDirectory(TestSlruDir);
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_slru_shmem_startup;
+ RegisterShmemCallbacks(&test_slru_shmem_callbacks);
+}
+
+static void
+test_slru_shmem_request(void *arg)
+{
+ SimpleLruRequest(&TestSlruDesc, &(SlruRequestOpts) {
+ .name = "TestSLRU",
+ .Dir = TestSlruDir,
+
+ /*
+ * Short segments names are well tested elsewhere so in this test we are
+ * focusing on long names.
+ */
+ .long_segment_names = true,
+
+ .nslots = NUM_TEST_BUFFERS,
+ .nlsns = 0,
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = test_slru_page_precedes_logically,
+ .errdetail_for_io_error = test_slru_errdetail_for_io_error,
+
+ /* let slru.c assign these */
+ .buffer_tranche_id = 0,
+ .bank_tranche_id = 0,
+ });
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 139cb6f9da5..b657f8956dc 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2886,10 +2886,10 @@ SlotInvalidationCauseMap
SlotNumber
SlotSyncCtxStruct
SlotSyncSkipReason
-SlruCtl
-SlruCtlData
+SlruDesc
SlruErrorCause
SlruPageStatus
+SlruRequestOpts
SlruScanCallback
SlruSegState
SlruShared
--
2.47.3
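
The test_slru conversion above passes its settings as a compound literal with designated initializers, and leaves the tranche IDs at zero so that slru.c picks them. A minimal sketch of that calling pattern follows. All Demo* names are hypothetical stand-ins, not PostgreSQL types, and copying the options into the descriptor is an assumption suggested by the "options" member of SlruDesc rather than something verified against slru.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for SlruRequestOpts; not the real PostgreSQL type. */
typedef struct DemoSlruOpts
{
	const char *name;
	const char *dir;
	int			nslots;
	int			nlsns;
	int			buffer_tranche_id;	/* 0 means "assign dynamically" */
	int			bank_tranche_id;	/* 0 means "assign dynamically" */
} DemoSlruOpts;

/* Hypothetical stand-in for SlruDesc. */
typedef struct DemoSlruDesc
{
	DemoSlruOpts options;		/* private copy of the caller's options */
} DemoSlruDesc;

/* Illustrative counter; real tranche IDs come from the LWLock machinery. */
static int	demo_next_tranche_id = 100;

static void
demo_slru_request(DemoSlruDesc *desc, const DemoSlruOpts *options)
{
	/*
	 * Copy by value: a compound literal passed by the caller only lives
	 * until the end of the caller's enclosing block.
	 */
	desc->options = *options;

	/* Fields the caller left at zero are filled in here. */
	if (desc->options.buffer_tranche_id == 0)
		desc->options.buffer_tranche_id = demo_next_tranche_id++;
	if (desc->options.bank_tranche_id == 0)
		desc->options.bank_tranche_id = demo_next_tranche_id++;
}
```

A caller would write demo_slru_request(&desc, &(DemoSlruOpts) { .name = "TestSLRU", .nslots = 16 }); any member not named in the initializer is zero-initialized by the C language, which is what makes "zero means default" a convenient convention for option structs.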
[text/x-patch] v8-0014-Convert-AIO-to-the-new-interface.patch (14.8K, 15-v8-0014-Convert-AIO-to-the-new-interface.patch)
From 5556f2b13649378f035ac99446558485619869e9 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 12:43:16 +0200
Subject: [PATCH v8 14/16] Convert AIO to the new interface
This replaces the "shmem_size" and "shmem_init" callbacks in the IO
methods table with the same ShmemCallbacks struct that we now use in
other subsystems.
---
src/backend/storage/aio/aio_init.c | 119 ++++++++++++++--------
src/backend/storage/aio/method_io_uring.c | 42 ++++----
src/backend/storage/aio/method_worker.c | 84 ++++++++-------
src/backend/storage/ipc/ipci.c | 2 -
src/include/storage/aio_internal.h | 16 +--
src/include/storage/aio_subsys.h | 4 -
src/include/storage/subsystemlist.h | 3 +
7 files changed, 155 insertions(+), 115 deletions(-)
diff --git a/src/backend/storage/aio/aio_init.c b/src/backend/storage/aio/aio_init.c
index d3c68d8b04c..54ab1238131 100644
--- a/src/backend/storage/aio/aio_init.c
+++ b/src/backend/storage/aio/aio_init.c
@@ -23,16 +23,46 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
-
-static Size
-AioCtlShmemSize(void)
-{
- /* pgaio_ctl itself */
- return sizeof(PgAioCtl);
-}
+static void AioShmemRequest(void *arg);
+static void AioShmemInit(void *arg);
+static void AioShmemAttach(void *arg);
+
+const ShmemCallbacks AioShmemCallbacks = {
+ .request_fn = AioShmemRequest,
+ .init_fn = AioShmemInit,
+ .attach_fn = AioShmemAttach,
+};
+
+static ShmemStructDesc AioCtlShmemDesc = {
+ .name = "AioCtl",
+ .size = sizeof(PgAioCtl),
+ .ptr = (void **) &pgaio_ctl,
+};
+
+static PgAioBackend *AioBackendShmemPtr;
+static ShmemStructDesc AioBackendShmemDesc = {
+ .name = "AioBackend",
+ .ptr = (void **) &AioBackendShmemPtr,
+};
+static PgAioHandle *AioHandleShmemPtr;
+static ShmemStructDesc AioHandleShmemDesc = {
+ .name = "AioHandle",
+ .ptr = (void **) &AioHandleShmemPtr,
+};
+static struct iovec *AioHandleIOVShmemPtr;
+static ShmemStructDesc AioHandleIOVShmemDesc = {
+ .name = "AioHandleIOV",
+ .ptr = (void **) &AioHandleIOVShmemPtr,
+};
+static uint64 *AioHandleDataShmemPtr;
+static ShmemStructDesc AioHandleDataShmemDesc = {
+ .name = "AioHandleData",
+ .ptr = (void **) &AioHandleDataShmemPtr,
+};
static uint32
AioProcs(void)
@@ -109,10 +139,13 @@ AioChooseMaxConcurrency(void)
return Min(max_proportional_pins, 64);
}
-Size
-AioShmemSize(void)
+/*
+ * Register shared memory area for AIO subsystem.
+ */
+static void
+AioShmemRequest(void *arg)
{
- Size sz = 0;
+ /* Resolve io_max_concurrency if not already done. */
/*
* We prefer to report this value's source as PGC_S_DYNAMIC_DEFAULT.
@@ -132,48 +165,41 @@ AioShmemSize(void)
PGC_S_OVERRIDE);
}
- sz = add_size(sz, AioCtlShmemSize());
- sz = add_size(sz, AioBackendShmemSize());
- sz = add_size(sz, AioHandleShmemSize());
- sz = add_size(sz, AioHandleIOVShmemSize());
- sz = add_size(sz, AioHandleDataShmemSize());
+ ShmemRequestStruct(&AioCtlShmemDesc);
- /* Reserve space for method specific resources. */
- if (pgaio_method_ops->shmem_size)
- sz = add_size(sz, pgaio_method_ops->shmem_size());
+ AioBackendShmemDesc.size = AioBackendShmemSize();
+ ShmemRequestStruct(&AioBackendShmemDesc);
- return sz;
+ AioHandleShmemDesc.size = AioHandleShmemSize();
+ ShmemRequestStruct(&AioHandleShmemDesc);
+
+ AioHandleIOVShmemDesc.size = AioHandleIOVShmemSize();
+ ShmemRequestStruct(&AioHandleIOVShmemDesc);
+
+ AioHandleDataShmemDesc.size = AioHandleDataShmemSize();
+ ShmemRequestStruct(&AioHandleDataShmemDesc);
+
+ if (pgaio_method_ops->shmem_callbacks.request_fn)
+ pgaio_method_ops->shmem_callbacks.request_fn(pgaio_method_ops->shmem_callbacks.request_fn_arg);
}
-void
-AioShmemInit(void)
+/*
+ * Initialize AIO shared memory during postmaster startup.
+ */
+static void
+AioShmemInit(void *arg)
{
- bool found;
uint32 io_handle_off = 0;
uint32 iovec_off = 0;
uint32 per_backend_iovecs = io_max_concurrency * io_max_combine_limit;
- pgaio_ctl = (PgAioCtl *)
- ShmemInitStruct("AioCtl", AioCtlShmemSize(), &found);
-
- if (found)
- goto out;
-
- memset(pgaio_ctl, 0, AioCtlShmemSize());
-
pgaio_ctl->io_handle_count = AioProcs() * io_max_concurrency;
pgaio_ctl->iovec_count = AioProcs() * per_backend_iovecs;
- pgaio_ctl->backend_state = (PgAioBackend *)
- ShmemInitStruct("AioBackend", AioBackendShmemSize(), &found);
-
- pgaio_ctl->io_handles = (PgAioHandle *)
- ShmemInitStruct("AioHandle", AioHandleShmemSize(), &found);
-
- pgaio_ctl->iovecs = (struct iovec *)
- ShmemInitStruct("AioHandleIOV", AioHandleIOVShmemSize(), &found);
- pgaio_ctl->handle_data = (uint64 *)
- ShmemInitStruct("AioHandleData", AioHandleDataShmemSize(), &found);
+ pgaio_ctl->backend_state = AioBackendShmemPtr;
+ pgaio_ctl->io_handles = AioHandleShmemPtr;
+ pgaio_ctl->iovecs = AioHandleIOVShmemPtr;
+ pgaio_ctl->handle_data = AioHandleDataShmemPtr;
for (int procno = 0; procno < AioProcs(); procno++)
{
@@ -208,10 +234,15 @@ AioShmemInit(void)
}
}
-out:
- /* Initialize IO method specific resources. */
- if (pgaio_method_ops->shmem_init)
- pgaio_method_ops->shmem_init(!found);
+ if (pgaio_method_ops->shmem_callbacks.init_fn)
+ pgaio_method_ops->shmem_callbacks.init_fn(pgaio_method_ops->shmem_callbacks.init_fn_arg);
+}
+
+static void
+AioShmemAttach(void *arg)
+{
+ if (pgaio_method_ops->shmem_callbacks.attach_fn)
+ pgaio_method_ops->shmem_callbacks.attach_fn(pgaio_method_ops->shmem_callbacks.attach_fn_arg);
}
void
diff --git a/src/backend/storage/aio/method_io_uring.c b/src/backend/storage/aio/method_io_uring.c
index 4867ded35ea..df2d01d66fa 100644
--- a/src/backend/storage/aio/method_io_uring.c
+++ b/src/backend/storage/aio/method_io_uring.c
@@ -49,8 +49,8 @@
/* Entry points for IoMethodOps. */
-static size_t pgaio_uring_shmem_size(void);
-static void pgaio_uring_shmem_init(bool first_time);
+static void pgaio_uring_shmem_request(void *arg);
+static void pgaio_uring_shmem_init(void *arg);
static void pgaio_uring_init_backend(void);
static int pgaio_uring_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
static void pgaio_uring_wait_one(PgAioHandle *ioh, uint64 ref_generation);
@@ -58,7 +58,6 @@ static void pgaio_uring_wait_one(PgAioHandle *ioh, uint64 ref_generation);
/* helper functions */
static void pgaio_uring_sq_from_io(PgAioHandle *ioh, struct io_uring_sqe *sqe);
-
const IoMethodOps pgaio_uring_ops = {
/*
* While io_uring mostly is OK with FDs getting closed while the IO is in
@@ -69,8 +68,8 @@ const IoMethodOps pgaio_uring_ops = {
*/
.wait_on_fd_before_close = true,
- .shmem_size = pgaio_uring_shmem_size,
- .shmem_init = pgaio_uring_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_uring_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_uring_shmem_init,
.init_backend = pgaio_uring_init_backend,
.submit = pgaio_uring_submit,
@@ -265,23 +264,34 @@ pgaio_uring_shmem_size(void)
{
size_t sz;
+ sz = pgaio_uring_context_shmem_size();
+ sz = add_size(sz, pgaio_uring_ring_shmem_size());
+
+ return sz;
+}
+
+static void
+pgaio_uring_shmem_request(void *arg)
+{
+ static ShmemStructDesc AioUringShmemDesc = {
+ .name = "AioUringContext",
+ .ptr = (void **) &pgaio_uring_contexts,
+ };
+
/*
* Kernel and liburing support for various features influences how much
* shmem we need, perform the necessary checks.
*/
pgaio_uring_check_capabilities();
- sz = pgaio_uring_context_shmem_size();
- sz = add_size(sz, pgaio_uring_ring_shmem_size());
-
- return sz;
+ AioUringShmemDesc.size = pgaio_uring_shmem_size();
+ ShmemRequestStruct(&AioUringShmemDesc);
}
static void
-pgaio_uring_shmem_init(bool first_time)
+pgaio_uring_shmem_init(void *arg)
{
int TotalProcs = pgaio_uring_procs();
- bool found;
char *shmem;
size_t ring_mem_remain = 0;
char *ring_mem_next = 0;
@@ -289,13 +299,11 @@ pgaio_uring_shmem_init(bool first_time)
/*
* We allocate memory for all PgAioUringContext instances and, if
* supported, the memory required for each of the io_uring instances, in
- * one ShmemInitStruct().
+ * one combined allocation.
+ *
+ * pgaio_uring_contexts is already set to the base of the allocation.
*/
- shmem = ShmemInitStruct("AioUringContext", pgaio_uring_shmem_size(), &found);
- if (found)
- return;
-
- pgaio_uring_contexts = (PgAioUringContext *) shmem;
+ shmem = (char *) pgaio_uring_contexts;
shmem += pgaio_uring_context_shmem_size();
/* if supported, handle memory alignment / sizing for io_uring memory */
diff --git a/src/backend/storage/aio/method_worker.c b/src/backend/storage/aio/method_worker.c
index efe38e9f113..82c8b098a9e 100644
--- a/src/backend/storage/aio/method_worker.c
+++ b/src/backend/storage/aio/method_worker.c
@@ -41,6 +41,7 @@
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
#include "tcop/tcopprot.h"
#include "utils/injection_point.h"
#include "utils/memdebug.h"
@@ -73,16 +74,20 @@ typedef struct PgAioWorkerControl
} PgAioWorkerControl;
-static size_t pgaio_worker_shmem_size(void);
-static void pgaio_worker_shmem_init(bool first_time);
+static void pgaio_worker_shmem_request(void *arg);
+static void pgaio_worker_shmem_init(void *arg);
+static void pgaio_worker_shmem_attach(void *arg);
+
+static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static bool pgaio_worker_needs_synchronous_execution(PgAioHandle *ioh);
static int pgaio_worker_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
const IoMethodOps pgaio_worker_ops = {
- .shmem_size = pgaio_worker_shmem_size,
- .shmem_init = pgaio_worker_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_worker_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_worker_shmem_init,
+ .shmem_callbacks.attach_fn = pgaio_worker_shmem_attach,
.needs_synchronous_execution = pgaio_worker_needs_synchronous_execution,
.submit = pgaio_worker_submit,
@@ -95,7 +100,6 @@ int io_workers = 3;
static int io_worker_queue_size = 64;
static int MyIoWorkerId;
-static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static PgAioWorkerControl *io_worker_control;
@@ -116,50 +120,60 @@ pgaio_worker_control_shmem_size(void)
sizeof(PgAioWorkerSlot) * MAX_IO_WORKERS;
}
-static size_t
-pgaio_worker_shmem_size(void)
+/*
+ * Set secondary AIO worker pointer from the combined allocation.
+ */
+static void
+pgaio_worker_set_secondary_ptr(void)
{
- size_t sz;
int queue_size;
+ Size queue_sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = add_size(sz, pgaio_worker_control_shmem_size());
-
- return sz;
+ io_worker_control = (PgAioWorkerControl *)
+ ((char *) io_worker_submission_queue + MAXALIGN(queue_sz));
}
static void
-pgaio_worker_shmem_init(bool first_time)
+pgaio_worker_shmem_init(void *arg)
{
- bool found;
int queue_size;
- io_worker_submission_queue =
- ShmemInitStruct("AioWorkerSubmissionQueue",
- pgaio_worker_queue_shmem_size(&queue_size),
- &found);
- if (!found)
- {
- io_worker_submission_queue->size = queue_size;
- io_worker_submission_queue->head = 0;
- io_worker_submission_queue->tail = 0;
- }
+ pgaio_worker_queue_shmem_size(&queue_size);
+ io_worker_submission_queue->size = queue_size;
+ io_worker_submission_queue->head = 0;
+ io_worker_submission_queue->tail = 0;
+
+ pgaio_worker_set_secondary_ptr();
- io_worker_control =
- ShmemInitStruct("AioWorkerControl",
- pgaio_worker_control_shmem_size(),
- &found);
- if (!found)
+ io_worker_control->idle_worker_mask = 0;
+ for (int i = 0; i < MAX_IO_WORKERS; ++i)
{
- io_worker_control->idle_worker_mask = 0;
- for (int i = 0; i < MAX_IO_WORKERS; ++i)
- {
- io_worker_control->workers[i].latch = NULL;
- io_worker_control->workers[i].in_use = false;
- }
+ io_worker_control->workers[i].latch = NULL;
+ io_worker_control->workers[i].in_use = false;
}
}
+static void
+pgaio_worker_shmem_attach(void *arg)
+{
+ pgaio_worker_set_secondary_ptr();
+}
+
+static void
+pgaio_worker_shmem_request(void *arg)
+{
+ static ShmemStructDesc AioWorkerShmemDesc = {
+ .name = "AioWorkerSubmissionQueue",
+ .ptr = (void **) &io_worker_submission_queue,
+ };
+ int queue_size;
+
+ AioWorkerShmemDesc.size =
+ MAXALIGN(pgaio_worker_queue_shmem_size(&queue_size)) +
+ pgaio_worker_control_shmem_size();
+ ShmemRequestStruct(&AioWorkerShmemDesc);
+}
+
static int
pgaio_worker_choose_idle(void)
{
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 32255ead2bf..b945035d98a 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -118,7 +118,6 @@ CalculateShmemSize(void)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, StatsShmemSize());
size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, AioShmemSize());
size = add_size(size, WaitLSNShmemSize());
size = add_size(size, LogicalDecodingCtlShmemSize());
@@ -307,7 +306,6 @@ CreateOrAttachShmemStructs(void)
BTreeShmemInit();
SyncScanShmemInit();
StatsShmemInit();
- AioShmemInit();
WaitLSNShmemInit();
LogicalDecodingCtlShmemInit();
}
diff --git a/src/include/storage/aio_internal.h b/src/include/storage/aio_internal.h
index 5feea15be9e..9dd8d63b25c 100644
--- a/src/include/storage/aio_internal.h
+++ b/src/include/storage/aio_internal.h
@@ -20,6 +20,8 @@
#include "port/pg_iovec.h"
#include "storage/aio.h"
#include "storage/condition_variable.h"
+#include "storage/ipc.h"
+#include "storage/shmem.h"
/*
@@ -267,20 +269,8 @@ typedef struct IoMethodOps
*/
bool wait_on_fd_before_close;
-
/* global initialization */
-
- /*
- * Amount of additional shared memory to reserve for the io_method. Called
- * just like a normal ipci.c style *Size() function. Optional.
- */
- size_t (*shmem_size) (void);
-
- /*
- * Initialize shared memory. First time is true if AIO's shared memory was
- * just initialized, false otherwise. Optional.
- */
- void (*shmem_init) (bool first_time);
+ ShmemCallbacks shmem_callbacks;
/*
* Per-backend initialization. Optional.
diff --git a/src/include/storage/aio_subsys.h b/src/include/storage/aio_subsys.h
index 276cb3e31c4..dd54869351f 100644
--- a/src/include/storage/aio_subsys.h
+++ b/src/include/storage/aio_subsys.h
@@ -20,12 +20,8 @@
/* aio_init.c */
-extern Size AioShmemSize(void);
-extern void AioShmemInit(void);
-
extern void pgaio_init_backend(void);
-
/* aio.c */
extern void pgaio_error_cleanup(void);
extern void AtEOXact_Aio(bool is_commit);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index 63d1d60ae36..e8e06be30c2 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -48,3 +48,6 @@ PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
+
+/* AIO subsystem. This delegates to the method-specific callbacks */
+PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
--
2.47.3
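
In method_worker.c the patch above requests a single area sized MAXALIGN(queue size) plus the control size, and derives io_worker_control from the queue pointer in both the init and attach callbacks. A minimal sketch of that "one allocation, two structures" layout follows. The Demo* types and DEMO_MAXALIGN are hypothetical simplifications: the real PgAioWorkerSubmissionQueue and PgAioWorkerControl have different members, and MAXALIGN is assumed to round up to 8 bytes here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round up to a multiple of 8; an assumed stand-in for MAXALIGN. */
#define DEMO_MAXALIGN(len) (((size_t) (len) + 7) & ~(size_t) 7)

/* Hypothetical, simplified queue with a flexible array member. */
typedef struct DemoQueue
{
	uint32_t	size;
	uint32_t	head;
	uint32_t	tail;
	int			ios[];
} DemoQueue;

/* Hypothetical, simplified control struct. */
typedef struct DemoControl
{
	uint64_t	idle_worker_mask;
} DemoControl;

/* Bytes needed for a queue with nentries slots. */
static size_t
demo_queue_size(uint32_t nentries)
{
	return offsetof(DemoQueue, ios) + sizeof(int) * nentries;
}

/* Bytes to request for the combined allocation. */
static size_t
demo_total_size(uint32_t nentries)
{
	return DEMO_MAXALIGN(demo_queue_size(nentries)) + sizeof(DemoControl);
}

/*
 * Derive the secondary pointer from the primary one.  Every process that
 * maps the area can recompute this, so only the primary pointer has to be
 * registered with the allocator.
 */
static DemoControl *
demo_control_ptr(DemoQueue *queue, uint32_t nentries)
{
	return (DemoControl *)
		((char *) queue + DEMO_MAXALIGN(demo_queue_size(nentries)));
}
```

The size computation and the pointer derivation must use the same rounding; keeping both next to each other, as the sketch does, makes a mismatch easier to spot.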
[text/x-patch] v8-0015-Add-option-for-aligning-shmem-allocations.patch (3.9K, 16-v8-0015-Add-option-for-aligning-shmem-allocations.patch)
From 9fdc8cef18e54f939dca0db203d80f1baecc86ae Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 23:44:15 +0200
Subject: [PATCH v8 15/16] Add option for aligning shmem allocations
The buffer blocks (in the next commit) are IO-aligned. This might come
in handy in other places too, so make it an explicit feature of
ShmemRequestStruct.
---
src/backend/storage/ipc/shmem.c | 22 +++++++++++++---------
src/include/storage/shmem.h | 6 ++++++
2 files changed, 19 insertions(+), 9 deletions(-)
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 595f1e582d5..d3808432ff1 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -237,7 +237,7 @@ typedef struct ShmemAllocatorData
#define ShmemIndexLock (&ShmemAllocator->index_lock)
-static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void *ShmemAllocRaw(Size size, Size alignment, Size *allocated_size);
/* shared memory global variables */
@@ -400,6 +400,7 @@ ShmemGetRequestedSize(void)
size = add_size(size, request->options->size);
size = add_size(size, request->options->extra_size);
+ size = add_size(size, request->options->alignment);
}
return size;
@@ -571,7 +572,7 @@ AttachOrInit(ShmemRequest *request, bool init_allowed, bool attach_allowed)
size_t allocated_size;
void *structPtr;
- structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ structPtr = ShmemAllocRaw(request->options->size, request->options->alignment, &allocated_size);
if (structPtr == NULL)
{
/* out of memory; remove the failed ShmemIndex entry */
@@ -730,7 +731,7 @@ ShmemAlloc(Size size)
void *newSpace;
Size allocated_size;
- newSpace = ShmemAllocRaw(size, &allocated_size);
+ newSpace = ShmemAllocRaw(size, 0, &allocated_size);
if (!newSpace)
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
@@ -749,7 +750,7 @@ ShmemAllocNoError(Size size)
{
Size allocated_size;
- return ShmemAllocRaw(size, &allocated_size);
+ return ShmemAllocRaw(size, 0, &allocated_size);
}
/*
@@ -759,8 +760,9 @@ ShmemAllocNoError(Size size)
* be equal to the number requested plus any padding we choose to add.
*/
static void *
-ShmemAllocRaw(Size size, Size *allocated_size)
+ShmemAllocRaw(Size size, Size alignment, Size *allocated_size)
{
+ Size rawStart;
Size newStart;
Size newFree;
void *newSpace;
@@ -776,14 +778,15 @@ ShmemAllocRaw(Size size, Size *allocated_size)
* structures out to a power-of-two size - but without this, even that
* won't be sufficient.
*/
- size = CACHELINEALIGN(size);
- *allocated_size = size;
+ if (alignment < PG_CACHE_LINE_SIZE)
+ alignment = PG_CACHE_LINE_SIZE;
Assert(ShmemSegHdr != NULL);
SpinLockAcquire(&ShmemAllocator->shmem_lock);
- newStart = ShmemAllocator->free_offset;
+ rawStart = ShmemAllocator->free_offset;
+ newStart = TYPEALIGN(alignment, rawStart);
newFree = newStart + size;
if (newFree <= ShmemSegHdr->totalsize)
@@ -797,8 +800,9 @@ ShmemAllocRaw(Size size, Size *allocated_size)
SpinLockRelease(&ShmemAllocator->shmem_lock);
/* note this assert is okay with newSpace == NULL */
- Assert(newSpace == (void *) CACHELINEALIGN(newSpace));
+ Assert(newSpace == (void *) TYPEALIGN(alignment, newSpace));
+ *allocated_size = newFree - rawStart;
return newSpace;
}
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index f59460ae10f..150c86d5884 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -59,6 +59,12 @@ typedef struct ShmemRequestStructOpts
ssize_t size;
+ /*
+ * Alignment of the starting address. If not set, defaults to cacheline
+ * boundary. Must be a power of two.
+ */
+ size_t alignment;
+
/*
* Extra space to reserve in the shared memory segment, but it's not part
* of the struct itself. This is used for shared memory hash tables that
--
2.47.3
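
The ShmemAllocRaw() change above aligns the start offset instead of rounding up the size, and reports the consumed space as newFree - rawStart so that the padding is accounted for. A minimal single-process sketch of that arithmetic follows. DemoArena, demo_alloc and the 128-byte DEMO_CACHE_LINE are assumptions for illustration; the sketch has no locking and, unlike the patch, sets *allocated_size only on success.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size; the patch uses PG_CACHE_LINE_SIZE. */
#define DEMO_CACHE_LINE 128

/* Round val up to a multiple of a, which must be a power of two. */
#define DEMO_ALIGN(a, val) \
	(((size_t) (val) + ((size_t) (a) - 1)) & ~((size_t) (a) - 1))

/* Hypothetical bump allocator state: offsets into one big segment. */
typedef struct DemoArena
{
	size_t		free_offset;
	size_t		total_size;
} DemoArena;

/*
 * Returns the aligned start offset, or (size_t) -1 if the arena is full.
 * On success, *allocated_size is the space consumed including padding.
 */
static size_t
demo_alloc(DemoArena *arena, size_t size, size_t alignment,
		   size_t *allocated_size)
{
	size_t		raw_start;
	size_t		new_start;
	size_t		new_free;

	/* Never align to less than a cache line. */
	if (alignment < DEMO_CACHE_LINE)
		alignment = DEMO_CACHE_LINE;

	raw_start = arena->free_offset;
	new_start = DEMO_ALIGN(alignment, raw_start);
	new_free = new_start + size;

	if (new_free > arena->total_size)
		return (size_t) -1;

	arena->free_offset = new_free;
	*allocated_size = new_free - raw_start;
	return new_start;
}
```

The padding in front of an allocation is at most alignment - 1 bytes, so reserving one extra "alignment" worth of space per request when sizing the segment is enough to cover an explicitly requested alignment.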
[text/x-patch] v8-0016-Convert-all-remaining-subsystems-to-use-the-new-A.patch (117.5K, 17-v8-0016-Convert-all-remaining-subsystems-to-use-the-new-A.patch)
From 85d226598d50aa78f9119f581d9ccfb931547a39 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 27 Mar 2026 02:31:06 +0200
Subject: [PATCH v8 16/16] Convert all remaining subsystems to use the new API
---
src/backend/access/common/syncscan.c | 79 ++++----
src/backend/access/nbtree/nbtutils.c | 56 +++---
src/backend/access/transam/twophase.c | 77 +++----
src/backend/access/transam/xlog.c | 86 ++++----
src/backend/access/transam/xlogprefetcher.c | 54 ++---
src/backend/access/transam/xlogrecovery.c | 36 ++--
src/backend/access/transam/xlogwait.c | 52 ++---
src/backend/postmaster/autovacuum.c | 81 ++++----
src/backend/postmaster/bgworker.c | 107 +++++-----
src/backend/postmaster/checkpointer.c | 58 +++---
src/backend/postmaster/pgarch.c | 46 +++--
src/backend/postmaster/walsummarizer.c | 64 +++---
src/backend/replication/logical/launcher.c | 58 +++---
src/backend/replication/logical/logicalctl.c | 30 +--
src/backend/replication/logical/origin.c | 61 +++---
src/backend/replication/logical/slotsync.c | 44 ++--
src/backend/replication/slot.c | 66 +++---
src/backend/replication/walreceiverfuncs.c | 52 ++---
src/backend/replication/walsender.c | 61 +++---
src/backend/storage/aio/aio_init.c | 71 +++----
src/backend/storage/aio/method_io_uring.c | 12 +-
src/backend/storage/aio/method_worker.c | 16 +-
src/backend/storage/buffer/buf_init.c | 164 +++++++--------
src/backend/storage/buffer/buf_table.c | 40 ++--
src/backend/storage/buffer/freelist.c | 94 ++++-----
src/backend/storage/ipc/ipci.c | 119 +----------
src/backend/storage/lmgr/lock.c | 126 ++++++------
src/backend/utils/activity/backend_status.c | 190 ++++++++----------
src/backend/utils/activity/pgstat_shmem.c | 161 ++++++++-------
src/include/access/nbtree.h | 2 -
src/include/access/syncscan.h | 2 -
src/include/access/twophase.h | 3 -
src/include/access/xlog.h | 2 -
src/include/access/xlogprefetcher.h | 3 -
src/include/access/xlogrecovery.h | 3 -
src/include/access/xlogwait.h | 2 -
src/include/pgstat.h | 4 -
src/include/postmaster/autovacuum.h | 4 -
src/include/postmaster/bgworker_internals.h | 2 -
src/include/postmaster/bgwriter.h | 3 -
src/include/postmaster/pgarch.h | 2 -
src/include/postmaster/walsummarizer.h | 2 -
src/include/replication/logicalctl.h | 2 -
src/include/replication/logicallauncher.h | 3 -
src/include/replication/origin.h | 4 -
src/include/replication/slot.h | 4 -
src/include/replication/slotsync.h | 2 -
src/include/replication/walreceiver.h | 2 -
src/include/replication/walsender.h | 2 -
src/include/storage/buf_internals.h | 6 +-
src/include/storage/bufmgr.h | 4 -
src/include/storage/lock.h | 2 -
src/include/storage/subsystemlist.h | 26 +++
src/include/utils/backend_status.h | 8 -
.../injection_points/injection_points.c | 60 ++----
55 files changed, 1061 insertions(+), 1259 deletions(-)
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index 6fcfcb0e560..45db6bbc8b7 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -50,6 +50,7 @@
#include "miscadmin.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/rel.h"
@@ -111,6 +112,14 @@ typedef struct ss_scan_locations_t
#define SizeOfScanLocations(N) \
(offsetof(ss_scan_locations_t, items) + (N) * sizeof(ss_lru_item_t))
+static void SyncScanShmemRequest(void *arg);
+static void SyncScanShmemInit(void *arg);
+
+const ShmemCallbacks SyncScanShmemCallbacks = {
+ .request_fn = SyncScanShmemRequest,
+ .init_fn = SyncScanShmemInit,
+};
+
/* Pointer to struct in shared memory */
static ss_scan_locations_t *scan_locations;
@@ -120,58 +129,50 @@ static BlockNumber ss_search(RelFileLocator relfilelocator,
/*
- * SyncScanShmemSize --- report amount of shared memory space needed
+ * SyncScanShmemRequest --- register this module's shared memory
*/
-Size
-SyncScanShmemSize(void)
+static void
+SyncScanShmemRequest(void *arg)
{
- return SizeOfScanLocations(SYNC_SCAN_NELEM);
+ static ShmemStructDesc SyncScanShmemDesc;
+
+ ShmemRequestStruct(&SyncScanShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Sync Scan Locations List",
+ .size = SizeOfScanLocations(SYNC_SCAN_NELEM),
+ .ptr = (void **) &scan_locations,
+ });
}
/*
* SyncScanShmemInit --- initialize this module's shared memory
*/
-void
-SyncScanShmemInit(void)
+static void
+SyncScanShmemInit(void *arg)
{
int i;
- bool found;
- scan_locations = (ss_scan_locations_t *)
- ShmemInitStruct("Sync Scan Locations List",
- SizeOfScanLocations(SYNC_SCAN_NELEM),
- &found);
+ scan_locations->head = &scan_locations->items[0];
+ scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
- if (!IsUnderPostmaster)
+ for (i = 0; i < SYNC_SCAN_NELEM; i++)
{
- /* Initialize shared memory area */
- Assert(!found);
-
- scan_locations->head = &scan_locations->items[0];
- scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
-
- for (i = 0; i < SYNC_SCAN_NELEM; i++)
- {
- ss_lru_item_t *item = &scan_locations->items[i];
-
- /*
- * Initialize all slots with invalid values. As scans are started,
- * these invalid entries will fall off the LRU list and get
- * replaced with real entries.
- */
- item->location.relfilelocator.spcOid = InvalidOid;
- item->location.relfilelocator.dbOid = InvalidOid;
- item->location.relfilelocator.relNumber = InvalidRelFileNumber;
- item->location.location = InvalidBlockNumber;
-
- item->prev = (i > 0) ?
- (&scan_locations->items[i - 1]) : NULL;
- item->next = (i < SYNC_SCAN_NELEM - 1) ?
- (&scan_locations->items[i + 1]) : NULL;
- }
+ ss_lru_item_t *item = &scan_locations->items[i];
+
+ /*
+ * Initialize all slots with invalid values. As scans are started,
+ * these invalid entries will fall off the LRU list and get
+ * replaced with real entries.
+ */
+ item->location.relfilelocator.spcOid = InvalidOid;
+ item->location.relfilelocator.dbOid = InvalidOid;
+ item->location.relfilelocator.relNumber = InvalidRelFileNumber;
+ item->location.location = InvalidBlockNumber;
+
+ item->prev = (i > 0) ?
+ (&scan_locations->items[i - 1]) : NULL;
+ item->next = (i < SYNC_SCAN_NELEM - 1) ?
+ (&scan_locations->items[i + 1]) : NULL;
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c
index 732bc750c9e..1ac856ebdd5 100644
--- a/src/backend/access/nbtree/nbtutils.c
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -25,6 +25,7 @@
#include "lib/qunique.h"
#include "miscadmin.h"
#include "storage/lwlock.h"
+#include "storage/subsystems.h"
#include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -417,6 +418,13 @@ typedef struct BTVacInfo
static BTVacInfo *btvacinfo;
+static void BTreeShmemRequest(void *arg);
+static void BTreeShmemInit(void *arg);
+
+const ShmemCallbacks BTreeShmemCallbacks = {
+ .request_fn = BTreeShmemRequest,
+ .init_fn = BTreeShmemInit,
+};
/*
* _bt_vacuum_cycleid --- get the active vacuum cycle ID for an index,
@@ -553,47 +561,39 @@ _bt_end_vacuum_callback(int code, Datum arg)
}
/*
- * BTreeShmemSize --- report amount of shared memory space needed
+ * BTreeShmemRequest --- register this module's shared memory
*/
-Size
-BTreeShmemSize(void)
+static void
+BTreeShmemRequest(void *arg)
{
+ static ShmemStructDesc BTreeShmemDesc;
Size size;
size = offsetof(BTVacInfo, vacuums);
size = add_size(size, mul_size(MaxBackends, sizeof(BTOneVacInfo)));
- return size;
+
+ ShmemRequestStruct(&BTreeShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "BTree Vacuum State",
+ .size = size,
+ .ptr = (void **) &btvacinfo,
+ });
}
/*
* BTreeShmemInit --- initialize this module's shared memory
*/
-void
-BTreeShmemInit(void)
+static void
+BTreeShmemInit(void *arg)
{
- bool found;
-
- btvacinfo = (BTVacInfo *) ShmemInitStruct("BTree Vacuum State",
- BTreeShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- /* Initialize shared memory area */
- Assert(!found);
-
- /*
- * It doesn't really matter what the cycle counter starts at, but
- * having it always start the same doesn't seem good. Seed with
- * low-order bits of time() instead.
- */
- btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
+ /*
+ * It doesn't really matter what the cycle counter starts at, but
+ * having it always start the same doesn't seem good. Seed with
+ * low-order bits of time() instead.
+ */
+ btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
- btvacinfo->num_vacuums = 0;
- btvacinfo->max_vacuums = MaxBackends;
- }
- else
- Assert(found);
+ btvacinfo->num_vacuums = 0;
+ btvacinfo->max_vacuums = MaxBackends;
}
bytea *
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index d468c9774b3..88a931d5028 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -102,6 +102,7 @@
#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
#include "utils/memutils.h"
@@ -187,8 +188,16 @@ typedef struct TwoPhaseStateData
GlobalTransaction prepXacts[FLEXIBLE_ARRAY_MEMBER];
} TwoPhaseStateData;
+static void TwoPhaseShmemRequest(void *arg);
+static void TwoPhaseShmemInit(void *arg);
+
static TwoPhaseStateData *TwoPhaseState;
+const ShmemCallbacks TwoPhaseShmemCallbacks = {
+ .request_fn = TwoPhaseShmemRequest,
+ .init_fn = TwoPhaseShmemInit,
+};
+
/*
* Global transaction entry currently locked by us, if any. Note that any
* access to the entry pointed to by this variable must be protected by
@@ -234,11 +243,12 @@ static void RemoveTwoPhaseFile(FullTransactionId fxid, bool giveWarning);
static void RecreateTwoPhaseFile(FullTransactionId fxid, void *content, int len);
/*
- * Initialization of shared memory
+ * Register shared memory for two-phase state.
*/
-Size
-TwoPhaseShmemSize(void)
+static void
+TwoPhaseShmemRequest(void *arg)
{
+ static ShmemStructDesc TwoPhaseShmemDesc;
Size size;
/* Need the fixed struct, the array of pointers, and the GTD structs */
@@ -248,46 +258,41 @@ TwoPhaseShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_prepared_xacts,
sizeof(GlobalTransactionData)));
-
- return size;
+ ShmemRequestStruct(&TwoPhaseShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Prepared Transaction Table",
+ .size = size,
+ .ptr = (void **) &TwoPhaseState,
+ });
}
-void
-TwoPhaseShmemInit(void)
+/*
+ * Initialize shared memory for two-phase state.
+ */
+static void
+TwoPhaseShmemInit(void *arg)
{
- bool found;
-
- TwoPhaseState = ShmemInitStruct("Prepared Transaction Table",
- TwoPhaseShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- GlobalTransaction gxacts;
- int i;
+ GlobalTransaction gxacts;
+ int i;
- Assert(!found);
- TwoPhaseState->freeGXacts = NULL;
- TwoPhaseState->numPrepXacts = 0;
+ TwoPhaseState->freeGXacts = NULL;
+ TwoPhaseState->numPrepXacts = 0;
- /*
- * Initialize the linked list of free GlobalTransactionData structs
- */
- gxacts = (GlobalTransaction)
- ((char *) TwoPhaseState +
- MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
- sizeof(GlobalTransaction) * max_prepared_xacts));
- for (i = 0; i < max_prepared_xacts; i++)
- {
- /* insert into linked list */
- gxacts[i].next = TwoPhaseState->freeGXacts;
- TwoPhaseState->freeGXacts = &gxacts[i];
+ /*
+ * Initialize the linked list of free GlobalTransactionData structs
+ */
+ gxacts = (GlobalTransaction)
+ ((char *) TwoPhaseState +
+ MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
+ sizeof(GlobalTransaction) * max_prepared_xacts));
+ for (i = 0; i < max_prepared_xacts; i++)
+ {
+ /* insert into linked list */
+ gxacts[i].next = TwoPhaseState->freeGXacts;
+ TwoPhaseState->freeGXacts = &gxacts[i];
- /* associate it with a PGPROC assigned by InitProcGlobal */
- gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
- }
+ /* associate it with a PGPROC assigned by InitProcGlobal */
+ gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f5c9a34374d..660b530fe52 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -94,6 +94,7 @@
#include "storage/procarray.h"
#include "storage/reinit.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/guc_tables.h"
@@ -566,6 +567,16 @@ typedef enum
WALINSERT_SPECIAL_CHECKPOINT
} WalInsertClass;
+static void XLOGShmemRequest(void *arg);
+static void XLOGShmemInit(void *arg);
+static void XLOGShmemAttach(void *arg);
+
+const ShmemCallbacks XLOGShmemCallbacks = {
+ .request_fn = XLOGShmemRequest,
+ .init_fn = XLOGShmemInit,
+ .attach_fn = XLOGShmemAttach,
+};
+
static XLogCtlData *XLogCtl = NULL;
/* a private copy of XLogCtl->Insert.WALInsertLocks, for convenience */
@@ -574,6 +585,7 @@ static WALInsertLockPadded *WALInsertLocks = NULL;
/*
* We maintain an image of pg_control in shared memory.
*/
+static ControlFileData *LocalControlFile = NULL;
static ControlFileData *ControlFile = NULL;
/*
@@ -4923,7 +4935,8 @@ void
LocalProcessControlFile(bool reset)
{
Assert(reset || ControlFile == NULL);
- ControlFile = palloc_object(ControlFileData);
+ LocalControlFile = palloc_object(ControlFileData);
+ ControlFile = LocalControlFile;
ReadControlFile();
}
@@ -4939,11 +4952,13 @@ GetActiveWalLevelOnStandby(void)
}
/*
- * Initialization of shared memory for XLOG
+ * Register shared memory for XLOG.
*/
-Size
-XLOGShmemSize(void)
+static void
+XLOGShmemRequest(void *arg)
{
+ static ShmemStructDesc XLogCtlShmemDesc;
+ static ShmemStructDesc ControlFileShmemDesc;
Size size;
/*
@@ -4982,23 +4997,26 @@ XLOGShmemSize(void)
/* and the buffers themselves */
size = add_size(size, mul_size(XLOG_BLCKSZ, XLOGbuffers));
- /*
- * Note: we don't count ControlFileData, it comes out of the "slop factor"
- * added by CreateSharedMemoryAndSemaphores. This lets us use this
- * routine again below to compute the actual allocation size.
- */
-
- return size;
+ ShmemRequestStruct(&XLogCtlShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "XLOG Ctl",
+ .size = size,
+ .ptr = (void **) &XLogCtl,
+ });
+ ShmemRequestStruct(&ControlFileShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Control File",
+ .size = sizeof(ControlFileData),
+ .ptr = (void **) &ControlFile,
+ });
}
-void
-XLOGShmemInit(void)
+/*
+ * XLOGShmemInit - initialize the XLogCtl shared memory area.
+ */
+static void
+XLOGShmemInit(void *arg)
{
- bool foundCFile,
- foundXLog;
char *allocptr;
int i;
- ControlFileData *localControlFile;
#ifdef WAL_DEBUG
@@ -5016,36 +5034,17 @@ XLOGShmemInit(void)
}
#endif
-
- XLogCtl = (XLogCtlData *)
- ShmemInitStruct("XLOG Ctl", XLOGShmemSize(), &foundXLog);
-
- localControlFile = ControlFile;
- ControlFile = (ControlFileData *)
- ShmemInitStruct("Control File", sizeof(ControlFileData), &foundCFile);
-
- if (foundCFile || foundXLog)
- {
- /* both should be present or neither */
- Assert(foundCFile && foundXLog);
-
- /* Initialize local copy of WALInsertLocks */
- WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
-
- if (localControlFile)
- pfree(localControlFile);
- return;
- }
memset(XLogCtl, 0, sizeof(XLogCtlData));
/*
* Already have read control file locally, unless in bootstrap mode. Move
* contents into shared memory.
*/
- if (localControlFile)
+ if (LocalControlFile)
{
- memcpy(ControlFile, localControlFile, sizeof(ControlFileData));
- pfree(localControlFile);
+ memcpy(ControlFile, LocalControlFile, sizeof(ControlFileData));
+ pfree(LocalControlFile);
+ LocalControlFile = NULL;
}
/*
@@ -5102,6 +5101,15 @@ XLOGShmemInit(void)
pg_atomic_init_u64(&XLogCtl->unloggedLSN, InvalidXLogRecPtr);
}
+/*
+ * XLOGShmemAttach - set up WALInsertLocks pointer after attaching.
+ */
+static void
+XLOGShmemAttach(void *arg)
+{
+ WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
+}
+
/*
* This func must be called ONCE on system install. It creates pg_control
* and the initial XLOG segment.
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index c235eca7c51..006c45b817a 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -39,6 +39,7 @@
#include "storage/fd.h"
#include "storage/shmem.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/hsearch.h"
@@ -200,6 +201,14 @@ static LsnReadQueueNextStatus XLogPrefetcherNextBlock(uintptr_t pgsr_private,
static XLogPrefetchStats *SharedStats;
+static void XLogPrefetchShmemRequest(void *arg);
+static void XLogPrefetchShmemInit(void *arg);
+
+const ShmemCallbacks XLogPrefetchShmemCallbacks = {
+ .request_fn = XLogPrefetchShmemRequest,
+ .init_fn = XLogPrefetchShmemInit,
+};
+
static inline LsnReadQueue *
lrq_alloc(uint32 max_distance,
uint32 max_inflight,
@@ -292,10 +301,28 @@ lrq_complete_lsn(LsnReadQueue *lrq, XLogRecPtr lsn)
lrq_prefetch(lrq);
}
-size_t
-XLogPrefetchShmemSize(void)
+static void
+XLogPrefetchShmemRequest(void *arg)
+{
+ static ShmemStructDesc XLogPrefetchShmemDesc;
+
+ ShmemRequestStruct(&XLogPrefetchShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "XLogPrefetchStats",
+ .size = sizeof(XLogPrefetchStats),
+ .ptr = (void **) &SharedStats,
+ });
+}
+
+static void
+XLogPrefetchShmemInit(void *arg)
{
- return sizeof(XLogPrefetchStats);
+ pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
+ pg_atomic_init_u64(&SharedStats->prefetch, 0);
+ pg_atomic_init_u64(&SharedStats->hit, 0);
+ pg_atomic_init_u64(&SharedStats->skip_init, 0);
+ pg_atomic_init_u64(&SharedStats->skip_new, 0);
+ pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
+ pg_atomic_init_u64(&SharedStats->skip_rep, 0);
}
/*
@@ -313,27 +340,6 @@ XLogPrefetchResetStats(void)
pg_atomic_write_u64(&SharedStats->skip_rep, 0);
}
-void
-XLogPrefetchShmemInit(void)
-{
- bool found;
-
- SharedStats = (XLogPrefetchStats *)
- ShmemInitStruct("XLogPrefetchStats",
- sizeof(XLogPrefetchStats),
- &found);
-
- if (!found)
- {
- pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
- pg_atomic_init_u64(&SharedStats->prefetch, 0);
- pg_atomic_init_u64(&SharedStats->hit, 0);
- pg_atomic_init_u64(&SharedStats->skip_init, 0);
- pg_atomic_init_u64(&SharedStats->skip_new, 0);
- pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
- pg_atomic_init_u64(&SharedStats->skip_rep, 0);
- }
-}
/*
* Called when any GUC is changed that affects prefetching.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index fd1c36d061d..d87c4059dac 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -58,6 +58,7 @@
#include "storage/pmsignal.h"
#include "storage/procarray.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/datetime.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
@@ -307,6 +308,14 @@ static char *primary_image_masked = NULL;
XLogRecoveryCtlData *XLogRecoveryCtl = NULL;
+static void XLogRecoveryShmemRequest(void *arg);
+static void XLogRecoveryShmemInit(void *arg);
+
+const ShmemCallbacks XLogRecoveryShmemCallbacks = {
+ .request_fn = XLogRecoveryShmemRequest,
+ .init_fn = XLogRecoveryShmemInit,
+};
+
/*
* abortedRecPtr is the start pointer of a broken record at end of WAL when
* recovery completes; missingContrecPtr is the location of the first
@@ -385,28 +394,23 @@ static void SetCurrentChunkStartTime(TimestampTz xtime);
static void SetLatestXTime(TimestampTz xtime);
/*
- * Initialization of shared memory for WAL recovery
+ * Register shared memory for WAL recovery
*/
-Size
-XLogRecoveryShmemSize(void)
+static void
+XLogRecoveryShmemRequest(void *arg)
{
- Size size;
-
- /* XLogRecoveryCtl */
- size = sizeof(XLogRecoveryCtlData);
+ static ShmemStructDesc XLogRecoveryShmemDesc;
- return size;
+ ShmemRequestStruct(&XLogRecoveryShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "XLOG Recovery Ctl",
+ .size = sizeof(XLogRecoveryCtlData),
+ .ptr = (void **) &XLogRecoveryCtl,
+ });
}
-void
-XLogRecoveryShmemInit(void)
+static void
+XLogRecoveryShmemInit(void *arg)
{
- bool found;
-
- XLogRecoveryCtl = (XLogRecoveryCtlData *)
- ShmemInitStruct("XLOG Recovery Ctl", XLogRecoveryShmemSize(), &found);
- if (found)
- return;
memset(XLogRecoveryCtl, 0, sizeof(XLogRecoveryCtlData));
SpinLockInit(&XLogRecoveryCtl->info_lck);
diff --git a/src/backend/access/transam/xlogwait.c b/src/backend/access/transam/xlogwait.c
index bf4630677b4..830ead44bdd 100644
--- a/src/backend/access/transam/xlogwait.c
+++ b/src/backend/access/transam/xlogwait.c
@@ -57,6 +57,7 @@
#include "storage/latch.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/pg_lsn.h"
#include "utils/snapmgr.h"
@@ -68,6 +69,14 @@ static int waitlsn_cmp(const pairingheap_node *a, const pairingheap_node *b,
struct WaitLSNState *waitLSNState = NULL;
+static void WaitLSNShmemRequest(void *arg);
+static void WaitLSNShmemInit(void *arg);
+
+const ShmemCallbacks WaitLSNShmemCallbacks = {
+ .request_fn = WaitLSNShmemRequest,
+ .init_fn = WaitLSNShmemInit,
+};
+
/*
* Wait event for each WaitLSNType, used with WaitLatch() to report
* the wait in pg_stat_activity.
@@ -109,41 +118,36 @@ GetCurrentLSNForWaitType(WaitLSNType lsnType)
pg_unreachable();
}
-/* Report the amount of shared memory space needed for WaitLSNState. */
-Size
-WaitLSNShmemSize(void)
+/* Register the shared memory space needed for WaitLSNState. */
+static void
+WaitLSNShmemRequest(void *arg)
{
+ static ShmemStructDesc WaitLSNShmemDesc;
Size size;
size = offsetof(WaitLSNState, procInfos);
size = add_size(size, mul_size(MaxBackends + NUM_AUXILIARY_PROCS, sizeof(WaitLSNProcInfo)));
- return size;
+ ShmemRequestStruct(&WaitLSNShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "WaitLSNState",
+ .size = size,
+ .ptr = (void **) &waitLSNState,
+ });
}
/* Initialize the WaitLSNState in the shared memory. */
-void
-WaitLSNShmemInit(void)
+static void
+WaitLSNShmemInit(void *arg)
{
- bool found;
-
- waitLSNState = (WaitLSNState *) ShmemInitStruct("WaitLSNState",
- WaitLSNShmemSize(),
- &found);
- if (!found)
+ /* Initialize heaps and tracking */
+ for (int i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
{
- int i;
-
- /* Initialize heaps and tracking */
- for (i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
- {
- pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
- pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
- }
-
- /* Initialize process info array */
- memset(&waitLSNState->procInfos, 0,
- (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
+ pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
+ pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
}
+
+ /* Initialize process info array */
+ memset(&waitLSNState->procInfos, 0,
+ (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 7ecb069c248..fe02425987a 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -97,6 +97,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/fmgroids.h"
#include "utils/fmgrprotos.h"
@@ -304,6 +305,14 @@ typedef struct
static AutoVacuumShmemStruct *AutoVacuumShmem;
+static void AutoVacuumShmemRequest(void *arg);
+static void AutoVacuumShmemInit(void *arg);
+
+const ShmemCallbacks AutoVacuumShmemCallbacks = {
+ .request_fn = AutoVacuumShmemRequest,
+ .init_fn = AutoVacuumShmemInit,
+};
+
/*
* the database list (of avl_dbase elements) in the launcher, and the context
* that contains it
@@ -3354,12 +3363,13 @@ autovac_init(void)
}
/*
- * AutoVacuumShmemSize
- * Compute space needed for autovacuum-related shared memory
+ * AutoVacuumShmemRequest
+ * Register shared memory space needed for autovacuum
*/
-Size
-AutoVacuumShmemSize(void)
+static void
+AutoVacuumShmemRequest(void *arg)
{
+ static ShmemStructDesc AutoVacuumShmemDesc;
Size size;
/*
@@ -3369,53 +3379,42 @@ AutoVacuumShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
- return size;
+
+ ShmemRequestStruct(&AutoVacuumShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "AutoVacuum Data",
+ .size = size,
+ .ptr = (void **) &AutoVacuumShmem,
+ });
}
/*
* AutoVacuumShmemInit
- * Allocate and initialize autovacuum-related shared memory
+ * Initialize autovacuum-related shared memory
*/
-void
-AutoVacuumShmemInit(void)
+static void
+AutoVacuumShmemInit(void *arg)
{
- bool found;
-
- AutoVacuumShmem = (AutoVacuumShmemStruct *)
- ShmemInitStruct("AutoVacuum Data",
- AutoVacuumShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- WorkerInfo worker;
- int i;
+ WorkerInfo worker;
- Assert(!found);
-
- AutoVacuumShmem->av_launcherpid = 0;
- dclist_init(&AutoVacuumShmem->av_freeWorkers);
- dlist_init(&AutoVacuumShmem->av_runningWorkers);
- AutoVacuumShmem->av_startingWorker = NULL;
- memset(AutoVacuumShmem->av_workItems, 0,
- sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
-
- worker = (WorkerInfo) ((char *) AutoVacuumShmem +
- MAXALIGN(sizeof(AutoVacuumShmemStruct)));
-
- /* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_worker_slots; i++)
- {
- dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
- pg_atomic_init_flag(&worker[i].wi_dobalance);
- }
+ AutoVacuumShmem->av_launcherpid = 0;
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
+ dlist_init(&AutoVacuumShmem->av_runningWorkers);
+ AutoVacuumShmem->av_startingWorker = NULL;
+ memset(AutoVacuumShmem->av_workItems, 0,
+ sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
- pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+ worker = (WorkerInfo) ((char *) AutoVacuumShmem +
+ MAXALIGN(sizeof(AutoVacuumShmemStruct)));
+ /* initialize the WorkerInfo free list */
+ for (int i = 0; i < autovacuum_worker_slots; i++)
+ {
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
+ pg_atomic_init_flag(&worker[i].wi_dobalance);
}
- else
- Assert(found);
+
+ pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
}
/*
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index d1fe3cc71ce..93d05680aeb 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -29,6 +29,7 @@
#include "storage/procarray.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/ascii.h"
#include "utils/memutils.h"
@@ -109,6 +110,14 @@ struct BackgroundWorkerHandle
static BackgroundWorkerArray *BackgroundWorkerData;
+static void BackgroundWorkerShmemRequest(void *arg);
+static void BackgroundWorkerShmemInit(void *arg);
+
+const ShmemCallbacks BackgroundWorkerShmemCallbacks = {
+ .request_fn = BackgroundWorkerShmemRequest,
+ .init_fn = BackgroundWorkerShmemInit,
+};
+
/*
* List of internal background worker entry points. We need this for
* reasons explained in LookupBackgroundWorkerFunction(), below.
@@ -151,77 +160,71 @@ static bgworker_main_type LookupBackgroundWorkerFunction(const char *libraryname
/*
- * Calculate shared memory needed.
+ * Register shared memory needed for background workers.
*/
-Size
-BackgroundWorkerShmemSize(void)
+static void
+BackgroundWorkerShmemRequest(void *arg)
{
+ static ShmemStructDesc BackgroundWorkerShmemDesc;
Size size;
/* Array of workers is variably sized. */
size = offsetof(BackgroundWorkerArray, slot);
size = add_size(size, mul_size(max_worker_processes,
sizeof(BackgroundWorkerSlot)));
-
- return size;
+ ShmemRequestStruct(&BackgroundWorkerShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Background Worker Data",
+ .size = size,
+ .ptr = (void **) &BackgroundWorkerData,
+ });
}
/*
- * Initialize shared memory.
+ * Initialize shared memory for background workers.
*/
-void
-BackgroundWorkerShmemInit(void)
+static void
+BackgroundWorkerShmemInit(void *arg)
{
- bool found;
-
- BackgroundWorkerData = ShmemInitStruct("Background Worker Data",
- BackgroundWorkerShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- dlist_iter iter;
- int slotno = 0;
+ dlist_iter iter;
+ int slotno = 0;
- BackgroundWorkerData->total_slots = max_worker_processes;
- BackgroundWorkerData->parallel_register_count = 0;
- BackgroundWorkerData->parallel_terminate_count = 0;
+ BackgroundWorkerData->total_slots = max_worker_processes;
+ BackgroundWorkerData->parallel_register_count = 0;
+ BackgroundWorkerData->parallel_terminate_count = 0;
- /*
- * Copy contents of worker list into shared memory. Record the shared
- * memory slot assigned to each worker. This ensures a 1-to-1
- * correspondence between the postmaster's private list and the array
- * in shared memory.
- */
- dlist_foreach(iter, &BackgroundWorkerList)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- RegisteredBgWorker *rw;
+ /*
+ * Copy contents of worker list into shared memory. Record the shared
+ * memory slot assigned to each worker. This ensures a 1-to-1
+ * correspondence between the postmaster's private list and the array
+ * in shared memory.
+ */
+ dlist_foreach(iter, &BackgroundWorkerList)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ RegisteredBgWorker *rw;
- rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
- Assert(slotno < max_worker_processes);
- slot->in_use = true;
- slot->terminate = false;
- slot->pid = InvalidPid;
- slot->generation = 0;
- rw->rw_shmem_slot = slotno;
- rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
- memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
- ++slotno;
- }
+ rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
+ Assert(slotno < max_worker_processes);
+ slot->in_use = true;
+ slot->terminate = false;
+ slot->pid = InvalidPid;
+ slot->generation = 0;
+ rw->rw_shmem_slot = slotno;
+ rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
+ memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
+ ++slotno;
+ }
- /*
- * Mark any remaining slots as not in use.
- */
- while (slotno < max_worker_processes)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ /*
+ * Mark any remaining slots as not in use.
+ */
+ while (slotno < max_worker_processes)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- slot->in_use = false;
- ++slotno;
- }
+ slot->in_use = false;
+ ++slotno;
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 3c982c6ffac..c04faf9a1fd 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -63,6 +63,7 @@
#include "storage/shmem.h"
#include "storage/smgr.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/guc.h"
#include "utils/memutils.h"
@@ -143,6 +144,14 @@ typedef struct
static CheckpointerShmemStruct *CheckpointerShmem;
+static void CheckpointerShmemRequest(void *arg);
+static void CheckpointerShmemInit(void *arg);
+
+const ShmemCallbacks CheckpointerShmemCallbacks = {
+ .request_fn = CheckpointerShmemRequest,
+ .init_fn = CheckpointerShmemInit,
+};
+
/* interval for calling AbsorbSyncRequests in CheckpointWriteDelay */
#define WRITES_PER_ABSORB 1000
@@ -950,12 +959,13 @@ ReqShutdownXLOG(SIGNAL_ARGS)
*/
/*
- * CheckpointerShmemSize
- * Compute space needed for checkpointer-related shared memory
+ * CheckpointerShmemRequest
+ * Register shared memory space needed for checkpointer
*/
-Size
-CheckpointerShmemSize(void)
+static void
+CheckpointerShmemRequest(void *arg)
{
+ static ShmemStructDesc CheckpointerShmemDesc;
Size size;
/*
@@ -967,39 +977,25 @@ CheckpointerShmemSize(void)
size = add_size(size, mul_size(Min(NBuffers,
MAX_CHECKPOINT_REQUESTS),
sizeof(CheckpointerRequest)));
-
- return size;
+ ShmemRequestStruct(&CheckpointerShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Checkpointer Data",
+ .size = size,
+ .ptr = (void **) &CheckpointerShmem,
+ });
}
/*
* CheckpointerShmemInit
- * Allocate and initialize checkpointer-related shared memory
+ * Initialize checkpointer-related shared memory
*/
-void
-CheckpointerShmemInit(void)
+static void
+CheckpointerShmemInit(void *arg)
{
- Size size = CheckpointerShmemSize();
- bool found;
-
- CheckpointerShmem = (CheckpointerShmemStruct *)
- ShmemInitStruct("Checkpointer Data",
- size,
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. Note that we zero the whole
- * requests array; this is so that CompactCheckpointerRequestQueue can
- * assume that any pad bytes in the request structs are zeroes.
- */
- MemSet(CheckpointerShmem, 0, size);
- SpinLockInit(&CheckpointerShmem->ckpt_lck);
- CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
- CheckpointerShmem->head = CheckpointerShmem->tail = 0;
- ConditionVariableInit(&CheckpointerShmem->start_cv);
- ConditionVariableInit(&CheckpointerShmem->done_cv);
- }
+ SpinLockInit(&CheckpointerShmem->ckpt_lck);
+ CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
+ CheckpointerShmem->head = CheckpointerShmem->tail = 0;
+ ConditionVariableInit(&CheckpointerShmem->start_cv);
+ ConditionVariableInit(&CheckpointerShmem->done_cv);
}
/*
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index fa4bdfe9ab9..cc983675373 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -48,6 +48,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
@@ -154,33 +155,34 @@ static int ready_file_comparator(Datum a, Datum b, void *arg);
static void LoadArchiveLibrary(void);
static void pgarch_call_module_shutdown_cb(int code, Datum arg);
-/* Report shared memory space needed by PgArchShmemInit */
-Size
-PgArchShmemSize(void)
-{
- Size size = 0;
-
- size = add_size(size, sizeof(PgArchData));
+static void PgArchShmemRequest(void *arg);
+static void PgArchShmemInit(void *arg);
- return size;
-}
+const ShmemCallbacks PgArchShmemCallbacks = {
+ .request_fn = PgArchShmemRequest,
+ .init_fn = PgArchShmemInit,
+};
-/* Allocate and initialize archiver-related shared memory */
-void
-PgArchShmemInit(void)
+/* Register shared memory space needed by the archiver */
+static void
+PgArchShmemRequest(void *arg)
{
- bool found;
+ static ShmemStructDesc PgArchShmemDesc;
- PgArch = (PgArchData *)
- ShmemInitStruct("Archiver Data", PgArchShmemSize(), &found);
+ ShmemRequestStruct(&PgArchShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Archiver Data",
+ .size = sizeof(PgArchData),
+ .ptr = (void **) &PgArch,
+ });
+}
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(PgArch, 0, PgArchShmemSize());
- PgArch->pgprocno = INVALID_PROC_NUMBER;
- pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
- }
+/* Initialize archiver-related shared memory */
+static void
+PgArchShmemInit(void *arg)
+{
+ MemSet(PgArch, 0, sizeof(PgArchData));
+ PgArch->pgprocno = INVALID_PROC_NUMBER;
+ pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
}
/*
diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c
index 0c0670f7da9..99ef65f09ba 100644
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -47,6 +47,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -109,6 +110,14 @@ typedef struct
/* Pointer to shared memory state. */
static WalSummarizerData *WalSummarizerCtl;
+static void WalSummarizerShmemRequest(void *arg);
+static void WalSummarizerShmemInit(void *arg);
+
+const ShmemCallbacks WalSummarizerShmemCallbacks = {
+ .request_fn = WalSummarizerShmemRequest,
+ .init_fn = WalSummarizerShmemInit,
+};
+
/*
* When we reach end of WAL and need to read more, we sleep for a number of
* milliseconds that is an integer multiple of MS_PER_SLEEP_QUANTUM. This is
@@ -168,43 +177,38 @@ static void summarizer_wait_for_wal(void);
static void MaybeRemoveOldWalSummaries(void);
/*
- * Amount of shared memory required for this module.
+ * Register shared memory space needed by this module.
*/
-Size
-WalSummarizerShmemSize(void)
+static void
+WalSummarizerShmemRequest(void *arg)
{
- return sizeof(WalSummarizerData);
+ static ShmemStructDesc WalSummarizerShmemDesc;
+
+ ShmemRequestStruct(&WalSummarizerShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Wal Summarizer Ctl",
+ .size = sizeof(WalSummarizerData),
+ .ptr = (void **) &WalSummarizerCtl,
+ });
}
/*
- * Create or attach to shared memory segment for this module.
+ * Initialize shared memory for this module.
*/
-void
-WalSummarizerShmemInit(void)
+static void
+WalSummarizerShmemInit(void *arg)
{
- bool found;
-
- WalSummarizerCtl = (WalSummarizerData *)
- ShmemInitStruct("Wal Summarizer Ctl", WalSummarizerShmemSize(),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize.
- *
- * We're just filling in dummy values here -- the real initialization
- * will happen when GetOldestUnsummarizedLSN() is called for the first
- * time.
- */
- WalSummarizerCtl->initialized = false;
- WalSummarizerCtl->summarized_tli = 0;
- WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
- WalSummarizerCtl->lsn_is_exact = false;
- WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
- WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
- ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
- }
+ /*
+ * We're just filling in dummy values here -- the real initialization
+ * will happen when GetOldestUnsummarizedLSN() is called for the first
+ * time.
+ */
+ WalSummarizerCtl->initialized = false;
+ WalSummarizerCtl->summarized_tli = 0;
+ WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
+ WalSummarizerCtl->lsn_is_exact = false;
+ WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
+ WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
+ ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
}
/*
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 09964198550..61a162773d8 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -38,6 +38,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -71,6 +72,14 @@ typedef struct LogicalRepCtxStruct
static LogicalRepCtxStruct *LogicalRepCtx;
+static void ApplyLauncherShmemRequest(void *arg);
+static void ApplyLauncherShmemInit(void *arg);
+
+const ShmemCallbacks ApplyLauncherShmemCallbacks = {
+ .request_fn = ApplyLauncherShmemRequest,
+ .init_fn = ApplyLauncherShmemInit,
+};
+
/* an entry in the last-start-times shared hash table */
typedef struct LauncherLastStartTimesEntry
{
@@ -972,12 +981,13 @@ logicalrep_pa_worker_count(Oid subid)
}
/*
- * ApplyLauncherShmemSize
- * Compute space needed for replication launcher shared memory
+ * ApplyLauncherShmemRequest
+ * Register shared memory space needed for replication launcher
*/
-Size
-ApplyLauncherShmemSize(void)
+static void
+ApplyLauncherShmemRequest(void *arg)
{
+ static ShmemStructDesc ApplyLauncherShmemDesc;
Size size;
/*
@@ -987,7 +997,11 @@ ApplyLauncherShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_logical_replication_workers,
sizeof(LogicalRepWorker)));
- return size;
+ ShmemRequestStruct(&ApplyLauncherShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Logical Replication Launcher Data",
+ .size = size,
+ .ptr = (void **) &LogicalRepCtx,
+ });
}
/*
@@ -1028,35 +1042,23 @@ ApplyLauncherRegister(void)
/*
* ApplyLauncherShmemInit
- * Allocate and initialize replication launcher shared memory
+ * Initialize replication launcher shared memory
*/
-void
-ApplyLauncherShmemInit(void)
+static void
+ApplyLauncherShmemInit(void *arg)
{
- bool found;
+ int slot;
- LogicalRepCtx = (LogicalRepCtxStruct *)
- ShmemInitStruct("Logical Replication Launcher Data",
- ApplyLauncherShmemSize(),
- &found);
+ LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
+ LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
- if (!found)
+ /* Initialize memory and spin locks for each worker slot. */
+ for (slot = 0; slot < max_logical_replication_workers; slot++)
{
- int slot;
-
- memset(LogicalRepCtx, 0, ApplyLauncherShmemSize());
-
- LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
- LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
+ LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
- /* Initialize memory and spin locks for each worker slot. */
- for (slot = 0; slot < max_logical_replication_workers; slot++)
- {
- LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
-
- memset(worker, 0, sizeof(LogicalRepWorker));
- SpinLockInit(&worker->relmutex);
- }
+ memset(worker, 0, sizeof(LogicalRepWorker));
+ SpinLockInit(&worker->relmutex);
}
}
diff --git a/src/backend/replication/logical/logicalctl.c b/src/backend/replication/logical/logicalctl.c
index 4e292951201..86b48edd8e6 100644
--- a/src/backend/replication/logical/logicalctl.c
+++ b/src/backend/replication/logical/logicalctl.c
@@ -72,6 +72,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
/*
@@ -98,6 +99,12 @@ typedef struct LogicalDecodingCtlData
static LogicalDecodingCtlData *LogicalDecodingCtl = NULL;
+static void LogicalDecodingCtlShmemRequest(void *arg);
+
+const ShmemCallbacks LogicalDecodingCtlShmemCallbacks = {
+ .request_fn = LogicalDecodingCtlShmemRequest,
+};
+
/*
* A process-local cache of LogicalDecodingCtl->xlog_logical_info. This is
* initialized at process startup, and updated when processing the process
@@ -120,23 +127,16 @@ static void update_xlog_logical_info(void);
static void abort_logical_decoding_activation(int code, Datum arg);
static void write_logical_decoding_status_update_record(bool status);
-Size
-LogicalDecodingCtlShmemSize(void)
-{
- return sizeof(LogicalDecodingCtlData);
-}
-
-void
-LogicalDecodingCtlShmemInit(void)
+static void
+LogicalDecodingCtlShmemRequest(void *arg)
{
- bool found;
-
- LogicalDecodingCtl = ShmemInitStruct("Logical decoding control",
- LogicalDecodingCtlShmemSize(),
- &found);
+ static ShmemStructDesc LogicalDecodingCtlShmemDesc;
- if (!found)
- MemSet(LogicalDecodingCtl, 0, LogicalDecodingCtlShmemSize());
+ ShmemRequestStruct(&LogicalDecodingCtlShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Logical decoding control",
+ .size = sizeof(LogicalDecodingCtlData),
+ .ptr = (void **) &LogicalDecodingCtl,
+ });
}
/*
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 661d68ad653..1a7316b2338 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -88,6 +88,7 @@
#include "storage/fd.h"
#include "storage/ipc.h"
#include "storage/lmgr.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/guc.h"
@@ -176,6 +177,16 @@ ReplOriginXactState replorigin_xact_state = {
*/
static ReplicationState *replication_states;
+static void ReplicationOriginShmemRequest(void *arg);
+static void ReplicationOriginShmemInit(void *arg);
+static void ReplicationOriginShmemAttach(void *arg);
+
+const ShmemCallbacks ReplicationOriginShmemCallbacks = {
+ .request_fn = ReplicationOriginShmemRequest,
+ .init_fn = ReplicationOriginShmemInit,
+ .attach_fn = ReplicationOriginShmemAttach,
+};
+
/*
* Actual shared memory block (replication_states[] is now part of this).
*/
@@ -539,50 +550,50 @@ replorigin_by_oid(ReplOriginId roident, bool missing_ok, char **roname)
* ---------------------------------------------------------------------------
*/
-Size
-ReplicationOriginShmemSize(void)
+static void
+ReplicationOriginShmemRequest(void *arg)
{
+ static ShmemStructDesc ReplicationOriginShmemDesc;
Size size = 0;
if (max_active_replication_origins == 0)
- return size;
+ return;
size = add_size(size, offsetof(ReplicationStateCtl, states));
-
size = add_size(size,
mul_size(max_active_replication_origins, sizeof(ReplicationState)));
- return size;
+ ShmemRequestStruct(&ReplicationOriginShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "ReplicationOriginState",
+ .size = size,
+ .ptr = (void **) &replication_states_ctl,
+ });
}
-void
-ReplicationOriginShmemInit(void)
+static void
+ReplicationOriginShmemInit(void *arg)
{
- bool found;
-
if (max_active_replication_origins == 0)
return;
- replication_states_ctl = (ReplicationStateCtl *)
- ShmemInitStruct("ReplicationOriginState",
- ReplicationOriginShmemSize(),
- &found);
replication_states = replication_states_ctl->states;
- if (!found)
- {
- int i;
+ replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
- MemSet(replication_states_ctl, 0, ReplicationOriginShmemSize());
+ for (int i = 0; i < max_active_replication_origins; i++)
+ {
+ LWLockInitialize(&replication_states[i].lock,
+ replication_states_ctl->tranche_id);
+ ConditionVariableInit(&replication_states[i].origin_cv);
+ }
+}
- replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
+static void
+ReplicationOriginShmemAttach(void *arg)
+{
+ if (max_active_replication_origins == 0)
+ return;
- for (i = 0; i < max_active_replication_origins; i++)
- {
- LWLockInitialize(&replication_states[i].lock,
- replication_states_ctl->tranche_id);
- ConditionVariableInit(&replication_states[i].origin_cv);
- }
- }
+ replication_states = replication_states_ctl->states;
}
/* ---------------------------------------------------------------------------
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index e75db69e3f6..22c5f8e48a2 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -73,6 +73,7 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -118,6 +119,14 @@ typedef struct SlotSyncCtxStruct
static SlotSyncCtxStruct *SlotSyncCtx = NULL;
+static void SlotSyncShmemRequest(void *arg);
+static void SlotSyncShmemInit(void *arg);
+
+const ShmemCallbacks SlotSyncShmemCallbacks = {
+ .request_fn = SlotSyncShmemRequest,
+ .init_fn = SlotSyncShmemInit,
+};
+
/* GUC variable */
bool sync_replication_slots = false;
@@ -1828,32 +1837,29 @@ IsSyncingReplicationSlots(void)
}
/*
- * Amount of shared memory required for slot synchronization.
+ * Register shared memory space needed for slot synchronization.
*/
-Size
-SlotSyncShmemSize(void)
+static void
+SlotSyncShmemRequest(void *arg)
{
- return sizeof(SlotSyncCtxStruct);
+ static ShmemStructDesc SlotSyncShmemDesc;
+
+ ShmemRequestStruct(&SlotSyncShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Slot Sync Data",
+ .size = sizeof(SlotSyncCtxStruct),
+ .ptr = (void **) &SlotSyncCtx,
+ });
}
/*
- * Allocate and initialize the shared memory of slot synchronization.
+ * Initialize shared memory for slot synchronization.
*/
-void
-SlotSyncShmemInit(void)
+static void
+SlotSyncShmemInit(void *arg)
{
- Size size = SlotSyncShmemSize();
- bool found;
-
- SlotSyncCtx = (SlotSyncCtxStruct *)
- ShmemInitStruct("Slot Sync Data", size, &found);
-
- if (!found)
- {
- memset(SlotSyncCtx, 0, size);
- SlotSyncCtx->pid = InvalidPid;
- SpinLockInit(&SlotSyncCtx->mutex);
- }
+ memset(SlotSyncCtx, 0, sizeof(SlotSyncCtxStruct));
+ SlotSyncCtx->pid = InvalidPid;
+ SpinLockInit(&SlotSyncCtx->mutex);
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index a9092fc2382..b68b27c356d 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
@@ -145,6 +146,14 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
+static void ReplicationSlotsShmemRequest(void *arg);
+static void ReplicationSlotsShmemInit(void *arg);
+
+const ShmemCallbacks ReplicationSlotsShmemCallbacks = {
+ .request_fn = ReplicationSlotsShmemRequest,
+ .init_fn = ReplicationSlotsShmemInit,
+};
+
/* My backend's replication slot in the shared memory array */
ReplicationSlot *MyReplicationSlot = NULL;
@@ -183,56 +192,43 @@ static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
/*
- * Report shared-memory space needed by ReplicationSlotsShmemInit.
+ * Register shared memory space needed for replication slots.
*/
-Size
-ReplicationSlotsShmemSize(void)
+static void
+ReplicationSlotsShmemRequest(void *arg)
{
- Size size = 0;
+ static ShmemStructDesc ReplicationSlotsShmemDesc;
+ Size size;
if (max_replication_slots == 0)
- return size;
+ return;
size = offsetof(ReplicationSlotCtlData, replication_slots);
size = add_size(size,
mul_size(max_replication_slots, sizeof(ReplicationSlot)));
-
- return size;
+ ShmemRequestStruct(&ReplicationSlotsShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "ReplicationSlot Ctl",
+ .size = size,
+ .ptr = (void **) &ReplicationSlotCtl,
+ });
}
/*
- * Allocate and initialize shared memory for replication slots.
+ * Initialize shared memory for replication slots.
*/
-void
-ReplicationSlotsShmemInit(void)
+static void
+ReplicationSlotsShmemInit(void *arg)
{
- bool found;
-
- if (max_replication_slots == 0)
- return;
-
- ReplicationSlotCtl = (ReplicationSlotCtlData *)
- ShmemInitStruct("ReplicationSlot Ctl", ReplicationSlotsShmemSize(),
- &found);
-
- if (!found)
+ for (int i = 0; i < max_replication_slots; i++)
{
- int i;
+ ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
- /* First time through, so initialize */
- MemSet(ReplicationSlotCtl, 0, ReplicationSlotsShmemSize());
-
- for (i = 0; i < max_replication_slots; i++)
- {
- ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
-
- /* everything else is zeroed by the memset above */
- slot->active_proc = INVALID_PROC_NUMBER;
- SpinLockInit(&slot->mutex);
- LWLockInitialize(&slot->io_in_progress_lock,
- LWTRANCHE_REPLICATION_SLOT_IO);
- ConditionVariableInit(&slot->active_cv);
- }
+ /* everything else is assumed to be zeroed already */
+ slot->active_proc = INVALID_PROC_NUMBER;
+ SpinLockInit(&slot->mutex);
+ LWLockInitialize(&slot->io_in_progress_lock,
+ LWTRANCHE_REPLICATION_SLOT_IO);
+ ConditionVariableInit(&slot->active_cv);
}
}
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 45b9d4f09f2..49c203f915e 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -29,47 +29,49 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
WalRcvData *WalRcv = NULL;
+static void WalRcvShmemRequest(void *arg);
+static void WalRcvShmemInit(void *arg);
+
+const ShmemCallbacks WalRcvShmemCallbacks = {
+ .request_fn = WalRcvShmemRequest,
+ .init_fn = WalRcvShmemInit,
+};
+
/*
* How long to wait for walreceiver to start up after requesting
* postmaster to launch it. In seconds.
*/
#define WALRCV_STARTUP_TIMEOUT 10
-/* Report shared memory space needed by WalRcvShmemInit */
-Size
-WalRcvShmemSize(void)
+/* Register shared memory space needed by walreceiver */
+static void
+WalRcvShmemRequest(void *arg)
{
- Size size = 0;
+ static ShmemStructDesc WalRcvShmemDesc;
- size = add_size(size, sizeof(WalRcvData));
-
- return size;
+ ShmemRequestStruct(&WalRcvShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Wal Receiver Ctl",
+ .size = sizeof(WalRcvData),
+ .ptr = (void **) &WalRcv,
+ });
}
-/* Allocate and initialize walreceiver-related shared memory */
-void
-WalRcvShmemInit(void)
+/* Initialize walreceiver-related shared memory */
+static void
+WalRcvShmemInit(void *arg)
{
- bool found;
-
- WalRcv = (WalRcvData *)
- ShmemInitStruct("Wal Receiver Ctl", WalRcvShmemSize(), &found);
-
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(WalRcv, 0, WalRcvShmemSize());
- WalRcv->walRcvState = WALRCV_STOPPED;
- ConditionVariableInit(&WalRcv->walRcvStoppedCV);
- SpinLockInit(&WalRcv->mutex);
- pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
- WalRcv->procno = INVALID_PROC_NUMBER;
- }
+ MemSet(WalRcv, 0, sizeof(WalRcvData));
+ WalRcv->walRcvState = WALRCV_STOPPED;
+ ConditionVariableInit(&WalRcv->walRcvStoppedCV);
+ SpinLockInit(&WalRcv->mutex);
+ pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
+ WalRcv->procno = INVALID_PROC_NUMBER;
}
/* Is walreceiver running (or starting up)? */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 66507e9c2dd..7255a9b5e2e 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -86,6 +86,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/dest.h"
#include "tcop/tcopprot.h"
#include "utils/acl.h"
@@ -117,6 +118,14 @@
/* Array of WalSnds in shared memory */
WalSndCtlData *WalSndCtl = NULL;
+static void WalSndShmemRequest(void *arg);
+static void WalSndShmemInit(void *arg);
+
+const ShmemCallbacks WalSndShmemCallbacks = {
+ .request_fn = WalSndShmemRequest,
+ .init_fn = WalSndShmemInit,
+};
+
/* My slot in the shared memory array */
WalSnd *MyWalSnd = NULL;
@@ -3763,47 +3772,39 @@ WalSndSignals(void)
pqsignal(SIGCHLD, SIG_DFL);
}
-/* Report shared-memory space needed by WalSndShmemInit */
-Size
-WalSndShmemSize(void)
+/* Register shared-memory space needed by walsender */
+static void
+WalSndShmemRequest(void *arg)
{
- Size size = 0;
+ static ShmemStructDesc WalSndShmemDesc;
+ Size size;
size = offsetof(WalSndCtlData, walsnds);
size = add_size(size, mul_size(max_wal_senders, sizeof(WalSnd)));
-
- return size;
+ ShmemRequestStruct(&WalSndShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Wal Sender Ctl",
+ .size = size,
+ .ptr = (void **) &WalSndCtl,
+ });
}
-/* Allocate and initialize walsender-related shared memory */
-void
-WalSndShmemInit(void)
+/* Initialize walsender-related shared memory */
+static void
+WalSndShmemInit(void *arg)
{
- bool found;
- int i;
+ for (int i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
+ dlist_init(&(WalSndCtl->SyncRepQueue[i]));
- WalSndCtl = (WalSndCtlData *)
- ShmemInitStruct("Wal Sender Ctl", WalSndShmemSize(), &found);
-
- if (!found)
+ for (int i = 0; i < max_wal_senders; i++)
{
- /* First time through, so initialize */
- MemSet(WalSndCtl, 0, WalSndShmemSize());
-
- for (i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
- dlist_init(&(WalSndCtl->SyncRepQueue[i]));
-
- for (i = 0; i < max_wal_senders; i++)
- {
- WalSnd *walsnd = &WalSndCtl->walsnds[i];
-
- SpinLockInit(&walsnd->mutex);
- }
+ WalSnd *walsnd = &WalSndCtl->walsnds[i];
- ConditionVariableInit(&WalSndCtl->wal_flush_cv);
- ConditionVariableInit(&WalSndCtl->wal_replay_cv);
- ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
+ SpinLockInit(&walsnd->mutex);
}
+
+ ConditionVariableInit(&WalSndCtl->wal_flush_cv);
+ ConditionVariableInit(&WalSndCtl->wal_replay_cv);
+ ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
}
/*
diff --git a/src/backend/storage/aio/aio_init.c b/src/backend/storage/aio/aio_init.c
index 54ab1238131..fead2fd3776 100644
--- a/src/backend/storage/aio/aio_init.c
+++ b/src/backend/storage/aio/aio_init.c
@@ -37,32 +37,11 @@ const ShmemCallbacks AioShmemCallbacks = {
.attach_fn = AioShmemAttach,
};
-static ShmemStructDesc AioCtlShmemDesc = {
- .name = "AioCtl",
- .size = sizeof(PgAioCtl),
- .ptr = (void **) &pgaio_ctl,
-};
static PgAioBackend *AioBackendShmemPtr;
-static ShmemStructDesc AioBackendShmemDesc = {
- .name = "AioBackend",
- .ptr = (void **) &AioBackendShmemPtr,
-};
static PgAioHandle *AioHandleShmemPtr;
-static ShmemStructDesc AioHandleShmemDesc = {
- .name = "AioHandle",
- .ptr = (void **) &AioHandleShmemPtr,
-};
static struct iovec *AioHandleIOVShmemPtr;
-static ShmemStructDesc AioHandleIOVShmemDesc = {
- .name = "AioHandleIOV",
- .ptr = (void **) &AioHandleIOVShmemPtr,
-};
static uint64 *AioHandleDataShmemPtr;
-static ShmemStructDesc AioHandleDataShmemDesc = {
- .name = "AioHandleData",
- .ptr = (void **) &AioHandleDataShmemPtr,
-};
static uint32
AioProcs(void)
@@ -145,9 +124,15 @@ AioChooseMaxConcurrency(void)
static void
AioShmemRequest(void *arg)
{
- /* Resolve io_max_concurrency if not already done. */
+ static ShmemStructDesc AioCtlShmemDesc;
+ static ShmemStructDesc AioBackendShmemDesc;
+ static ShmemStructDesc AioHandleShmemDesc;
+ static ShmemStructDesc AioHandleIOVShmemDesc;
+ static ShmemStructDesc AioHandleDataShmemDesc;
/*
+ * Resolve io_max_concurrency if not already done.
+ *
* We prefer to report this value's source as PGC_S_DYNAMIC_DEFAULT.
* However, if the DBA explicitly set io_max_concurrency = -1 in the
* config file, then PGC_S_DYNAMIC_DEFAULT will fail to override that and
@@ -165,19 +150,35 @@ AioShmemRequest(void *arg)
PGC_S_OVERRIDE);
}
- ShmemRequestStruct(&AioCtlShmemDesc);
-
- AioBackendShmemDesc.size = AioBackendShmemSize();
- ShmemRequestStruct(&AioBackendShmemDesc);
-
- AioHandleShmemDesc.size = AioHandleShmemSize();
- ShmemRequestStruct(&AioHandleShmemDesc);
-
- AioHandleIOVShmemDesc.size = AioHandleIOVShmemSize();
- ShmemRequestStruct(&AioHandleIOVShmemDesc);
-
- AioHandleDataShmemDesc.size = AioHandleDataShmemSize();
- ShmemRequestStruct(&AioHandleDataShmemDesc);
+ ShmemRequestStruct(&AioCtlShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "AioCtl",
+ .size = sizeof(PgAioCtl),
+ .ptr = (void **) &pgaio_ctl,
+ });
+
+ ShmemRequestStruct(&AioBackendShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "AioBackend",
+ .size = AioBackendShmemSize(),
+ .ptr = (void **) &AioBackendShmemPtr,
+ });
+
+ ShmemRequestStruct(&AioHandleShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "AioHandle",
+ .size = AioHandleShmemSize(),
+ .ptr = (void **) &AioHandleShmemPtr,
+ });
+
+ ShmemRequestStruct(&AioHandleIOVShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "AioHandleIOV",
+ .size = AioHandleIOVShmemSize(),
+ .ptr = (void **) &AioHandleIOVShmemPtr,
+ });
+
+ ShmemRequestStruct(&AioHandleDataShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "AioHandleData",
+ .size = AioHandleDataShmemSize(),
+ .ptr = (void **) &AioHandleDataShmemPtr,
+ });
if (pgaio_method_ops->shmem_callbacks.request_fn)
pgaio_method_ops->shmem_callbacks.request_fn(pgaio_method_ops->shmem_callbacks.request_fn_arg);
diff --git a/src/backend/storage/aio/method_io_uring.c b/src/backend/storage/aio/method_io_uring.c
index df2d01d66fa..d08be794a11 100644
--- a/src/backend/storage/aio/method_io_uring.c
+++ b/src/backend/storage/aio/method_io_uring.c
@@ -273,10 +273,7 @@ pgaio_uring_shmem_size(void)
static void
pgaio_uring_shmem_request(void *arg)
{
- static ShmemStructDesc AioUringShmemDesc = {
- .name = "AioUringContext",
- .ptr = (void **) &pgaio_uring_contexts,
- };
+ static ShmemStructDesc AioUringShmemDesc;
/*
* Kernel and liburing support for various features influences how much
@@ -284,8 +281,11 @@ pgaio_uring_shmem_request(void *arg)
*/
pgaio_uring_check_capabilities();
- AioUringShmemDesc.size = pgaio_uring_shmem_size();
- ShmemRequestStruct(&AioUringShmemDesc);
+ ShmemRequestStruct(&AioUringShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "AioUringContext",
+ .size = pgaio_uring_shmem_size(),
+ .ptr = (void **) &pgaio_uring_contexts,
+ });
}
static void
diff --git a/src/backend/storage/aio/method_worker.c b/src/backend/storage/aio/method_worker.c
index 82c8b098a9e..7dc9a51f5ee 100644
--- a/src/backend/storage/aio/method_worker.c
+++ b/src/backend/storage/aio/method_worker.c
@@ -162,16 +162,18 @@ pgaio_worker_shmem_attach(void *arg)
static void
pgaio_worker_shmem_request(void *arg)
{
- static ShmemStructDesc AioWorkerShmemDesc = {
- .name = "AioWorkerSubmissionQueue",
- .ptr = (void **) &io_worker_submission_queue,
- };
+ static ShmemStructDesc AioWorkerShmemDesc;
int queue_size;
+ size_t size;
- AioWorkerShmemDesc.size =
- MAXALIGN(pgaio_worker_queue_shmem_size(&queue_size)) +
+ size = MAXALIGN(pgaio_worker_queue_shmem_size(&queue_size)) +
pgaio_worker_control_shmem_size();
- ShmemRequestStruct(&AioWorkerShmemDesc);
+
+ ShmemRequestStruct(&AioWorkerShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "AioWorkerSubmissionQueue",
+ .size = size,
+ .ptr = (void **) &io_worker_submission_queue,
+ });
}
static int
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index c0c223b2e32..bfee62b8208 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -18,6 +18,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proclist.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
BufferDescPadded *BufferDescriptors;
char *BufferBlocks;
@@ -25,6 +27,15 @@ ConditionVariableMinimallyPadded *BufferIOCVArray;
WritebackContext BackendWritebackContext;
CkptSortItem *CkptBufferIds;
+static void BufferManagerShmemRequest(void *arg);
+static void BufferManagerShmemInit(void *arg);
+static void BufferManagerShmemAttach(void *arg);
+
+const ShmemCallbacks BufferManagerShmemCallbacks = {
+ .request_fn = BufferManagerShmemRequest,
+ .init_fn = BufferManagerShmemInit,
+ .attach_fn = BufferManagerShmemAttach,
+};
/*
* Data Structures:
@@ -60,37 +71,39 @@ CkptSortItem *CkptBufferIds;
/*
- * Initialize shared buffer pool
- *
- * This is called once during shared-memory initialization (either in the
- * postmaster, or in a standalone backend).
+ * Register shared memory area for the buffer pool.
*/
-void
-BufferManagerShmemInit(void)
+static void
+BufferManagerShmemRequest(void *arg)
{
- bool foundBufs,
- foundDescs,
- foundIOCV,
- foundBufCkpt;
-
- /* Align descriptors to a cacheline boundary. */
- BufferDescriptors = (BufferDescPadded *)
- ShmemInitStruct("Buffer Descriptors",
- NBuffers * sizeof(BufferDescPadded),
- &foundDescs);
-
- /* Align buffer pool on IO page size boundary. */
- BufferBlocks = (char *)
- TYPEALIGN(PG_IO_ALIGN_SIZE,
- ShmemInitStruct("Buffer Blocks",
- NBuffers * (Size) BLCKSZ + PG_IO_ALIGN_SIZE,
- &foundBufs));
-
- /* Align condition variables to cacheline boundary. */
- BufferIOCVArray = (ConditionVariableMinimallyPadded *)
- ShmemInitStruct("Buffer IO Condition Variables",
- NBuffers * sizeof(ConditionVariableMinimallyPadded),
- &foundIOCV);
+ static ShmemStructDesc BufferDescriptorsShmemDesc;
+ static ShmemStructDesc BufferBlocksShmemDesc;
+ static ShmemStructDesc BufferIOCVArrayShmemDesc;
+ static ShmemStructDesc CkptBufferIdsShmemDesc;
+
+ ShmemRequestStruct(&BufferDescriptorsShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Buffer Descriptors",
+ .size = NBuffers * sizeof(BufferDescPadded),
+ /* Align descriptors to a cacheline boundary. */
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferDescriptors,
+ });
+
+ ShmemRequestStruct(&BufferBlocksShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Buffer Blocks",
+ .size = NBuffers * (Size) BLCKSZ,
+ /* Align buffer pool on IO page size boundary. */
+ .alignment = PG_IO_ALIGN_SIZE,
+ .ptr = (void **) &BufferBlocks,
+ });
+
+ ShmemRequestStruct(&BufferIOCVArrayShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Buffer IO Condition Variables",
+ .size = NBuffers * sizeof(ConditionVariableMinimallyPadded),
+ /* Align condition variables to cacheline boundary. */
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferIOCVArray,
+ });
/*
* The array used to sort to-be-checkpointed buffer ids is located in
@@ -99,80 +112,51 @@ BufferManagerShmemInit(void)
* the checkpointer is restarted, memory allocation failures would be
* painful.
*/
- CkptBufferIds = (CkptSortItem *)
- ShmemInitStruct("Checkpoint BufferIds",
- NBuffers * sizeof(CkptSortItem), &foundBufCkpt);
+ ShmemRequestStruct(&CkptBufferIdsShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Checkpoint BufferIds",
+ .size = NBuffers * sizeof(CkptSortItem),
+ .ptr = (void **) &CkptBufferIds,
+ });
+}
- if (foundDescs || foundBufs || foundIOCV || foundBufCkpt)
- {
- /* should find all of these, or none of them */
- Assert(foundDescs && foundBufs && foundIOCV && foundBufCkpt);
- /* note: this path is only taken in EXEC_BACKEND case */
- }
- else
+/*
+ * Initialize shared buffer pool
+ *
+ * This is called once during shared-memory initialization (either in the
+ * postmaster, or in a standalone backend).
+ */
+static void
+BufferManagerShmemInit(void *arg)
+{
+ /*
+ * Initialize all the buffer headers.
+ */
+ for (int i = 0; i < NBuffers; i++)
{
- int i;
-
- /*
- * Initialize all the buffer headers.
- */
- for (i = 0; i < NBuffers; i++)
- {
- BufferDesc *buf = GetBufferDescriptor(i);
+ BufferDesc *buf = GetBufferDescriptor(i);
- ClearBufferTag(&buf->tag);
+ ClearBufferTag(&buf->tag);
- pg_atomic_init_u64(&buf->state, 0);
- buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
+ pg_atomic_init_u64(&buf->state, 0);
+ buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
- buf->buf_id = i;
+ buf->buf_id = i;
- pgaio_wref_clear(&buf->io_wref);
+ pgaio_wref_clear(&buf->io_wref);
- proclist_init(&buf->lock_waiters);
- ConditionVariableInit(BufferDescriptorGetIOCV(buf));
- }
+ proclist_init(&buf->lock_waiters);
+ ConditionVariableInit(BufferDescriptorGetIOCV(buf));
}
- /* Init other shared buffer-management stuff */
- StrategyInitialize(!foundDescs);
-
/* Initialize per-backend file flush context */
WritebackContextInit(&BackendWritebackContext,
&backend_flush_after);
}
-/*
- * BufferManagerShmemSize
- *
- * compute the size of shared memory for the buffer pool including
- * data pages, buffer descriptors, hash tables, etc.
- */
-Size
-BufferManagerShmemSize(void)
+static void
+BufferManagerShmemAttach(void *arg)
{
- Size size = 0;
-
- /* size of buffer descriptors */
- size = add_size(size, mul_size(NBuffers, sizeof(BufferDescPadded)));
- /* to allow aligning buffer descriptors */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of data pages, plus alignment padding */
- size = add_size(size, PG_IO_ALIGN_SIZE);
- size = add_size(size, mul_size(NBuffers, BLCKSZ));
-
- /* size of stuff controlled by freelist.c */
- size = add_size(size, StrategyShmemSize());
-
- /* size of I/O condition variables */
- size = add_size(size, mul_size(NBuffers,
- sizeof(ConditionVariableMinimallyPadded)));
- /* to allow aligning the above */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of checkpoint sort array in bufmgr.c */
- size = add_size(size, mul_size(NBuffers, sizeof(CkptSortItem)));
-
- return size;
+ /* Initialize per-backend file flush context */
+ WritebackContextInit(&BackendWritebackContext,
+ &backend_flush_after);
}
diff --git a/src/backend/storage/buffer/buf_table.c b/src/backend/storage/buffer/buf_table.c
index 23d85fd32e2..ebac60f8482 100644
--- a/src/backend/storage/buffer/buf_table.c
+++ b/src/backend/storage/buffer/buf_table.c
@@ -32,37 +32,25 @@ typedef struct
static HTAB *SharedBufHash;
-
-/*
- * Estimate space needed for mapping hashtable
- * size is the desired hash table size (possibly more than NBuffers)
- */
-Size
-BufTableShmemSize(int size)
-{
- return hash_estimate_size(size, sizeof(BufferLookupEnt));
-}
-
/*
- * Initialize shmem hash table for mapping buffers
+ * Register shmem hash table for mapping buffers.
* size is the desired hash table size (possibly more than NBuffers)
*/
void
-InitBufTable(int size)
+BufTableShmemRequest(int size)
{
- HASHCTL info;
-
- /* assume no locking is needed yet */
-
- /* BufferTag maps to Buffer */
- info.keysize = sizeof(BufferTag);
- info.entrysize = sizeof(BufferLookupEnt);
- info.num_partitions = NUM_BUFFER_PARTITIONS;
-
- SharedBufHash = ShmemInitHash("Shared Buffer Lookup Table",
- size, size,
- &info,
- HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE);
+ static ShmemHashDesc SharedBufHashDesc;
+
+ ShmemRequestHash(&SharedBufHashDesc, &(ShmemRequestHashOpts) {
+ .name = "Shared Buffer Lookup Table",
+ .max_size = size,
+ .init_size = size,
+ .ptr = &SharedBufHash,
+ .hash_info.keysize = sizeof(BufferTag),
+ .hash_info.entrysize = sizeof(BufferLookupEnt),
+ .hash_info.num_partitions = NUM_BUFFER_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ });
}
/*
diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index b7687836188..ef21a0175b5 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -20,6 +20,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#define INT_ACCESS_ONCE(var) ((int)(*((volatile int *)&(var))))
@@ -56,6 +58,14 @@ typedef struct
/* Pointers to shared state */
static BufferStrategyControl *StrategyControl = NULL;
+static void StrategyCtlShmemRequest(void *arg);
+static void StrategyCtlShmemInit(void *arg);
+
+const ShmemCallbacks StrategyCtlShmemCallbacks = {
+ .request_fn = StrategyCtlShmemRequest,
+ .init_fn = StrategyCtlShmemInit,
+};
+
/*
* Private (non-shared) state for managing a ring of shared buffers to re-use.
* This is currently the only kind of BufferAccessStrategy object, but someday
@@ -369,41 +379,22 @@ StrategyNotifyBgWriter(int bgwprocno)
/*
- * StrategyShmemSize
- *
- * estimate the size of shared memory used by the freelist-related structures.
- *
- * Note: for somewhat historical reasons, the buffer lookup hashtable size
- * is also determined here.
+ * StrategyCtlShmemRequest -- register shared memory for the buffer
+ * cache replacement strategy.
*/
-Size
-StrategyShmemSize(void)
+static void
+StrategyCtlShmemRequest(void *arg)
{
- Size size = 0;
-
- /* size of lookup hash table ... see comment in StrategyInitialize */
- size = add_size(size, BufTableShmemSize(NBuffers + NUM_BUFFER_PARTITIONS));
-
- /* size of the shared replacement strategy control block */
- size = add_size(size, MAXALIGN(sizeof(BufferStrategyControl)));
+ static ShmemStructDesc StrategyCtlShmemDesc;
- return size;
-}
-
-/*
- * StrategyInitialize -- initialize the buffer cache replacement
- * strategy.
- *
- * Assumes: All of the buffers are already built into a linked list.
- * Only called by postmaster and only during initialization.
- */
-void
-StrategyInitialize(bool init)
-{
- bool found;
+ ShmemRequestStruct(&StrategyCtlShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Buffer Strategy Status",
+ .size = sizeof(BufferStrategyControl),
+ .ptr = (void **) &StrategyControl,
+ });
/*
- * Initialize the shared buffer lookup hashtable.
+ * Request the shared buffer lookup hashtable.
*
* Since we can't tolerate running out of lookup table entries, we must be
* sure to specify an adequate table size here. The maximum steady-state
@@ -412,37 +403,26 @@ StrategyInitialize(bool init)
* happening in each partition concurrently, so we could need as many as
* NBuffers + NUM_BUFFER_PARTITIONS entries.
*/
- InitBufTable(NBuffers + NUM_BUFFER_PARTITIONS);
-
- /*
- * Get or create the shared strategy control block
- */
- StrategyControl = (BufferStrategyControl *)
- ShmemInitStruct("Buffer Strategy Status",
- sizeof(BufferStrategyControl),
- &found);
-
- if (!found)
- {
- /*
- * Only done once, usually in postmaster
- */
- Assert(init);
+ BufTableShmemRequest(NBuffers + NUM_BUFFER_PARTITIONS);
+}
- SpinLockInit(&StrategyControl->buffer_strategy_lock);
+/*
+ * StrategyCtlShmemInit -- initialize the buffer cache replacement strategy.
+ */
+static void
+StrategyCtlShmemInit(void *arg)
+{
+ SpinLockInit(&StrategyControl->buffer_strategy_lock);
- /* Initialize the clock-sweep pointer */
- pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
+ /* Initialize the clock-sweep pointer */
+ pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
- /* Clear statistics */
- StrategyControl->completePasses = 0;
- pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
+ /* Clear statistics */
+ StrategyControl->completePasses = 0;
+ pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
- /* No pending notification */
- StrategyControl->bgwprocno = -1;
- }
- else
- Assert(!init);
+ /* No pending notification */
+ StrategyControl->bgwprocno = -1;
}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index b945035d98a..217f5291270 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -14,36 +14,13 @@
*/
#include "postgres.h"
-#include "access/clog.h"
-#include "access/commit_ts.h"
-#include "access/multixact.h"
-#include "access/nbtree.h"
-#include "access/subtrans.h"
-#include "access/syncscan.h"
-#include "access/twophase.h"
-#include "access/xlogprefetcher.h"
-#include "access/xlogrecovery.h"
-#include "access/xlogwait.h"
-#include "commands/async.h"
#include "miscadmin.h"
#include "pgstat.h"
-#include "postmaster/autovacuum.h"
-#include "postmaster/bgworker_internals.h"
-#include "postmaster/bgwriter.h"
-#include "postmaster/walsummarizer.h"
-#include "replication/logicallauncher.h"
-#include "replication/origin.h"
-#include "replication/slot.h"
-#include "replication/slotsync.h"
-#include "replication/walreceiver.h"
-#include "replication/walsender.h"
-#include "storage/aio_subsys.h"
-#include "storage/bufmgr.h"
#include "storage/dsm.h"
#include "storage/ipc.h"
+#include "storage/lock.h"
#include "storage/pg_shmem.h"
#include "storage/pmsignal.h"
-#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
@@ -55,8 +32,6 @@ shmem_startup_hook_type shmem_startup_hook = NULL;
static Size total_addin_request = 0;
-static void CreateOrAttachShmemStructs(void);
-
/*
* RequestAddinShmemSpace
* Request that extra shmem space be allocated for use by
@@ -95,32 +70,6 @@ CalculateShmemSize(void)
size = 100000;
size = add_size(size, ShmemGetRequestedSize());
- /* legacy subsystems */
- size = add_size(size, BufferManagerShmemSize());
- size = add_size(size, LockManagerShmemSize());
- size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, XLOGShmemSize());
- size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, TwoPhaseShmemSize());
- size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, LWLockShmemSize());
- size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, CheckpointerShmemSize());
- size = add_size(size, AutoVacuumShmemSize());
- size = add_size(size, ReplicationSlotsShmemSize());
- size = add_size(size, ReplicationOriginShmemSize());
- size = add_size(size, WalSndShmemSize());
- size = add_size(size, WalRcvShmemSize());
- size = add_size(size, WalSummarizerShmemSize());
- size = add_size(size, PgArchShmemSize());
- size = add_size(size, ApplyLauncherShmemSize());
- size = add_size(size, BTreeShmemSize());
- size = add_size(size, SyncScanShmemSize());
- size = add_size(size, StatsShmemSize());
- size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, WaitLSNShmemSize());
- size = add_size(size, LogicalDecodingCtlShmemSize());
-
/* include additional requested shmem from preload libraries */
size = add_size(size, total_addin_request);
@@ -159,7 +108,6 @@ AttachSharedMemoryStructs(void)
/* Establish pointers to all shared memory areas in this backend */
ShmemAttachRequested();
- CreateOrAttachShmemStructs();
/*
* Now give loadable modules a chance to set up their shmem allocations
@@ -217,9 +165,6 @@ CreateSharedMemoryAndSemaphores(void)
/* Initialize all shmem areas */
ShmemInitRequested();
- /* Initialize legacy subsystems */
- CreateOrAttachShmemStructs();
-
/* Initialize dynamic shared memory facilities. */
dsm_postmaster_startup(shim);
@@ -248,68 +193,6 @@ RegisterBuiltinShmemCallbacks(void)
RegisterShmemCallbacks(builtin_subsystems[i]);
}
-/*
- * Initialize various subsystems, setting up their data structures in
- * shared memory.
- *
- * This is called by the postmaster or by a standalone backend.
- * It is also called by a backend forked from the postmaster in the
- * EXEC_BACKEND case. In the latter case, the shared memory segment
- * already exists and has been physically attached to, but we have to
- * initialize pointers in local memory that reference the shared structures,
- * because we didn't inherit the correct pointer values from the postmaster
- * as we do in the fork() scenario. The easiest way to do that is to run
- * through the same code as before. (Note that the called routines mostly
- * check IsUnderPostmaster, rather than EXEC_BACKEND, to detect this case.
- * This is a bit code-wasteful and could be cleaned up.)
- */
-static void
-CreateOrAttachShmemStructs(void)
-{
- /*
- * Set up xlog, clog, and buffers
- */
- XLOGShmemInit();
- XLogPrefetchShmemInit();
- XLogRecoveryShmemInit();
- BufferManagerShmemInit();
-
- /*
- * Set up lock manager
- */
- LockManagerShmemInit();
-
- /*
- * Set up process table
- */
- BackendStatusShmemInit();
- TwoPhaseShmemInit();
- BackgroundWorkerShmemInit();
-
- /*
- * Set up interprocess signaling mechanisms
- */
- CheckpointerShmemInit();
- AutoVacuumShmemInit();
- ReplicationSlotsShmemInit();
- ReplicationOriginShmemInit();
- WalSndShmemInit();
- WalRcvShmemInit();
- WalSummarizerShmemInit();
- PgArchShmemInit();
- ApplyLauncherShmemInit();
- SlotSyncShmemInit();
-
- /*
- * Set up other modules that need some shared memory space
- */
- BTreeShmemInit();
- SyncScanShmemInit();
- StatsShmemInit();
- WaitLSNShmemInit();
- LogicalDecodingCtlShmemInit();
-}
-
/*
* InitializeShmemGUCs
*
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 234643e4dd7..566374b9c40 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -43,8 +43,10 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/resowner.h"
@@ -312,6 +314,14 @@ typedef struct
static volatile FastPathStrongRelationLockData *FastPathStrongRelationLocks;
+static void LockManagerShmemRequest(void *arg);
+static void LockManagerShmemInit(void *arg);
+
+const ShmemCallbacks LockManagerShmemCallbacks = {
+ .request_fn = LockManagerShmemRequest,
+ .init_fn = LockManagerShmemInit,
+};
+
/*
* Pointers to hash tables containing lock state
@@ -409,6 +419,7 @@ PROCLOCK_PRINT(const char *where, const PROCLOCK *proclockP)
static uint32 proclock_hash(const void *key, Size keysize);
+
static void RemoveLocalLock(LOCALLOCK *locallock);
static PROCLOCK *SetupLockInTable(LockMethod lockMethodTable, PGPROC *proc,
const LOCKTAG *locktag, uint32 hashcode, LOCKMODE lockmode);
@@ -432,22 +443,19 @@ static void GetSingleProcBlockerStatusData(PGPROC *blocked_proc,
/*
- * Initialize the lock manager's shmem data structures.
+ * Register the lock manager's shmem data structures.
*
- * This is called from CreateSharedMemoryAndSemaphores(), which see for more
- * comments. In the normal postmaster case, the shared hash tables are
- * created here, and backends inherit pointers to them via fork(). In the
- * EXEC_BACKEND case, each backend re-executes this code to obtain pointers to
- * the already existing shared hash tables. In either case, each backend must
- * also call InitLockManagerAccess() to create the locallock hash table.
+ * In addition to this, each backend must also call InitLockManagerAccess() to
+ * create the locallock hash table.
*/
-void
-LockManagerShmemInit(void)
+static void
+LockManagerShmemRequest(void *arg)
{
- HASHCTL info;
- int64 init_table_size,
+ static ShmemHashDesc LockHashDesc;
+ static ShmemHashDesc ProcLockHashDesc;
+ static ShmemStructDesc FastPathShmemDesc;
+ long init_table_size,
max_table_size;
- bool found;
/*
* Compute init/max size to request for lock hashtables. Note these
@@ -456,47 +464,51 @@ LockManagerShmemInit(void)
max_table_size = NLOCKENTS();
init_table_size = max_table_size / 2;
- /*
- * Allocate hash table for LOCK structs. This stores per-locked-object
- * information.
- */
- info.keysize = sizeof(LOCKTAG);
- info.entrysize = sizeof(LOCK);
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodLockHash = ShmemInitHash("LOCK hash",
- init_table_size,
- max_table_size,
- &info,
- HASH_ELEM | HASH_BLOBS | HASH_PARTITION);
+ ShmemRequestHash(&LockHashDesc, &(ShmemRequestHashOpts) {
+ .name = "LOCK hash",
+ .init_size = init_table_size,
+ .max_size = max_table_size,
+ .ptr = &LockMethodLockHash,
+ .hash_info.keysize = sizeof(LOCKTAG),
+ .hash_info.entrysize = sizeof(LOCK),
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION,
+ });
/* Assume an average of 2 holders per lock */
max_table_size *= 2;
init_table_size *= 2;
- /*
- * Allocate hash table for PROCLOCK structs. This stores
- * per-lock-per-holder information.
- */
- info.keysize = sizeof(PROCLOCKTAG);
- info.entrysize = sizeof(PROCLOCK);
- info.hash = proclock_hash;
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodProcLockHash = ShmemInitHash("PROCLOCK hash",
- init_table_size,
- max_table_size,
- &info,
- HASH_ELEM | HASH_FUNCTION | HASH_PARTITION);
+ ShmemRequestHash(&ProcLockHashDesc, &(ShmemRequestHashOpts) {
+ .name = "PROCLOCK hash",
+ .init_size = init_table_size,
+ .max_size = max_table_size,
+ .ptr = &LockMethodProcLockHash,
+ .hash_info.keysize = sizeof(PROCLOCKTAG),
+ .hash_info.entrysize = sizeof(PROCLOCK),
+ .hash_info.hash = proclock_hash,
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION,
+ });
+
+ ShmemRequestStruct(&FastPathShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Fast Path Strong Relation Lock Data",
+ .size = sizeof(FastPathStrongRelationLockData),
+ .ptr = (void **) (void *) &FastPathStrongRelationLocks,
+ });
/*
- * Allocate fast-path structures.
+ * FIXME: we used to do this in the size calculation:
+ *
+ * // Since NLOCKENTS is only an estimate, add 10% safety margin.
+ * size = add_size(size, size / 10);
*/
- FastPathStrongRelationLocks =
- ShmemInitStruct("Fast Path Strong Relation Lock Data",
- sizeof(FastPathStrongRelationLockData), &found);
- if (!found)
- SpinLockInit(&FastPathStrongRelationLocks->mutex);
+}
+
+static void
+LockManagerShmemInit(void *arg)
+{
+ SpinLockInit(&FastPathStrongRelationLocks->mutex);
}
/*
@@ -3761,30 +3773,6 @@ PostPrepare_Locks(FullTransactionId fxid)
}
-/*
- * Estimate shared-memory space used for lock tables
- */
-Size
-LockManagerShmemSize(void)
-{
- Size size = 0;
- long max_table_size;
-
- /* lock hash table */
- max_table_size = NLOCKENTS();
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(LOCK)));
-
- /* proclock hash table */
- max_table_size *= 2;
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(PROCLOCK)));
-
- /*
- * Since NLOCKENTS is only an estimate, add 10% safety margin.
- */
- size = add_size(size, size / 10);
-
- return size;
-}
/*
* GetLockStatusData - Return a summary of the lock manager's internal
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index cd087129469..958d4b04ef7 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -18,7 +18,9 @@
#include "pgstat.h"
#include "storage/ipc.h"
#include "storage/proc.h" /* for MyProc */
#include "storage/procarray.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/ascii.h"
#include "utils/guc.h" /* for application_name */
#include "utils/memutils.h"
@@ -73,133 +75,114 @@ static void pgstat_beshutdown_hook(int code, Datum arg);
static void pgstat_read_current_status(void);
static void pgstat_setup_backend_status_context(void);
+static void BackendStatusShmemRequest(void *arg);
+static void BackendStatusShmemInit(void *arg);
+static void BackendStatusShmemAttach(void *arg);
+
+const ShmemCallbacks BackendStatusShmemCallbacks = {
+ .request_fn = BackendStatusShmemRequest,
+ .init_fn = BackendStatusShmemInit,
+ .attach_fn = BackendStatusShmemAttach,
+};
/*
- * Report shared-memory space needed by BackendStatusShmemInit.
+ * Register shared memory needs for backend status reporting.
*/
-Size
-BackendStatusShmemSize(void)
+static void
+BackendStatusShmemRequest(void *arg)
{
- Size size;
-
- /* BackendStatusArray: */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- /* BackendAppnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendClientHostnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendActivityBuffer: */
- size = add_size(size,
- mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
+ static ShmemStructDesc BackendStatusArrayShmemDesc;
+ static ShmemStructDesc BackendAppnameBufferShmemDesc;
+ static ShmemStructDesc BackendClientHostnameBufferShmemDesc;
+ static ShmemStructDesc BackendActivityBufferSizeShmemDesc;
#ifdef USE_SSL
- /* BackendSslStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots));
+ static ShmemStructDesc BackendSslStatusBufferShmemDesc;
#endif
#ifdef ENABLE_GSS
- /* BackendGssStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots));
+ static ShmemStructDesc BackendGssStatusBufferShmemDesc;
+#endif
+
+ ShmemRequestStruct(&BackendStatusArrayShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Backend Status Array",
+ .size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendStatusArray,
+ });
+
+ ShmemRequestStruct(&BackendAppnameBufferShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Backend Application Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendAppnameBuffer,
+ });
+
+ ShmemRequestStruct(&BackendClientHostnameBufferShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Backend Client Host Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendClientHostnameBuffer,
+ });
+
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+ ShmemRequestStruct(&BackendActivityBufferSizeShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Backend Activity Buffer",
+ .size = BackendActivityBufferSize,
+ .ptr = (void **) &BackendActivityBuffer,
+ });
+
+#ifdef USE_SSL
+ ShmemRequestStruct(&BackendSslStatusBufferShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Backend SSL Status Buffer",
+ .size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendSslStatusBuffer,
+ });
+#endif
+
+#ifdef ENABLE_GSS
+ ShmemRequestStruct(&BackendGssStatusBufferShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Backend GSS Status Buffer",
+ .size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendGssStatusBuffer,
+ });
#endif
- return size;
}
/*
* Initialize the shared status array and several string buffers
* during postmaster startup.
*/
-void
-BackendStatusShmemInit(void)
+static void
+BackendStatusShmemInit(void *arg)
{
- Size size;
- bool found;
int i;
char *buffer;
- /* Create or attach to the shared array */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- BackendStatusArray = (PgBackendStatus *)
- ShmemInitStruct("Backend Status Array", size, &found);
-
- if (!found)
+ /* Initialize st_appname pointers. */
+ buffer = BackendAppnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- /*
- * We're the first - initialize.
- */
- MemSet(BackendStatusArray, 0, size);
+ BackendStatusArray[i].st_appname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared appname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendAppnameBuffer = (char *)
- ShmemInitStruct("Backend Application Name Buffer", size, &found);
-
- if (!found)
+ /* Initialize st_clienthostname pointers. */
+ buffer = BackendClientHostnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendAppnameBuffer, 0, size);
-
- /* Initialize st_appname pointers. */
- buffer = BackendAppnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_appname = buffer;
- buffer += NAMEDATALEN;
- }
+ BackendStatusArray[i].st_clienthostname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared client hostname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendClientHostnameBuffer = (char *)
- ShmemInitStruct("Backend Client Host Name Buffer", size, &found);
-
- if (!found)
+ /* Initialize st_activity pointers. */
+ buffer = BackendActivityBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendClientHostnameBuffer, 0, size);
-
- /* Initialize st_clienthostname pointers. */
- buffer = BackendClientHostnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_clienthostname = buffer;
- buffer += NAMEDATALEN;
- }
- }
-
- /* Create or attach to the shared activity buffer */
- BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
- NumBackendStatSlots);
- BackendActivityBuffer = (char *)
- ShmemInitStruct("Backend Activity Buffer",
- BackendActivityBufferSize,
- &found);
-
- if (!found)
- {
- MemSet(BackendActivityBuffer, 0, BackendActivityBufferSize);
-
- /* Initialize st_activity pointers. */
- buffer = BackendActivityBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_activity_raw = buffer;
- buffer += pgstat_track_activity_query_size;
- }
+ BackendStatusArray[i].st_activity_raw = buffer;
+ buffer += pgstat_track_activity_query_size;
}
#ifdef USE_SSL
- /* Create or attach to the shared SSL status buffer */
- size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots);
- BackendSslStatusBuffer = (PgBackendSSLStatus *)
- ShmemInitStruct("Backend SSL Status Buffer", size, &found);
-
- if (!found)
{
PgBackendSSLStatus *ptr;
- MemSet(BackendSslStatusBuffer, 0, size);
-
/* Initialize st_sslstatus pointers. */
ptr = BackendSslStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -211,17 +194,9 @@ BackendStatusShmemInit(void)
#endif
#ifdef ENABLE_GSS
- /* Create or attach to the shared GSSAPI status buffer */
- size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots);
- BackendGssStatusBuffer = (PgBackendGSSStatus *)
- ShmemInitStruct("Backend GSS Status Buffer", size, &found);
-
- if (!found)
{
PgBackendGSSStatus *ptr;
- MemSet(BackendGssStatusBuffer, 0, size);
-
/* Initialize st_gssstatus pointers. */
ptr = BackendGssStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -233,6 +208,13 @@ BackendStatusShmemInit(void)
#endif
}
+static void
+BackendStatusShmemAttach(void *arg)
+{
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+}
+
/*
* Initialize pgstats backend activity state, and set up our on-proc-exit
* hook. Called from InitPostgres and AuxiliaryProcessMain. MyProcNumber must
diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c
index 33fbdca9609..ac96baf65c9 100644
--- a/src/backend/utils/activity/pgstat_shmem.c
+++ b/src/backend/utils/activity/pgstat_shmem.c
@@ -14,6 +14,7 @@
#include "pgstat.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
@@ -57,6 +58,13 @@ static void pgstat_release_matching_entry_refs(bool discard_pending, ReleaseMatc
static void pgstat_setup_memcxt(void);
+static void StatsShmemRequest(void *arg);
+static void StatsShmemInit(void *arg);
+
+const ShmemCallbacks StatsShmemCallbacks = {
+ .request_fn = StatsShmemRequest,
+ .init_fn = StatsShmemInit,
+};
/* parameter for the shared hash */
static const dshash_parameters dsh_params = {
@@ -123,7 +131,7 @@ pgstat_dsa_init_size(void)
/*
* Compute shared memory space needed for cumulative statistics
*/
-Size
+static Size
StatsShmemSize(void)
{
Size sz;
@@ -150,101 +158,100 @@ StatsShmemSize(void)
}
/*
- * Initialize cumulative statistics system during startup
+ * Register shared memory area for cumulative statistics
*/
-void
-StatsShmemInit(void)
+static void
+StatsShmemRequest(void *arg)
{
- bool found;
- Size sz;
-
- sz = StatsShmemSize();
- pgStatLocal.shmem = (PgStat_ShmemControl *)
- ShmemInitStruct("Shared Memory Stats", sz, &found);
+ static ShmemStructDesc StatsShmemDesc;
- if (!IsUnderPostmaster)
- {
- dsa_area *dsa;
- dshash_table *dsh;
- PgStat_ShmemControl *ctl = pgStatLocal.shmem;
- char *p = (char *) ctl;
+ ShmemRequestStruct(&StatsShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "Shared Memory Stats",
+ .size = StatsShmemSize(),
+ .ptr = (void **) &pgStatLocal.shmem,
+ });
+}
- Assert(!found);
+/*
+ * Initialize cumulative statistics system during startup
+ */
+static void
+StatsShmemInit(void *arg)
+{
+ dsa_area *dsa;
+ dshash_table *dsh;
+ PgStat_ShmemControl *ctl = pgStatLocal.shmem;
+ char *p = (char *) ctl;
- /* the allocation of pgStatLocal.shmem itself */
- p += MAXALIGN(sizeof(PgStat_ShmemControl));
+ /* the allocation of pgStatLocal.shmem itself */
+ p += MAXALIGN(sizeof(PgStat_ShmemControl));
- /*
- * Create a small dsa allocation in plain shared memory. This is
- * required because postmaster cannot use dsm segments. It also
- * provides a small efficiency win.
- */
- ctl->raw_dsa_area = p;
- dsa = dsa_create_in_place(ctl->raw_dsa_area,
- pgstat_dsa_init_size(),
- LWTRANCHE_PGSTATS_DSA, NULL);
- dsa_pin(dsa);
+ /*
+ * Create a small dsa allocation in plain shared memory. This is
+ * required because postmaster cannot use dsm segments. It also
+ * provides a small efficiency win.
+ */
+ ctl->raw_dsa_area = p;
+ dsa = dsa_create_in_place(ctl->raw_dsa_area,
+ pgstat_dsa_init_size(),
+ LWTRANCHE_PGSTATS_DSA, NULL);
+ dsa_pin(dsa);
- /*
- * To ensure dshash is created in "plain" shared memory, temporarily
- * limit size of dsa to the initial size of the dsa.
- */
- dsa_set_size_limit(dsa, pgstat_dsa_init_size());
+ /*
+ * To ensure dshash is created in "plain" shared memory, temporarily
+ * limit size of dsa to the initial size of the dsa.
+ */
+ dsa_set_size_limit(dsa, pgstat_dsa_init_size());
- /*
- * With the limit in place, create the dshash table. XXX: It'd be nice
- * if there were dshash_create_in_place().
- */
- dsh = dshash_create(dsa, &dsh_params, NULL);
- ctl->hash_handle = dshash_get_hash_table_handle(dsh);
+ /*
+ * With the limit in place, create the dshash table. XXX: It'd be nice
+ * if there were dshash_create_in_place().
+ */
+ dsh = dshash_create(dsa, &dsh_params, NULL);
+ ctl->hash_handle = dshash_get_hash_table_handle(dsh);
- /* lift limit set above */
- dsa_set_size_limit(dsa, -1);
+ /* lift limit set above */
+ dsa_set_size_limit(dsa, -1);
- /*
- * Postmaster will never access these again, thus free the local
- * dsa/dshash references.
- */
- dshash_detach(dsh);
- dsa_detach(dsa);
+ /*
+ * Postmaster will never access these again, thus free the local
+ * dsa/dshash references.
+ */
+ dshash_detach(dsh);
+ dsa_detach(dsa);
- pg_atomic_init_u64(&ctl->gc_request_count, 1);
+ pg_atomic_init_u64(&ctl->gc_request_count, 1);
- /* Do the per-kind initialization */
- for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
- {
- const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
- char *ptr;
+ /* Do the per-kind initialization */
+ for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ {
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+ char *ptr;
- if (!kind_info)
- continue;
+ if (!kind_info)
+ continue;
- /* initialize entry count tracking */
- if (kind_info->track_entry_count)
- pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
+ /* initialize entry count tracking */
+ if (kind_info->track_entry_count)
+ pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
- /* initialize fixed-numbered stats */
- if (kind_info->fixed_amount)
+ /* initialize fixed-numbered stats */
+ if (kind_info->fixed_amount)
+ {
+ if (pgstat_is_kind_builtin(kind))
+ ptr = ((char *) ctl) + kind_info->shared_ctl_off;
+ else
{
- if (pgstat_is_kind_builtin(kind))
- ptr = ((char *) ctl) + kind_info->shared_ctl_off;
- else
- {
- int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
-
- Assert(kind_info->shared_size != 0);
- ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
- ptr = ctl->custom_data[idx];
- }
-
- kind_info->init_shmem_cb(ptr);
+ int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
+
+ Assert(kind_info->shared_size != 0);
+ ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
+ ptr = ctl->custom_data[idx];
}
+
+ kind_info->init_shmem_cb(ptr);
}
}
- else
- {
- Assert(found);
- }
}
void
diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index da7503c57b6..3097e9bb1af 100644
--- a/src/include/access/nbtree.h
+++ b/src/include/access/nbtree.h
@@ -1300,8 +1300,6 @@ extern BTCycleId _bt_vacuum_cycleid(Relation rel);
extern BTCycleId _bt_start_vacuum(Relation rel);
extern void _bt_end_vacuum(Relation rel);
extern void _bt_end_vacuum_callback(int code, Datum arg);
-extern Size BTreeShmemSize(void);
-extern void BTreeShmemInit(void);
extern bytea *btoptions(Datum reloptions, bool validate);
extern bool btproperty(Oid index_oid, int attno,
IndexAMProperty prop, const char *propname,
diff --git a/src/include/access/syncscan.h b/src/include/access/syncscan.h
index 24cf33294e5..32f8332aaee 100644
--- a/src/include/access/syncscan.h
+++ b/src/include/access/syncscan.h
@@ -24,7 +24,5 @@ extern PGDLLIMPORT bool trace_syncscan;
extern void ss_report_location(Relation rel, BlockNumber location);
extern BlockNumber ss_get_location(Relation rel, BlockNumber relnblocks);
-extern void SyncScanShmemInit(void);
-extern Size SyncScanShmemSize(void);
#endif
diff --git a/src/include/access/twophase.h b/src/include/access/twophase.h
index 761d56a5f3d..1d2ff42c9b7 100644
--- a/src/include/access/twophase.h
+++ b/src/include/access/twophase.h
@@ -33,9 +33,6 @@ typedef struct GlobalTransactionData *GlobalTransaction;
/* GUC variable */
extern PGDLLIMPORT int max_prepared_xacts;
-extern Size TwoPhaseShmemSize(void);
-extern void TwoPhaseShmemInit(void);
-
extern void AtAbort_Twophase(void);
extern void PostPrepare_Twophase(void);
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index dcc12eb8cbe..1a098a91444 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -246,8 +246,6 @@ extern char *GetMockAuthenticationNonce(void);
extern bool DataChecksumsEnabled(void);
extern bool GetDefaultCharSignedness(void);
extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
-extern Size XLOGShmemSize(void);
-extern void XLOGShmemInit(void);
extern void BootStrapXLOG(uint32 data_checksum_version);
extern void InitializeWalConsistencyChecking(void);
extern void LocalProcessControlFile(bool reset);
diff --git a/src/include/access/xlogprefetcher.h b/src/include/access/xlogprefetcher.h
index 7ec40c4b78b..56a81676d92 100644
--- a/src/include/access/xlogprefetcher.h
+++ b/src/include/access/xlogprefetcher.h
@@ -34,9 +34,6 @@ typedef struct XLogPrefetcher XLogPrefetcher;
extern void XLogPrefetchReconfigure(void);
-extern size_t XLogPrefetchShmemSize(void);
-extern void XLogPrefetchShmemInit(void);
-
extern void XLogPrefetchResetStats(void);
extern XLogPrefetcher *XLogPrefetcherAllocate(XLogReaderState *reader);
diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h
index 2842106b285..ba7750dca0b 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -153,9 +153,6 @@ extern PGDLLIMPORT bool reachedConsistency;
/* Are we currently in standby mode? */
extern PGDLLIMPORT bool StandbyMode;
-extern Size XLogRecoveryShmemSize(void);
-extern void XLogRecoveryShmemInit(void);
-
extern void InitWalRecovery(ControlFileData *ControlFile,
bool *wasShutdown_ptr, bool *haveBackupLabel_ptr,
bool *haveTblspcMap_ptr);
diff --git a/src/include/access/xlogwait.h b/src/include/access/xlogwait.h
index d12531d32b8..07157f220ea 100644
--- a/src/include/access/xlogwait.h
+++ b/src/include/access/xlogwait.h
@@ -100,8 +100,6 @@ typedef struct WaitLSNState
extern PGDLLIMPORT WaitLSNState *waitLSNState;
-extern Size WaitLSNShmemSize(void);
-extern void WaitLSNShmemInit(void);
extern XLogRecPtr GetCurrentLSNForWaitType(WaitLSNType lsnType);
extern void WaitLSNWakeup(WaitLSNType lsnType, XLogRecPtr currentLSN);
extern void WaitLSNCleanup(void);
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 8e3549c3752..2786a7c5ffb 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -541,10 +541,6 @@ typedef struct PgStat_BackendPending
* Functions in pgstat.c
*/
-/* functions called from postmaster */
-extern Size StatsShmemSize(void);
-extern void StatsShmemInit(void);
-
/* Functions called during server startup / shutdown */
extern void pgstat_restore_stats(void);
extern void pgstat_discard_stats(void);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 5aa0f3a8ac1..6ebfafe640d 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -62,8 +62,4 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
-/* shared memory stuff */
-extern Size AutoVacuumShmemSize(void);
-extern void AutoVacuumShmemInit(void);
-
#endif /* AUTOVACUUM_H */
diff --git a/src/include/postmaster/bgworker_internals.h b/src/include/postmaster/bgworker_internals.h
index b789caf4034..b6261bc01df 100644
--- a/src/include/postmaster/bgworker_internals.h
+++ b/src/include/postmaster/bgworker_internals.h
@@ -41,8 +41,6 @@ typedef struct RegisteredBgWorker
extern PGDLLIMPORT dlist_head BackgroundWorkerList;
-extern Size BackgroundWorkerShmemSize(void);
-extern void BackgroundWorkerShmemInit(void);
extern void BackgroundWorkerStateChange(bool allow_new_workers);
extern void ForgetBackgroundWorker(RegisteredBgWorker *rw);
extern void ReportBackgroundWorkerPID(RegisteredBgWorker *rw);
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 47470cba893..36eea0b1ab0 100644
--- a/src/include/postmaster/bgwriter.h
+++ b/src/include/postmaster/bgwriter.h
@@ -39,9 +39,6 @@ extern bool ForwardSyncRequest(const FileTag *ftag, SyncRequestType type);
extern void AbsorbSyncRequests(void);
-extern Size CheckpointerShmemSize(void);
-extern void CheckpointerShmemInit(void);
-
extern bool FirstCallSinceLastCheckpoint(void);
#endif /* _BGWRITER_H */
diff --git a/src/include/postmaster/pgarch.h b/src/include/postmaster/pgarch.h
index faa7609cd81..9772bb573a1 100644
--- a/src/include/postmaster/pgarch.h
+++ b/src/include/postmaster/pgarch.h
@@ -26,8 +26,6 @@
#define MAX_XFN_CHARS 40
#define VALID_XFN_CHARS "0123456789ABCDEF.history.backup.partial"
-extern Size PgArchShmemSize(void);
-extern void PgArchShmemInit(void);
extern bool PgArchCanRestart(void);
pg_noreturn extern void PgArchiverMain(const void *startup_data, size_t startup_data_len);
extern void PgArchWakeup(void);
diff --git a/src/include/postmaster/walsummarizer.h b/src/include/postmaster/walsummarizer.h
index a4c055066b4..b9a755fadbc 100644
--- a/src/include/postmaster/walsummarizer.h
+++ b/src/include/postmaster/walsummarizer.h
@@ -19,8 +19,6 @@
extern PGDLLIMPORT bool summarize_wal;
extern PGDLLIMPORT int wal_summary_keep_time;
-extern Size WalSummarizerShmemSize(void);
-extern void WalSummarizerShmemInit(void);
pg_noreturn extern void WalSummarizerMain(const void *startup_data, size_t startup_data_len);
extern void GetWalSummarizerState(TimeLineID *summarized_tli,
diff --git a/src/include/replication/logicalctl.h b/src/include/replication/logicalctl.h
index 495554c532c..0bc1302f130 100644
--- a/src/include/replication/logicalctl.h
+++ b/src/include/replication/logicalctl.h
@@ -14,8 +14,6 @@
#ifndef LOGICALCTL_H
#define LOGICALCTL_H
-extern Size LogicalDecodingCtlShmemSize(void);
-extern void LogicalDecodingCtlShmemInit(void);
extern void StartupLogicalDecodingStatus(bool last_status);
extern void InitializeProcessXLogLogicalInfo(void);
extern bool ProcessBarrierUpdateXLogLogicalInfo(void);
diff --git a/src/include/replication/logicallauncher.h b/src/include/replication/logicallauncher.h
index 504b710536a..5f0c1b9c682 100644
--- a/src/include/replication/logicallauncher.h
+++ b/src/include/replication/logicallauncher.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT int max_parallel_apply_workers_per_subscription;
extern void ApplyLauncherRegister(void);
extern void ApplyLauncherMain(Datum main_arg);
-extern Size ApplyLauncherShmemSize(void);
-extern void ApplyLauncherShmemInit(void);
-
extern void ApplyLauncherForgetWorkerStartTime(Oid subid);
extern void ApplyLauncherWakeupAtCommit(void);
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index eb46b41b4b7..a69faf6eaaf 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -84,8 +84,4 @@ extern void replorigin_redo(XLogReaderState *record);
extern void replorigin_desc(StringInfo buf, XLogReaderState *record);
extern const char *replorigin_identify(uint8 info);
-/* shared memory allocation */
-extern Size ReplicationOriginShmemSize(void);
-extern void ReplicationOriginShmemInit(void);
-
#endif /* PG_ORIGIN_H */
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 4b4709f6e2c..1a3557de607 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -327,10 +327,6 @@ extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
extern PGDLLIMPORT int idle_replication_slot_timeout_secs;
-/* shmem initialization functions */
-extern Size ReplicationSlotsShmemSize(void);
-extern void ReplicationSlotsShmemInit(void);
-
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
diff --git a/src/include/replication/slotsync.h b/src/include/replication/slotsync.h
index e546d0d050d..d2121cd3ed7 100644
--- a/src/include/replication/slotsync.h
+++ b/src/include/replication/slotsync.h
@@ -31,8 +31,6 @@ pg_noreturn extern void ReplSlotSyncWorkerMain(const void *startup_data, size_t
extern void ShutDownSlotSync(void);
extern bool SlotSyncWorkerCanRestart(void);
extern bool IsSyncingReplicationSlots(void);
-extern Size SlotSyncShmemSize(void);
-extern void SlotSyncShmemInit(void);
extern void SyncReplicationSlots(WalReceiverConn *wrconn);
#endif /* SLOTSYNC_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 85d24c87298..47c07574d4d 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -491,8 +491,6 @@ pg_noreturn extern void WalReceiverMain(const void *startup_data, size_t startup
extern void WalRcvRequestApplyReply(void);
/* prototypes for functions in walreceiverfuncs.c */
-extern Size WalRcvShmemSize(void);
-extern void WalRcvShmemInit(void);
extern void ShutdownWalRcv(void);
extern bool WalRcvStreaming(void);
extern bool WalRcvRunning(void);
diff --git a/src/include/replication/walsender.h b/src/include/replication/walsender.h
index a4df3b8e0ae..8952c848d19 100644
--- a/src/include/replication/walsender.h
+++ b/src/include/replication/walsender.h
@@ -41,8 +41,6 @@ extern void WalSndErrorCleanup(void);
extern void PhysicalWakeupLogicalWalSnd(void);
extern XLogRecPtr GetStandbyFlushRecPtr(TimeLineID *tli);
extern void WalSndSignals(void);
-extern Size WalSndShmemSize(void);
-extern void WalSndShmemInit(void);
extern void WalSndWakeup(bool physical, bool logical);
extern void WalSndInitStopping(void);
extern void WalSndWaitStopping(void);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 8d1e16b5d51..d7f8a8c1e63 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -570,12 +570,8 @@ extern bool StrategyRejectBuffer(BufferAccessStrategy strategy,
extern int StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc);
extern void StrategyNotifyBgWriter(int bgwprocno);
-extern Size StrategyShmemSize(void);
-extern void StrategyInitialize(bool init);
-
/* buf_table.c */
-extern Size BufTableShmemSize(int size);
-extern void InitBufTable(int size);
+extern void BufTableShmemRequest(int size);
extern uint32 BufTableHashCode(BufferTag *tagPtr);
extern int BufTableLookup(BufferTag *tagPtr, uint32 hashcode);
extern int BufTableInsert(BufferTag *tagPtr, uint32 hashcode, int buf_id);
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 4017896f951..26441886035 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -369,10 +369,6 @@ extern void MarkDirtyAllUnpinnedBuffers(int32 *buffers_dirtied,
int32 *buffers_already_dirty,
int32 *buffers_skipped);
-/* in buf_init.c */
-extern void BufferManagerShmemInit(void);
-extern Size BufferManagerShmemSize(void);
-
/* in localbuf.c */
extern void AtProcExit_LocalBuffers(void);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index fa68e6ecece..ee3cb1dc203 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -375,8 +375,6 @@ typedef enum
/*
* function prototypes
*/
-extern void LockManagerShmemInit(void);
-extern Size LockManagerShmemSize(void);
extern void InitLockManagerAccess(void);
extern LockMethod GetLocksMethodTable(const LOCK *lock);
extern LockMethod GetLockTagsMethodTable(const LOCKTAG *locktag);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index e8e06be30c2..206d6b586ad 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -25,10 +25,18 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogPrefetchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogRecoveryShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BufferManagerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StrategyCtlShmemCallbacks)
+
+/* lock manager */
+PG_SHMEM_SUBSYSTEM(LockManagerShmemCallbacks)
/* predicate lock manager */
PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
@@ -36,6 +44,9 @@ PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackendStatusShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(TwoPhaseShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackgroundWorkerShmemCallbacks)
/* shared-inval messaging */
PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
@@ -43,11 +54,26 @@ PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
/* interprocess signaling mechanisms */
PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CheckpointerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(AutoVacuumShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationSlotsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationOriginShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSndShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalRcvShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSummarizerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(PgArchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ApplyLauncherShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SlotSyncShmemCallbacks)
/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(BTreeShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SyncScanShmemCallbacks)
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StatsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WaitLSNShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(LogicalDecodingCtlShmemCallbacks)
/* AIO subsystem. This delegates to the method-specific callbacks */
PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index ddd06304e97..a334e096e4a 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -298,14 +298,6 @@ extern PGDLLIMPORT int pgstat_track_activity_query_size;
extern PGDLLIMPORT PgBackendStatus *MyBEEntry;
-/* ----------
- * Functions called from postmaster
- * ----------
- */
-extern Size BackendStatusShmemSize(void);
-extern void BackendStatusShmemInit(void);
-
-
/* ----------
* Functions called from backends
* ----------
diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index d59c5ad0582..c9ab721c3fd 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -107,9 +107,13 @@ extern PGDLLEXPORT void injection_wait(const char *name,
/* track if injection points attached in this process are linked to it */
static bool injection_point_local = false;
-/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void injection_shmem_request(void *arg);
+static void injection_shmem_init(void *arg);
+
+static const ShmemCallbacks injection_shmem_callbacks = {
+ .request_fn = injection_shmem_request,
+ .init_fn = injection_shmem_init,
+};
/*
* Routine for shared memory area initialization, used as a callback
@@ -126,44 +130,26 @@ injection_point_init_state(void *ptr, void *arg)
ConditionVariableInit(&state->wait_point);
}
-/* Shared memory initialization when loading module */
static void
-injection_shmem_request(void)
+injection_shmem_request(void *arg)
{
- Size size;
-
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
+ static ShmemStructDesc InjectionPointsShmemDesc;
- size = MAXALIGN(sizeof(InjectionPointSharedState));
- RequestAddinShmemSpace(size);
+ ShmemRequestStruct(&InjectionPointsShmemDesc, &(ShmemRequestStructOpts) {
+ .name = "injection_points",
+ .size = sizeof(InjectionPointSharedState),
+ .ptr = (void **) &inj_state,
+ });
}
static void
-injection_shmem_startup(void)
+injection_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_state = ShmemInitStruct("injection_points",
- sizeof(InjectionPointSharedState),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. This is shared with the dynamic
- * initialization using a DSM.
- */
- injection_point_init_state(inj_state, NULL);
- }
-
- LWLockRelease(AddinShmemInitLock);
+ /*
+ * First time through, so initialize. This is shared with the dynamic
+ * initialization using a DSM.
+ */
+ injection_point_init_state(inj_state, NULL);
}
/*
@@ -601,9 +587,5 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- /* Shared memory initialization */
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = injection_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = injection_shmem_startup;
+ RegisterShmemCallbacks(&injection_shmem_callbacks);
}
--
2.47.3
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-27 07:01 Ashutosh Bapat <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 3 replies; 75+ messages in thread
From: Ashutosh Bapat @ 2026-03-27 07:01 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Wed, Mar 25, 2026 at 9:35 PM Ashutosh Bapat
<[email protected]> wrote:
>
> On Tue, Mar 24, 2026 at 9:02 PM Ashutosh Bapat
> <[email protected]> wrote:
> >
> >
> > I will continue from 0008 tomorrow.
> >
>
> I reviewed the documentation part of 0008. I have a few edits attached.
>
> I have just one comment that's not covered in the edits
>
> @@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
> <para>
> Anonymous allocations are allocations that have been made
> with <literal>ShmemAlloc()</literal> directly, rather than via
> - <literal>ShmemInitStruct()</literal> or
> - <literal>ShmemInitHash()</literal>.
> + <literal>ShmemRequestStruct()</literal> or
> + <literal>ShmemRequestHash()</literal>.
> </para>
>
> ShmemInitStruct() and ShmemInitHash() are still the functions to
> allocate named structures. If we are going to keep ShmemInitStruct()
> and ShmemInitHash() around for a while, I think it is more accurate to
> mention them in this sentence along with the new functions.
>
> Will continue reviewing the patch tomorrow.
>
Here's a complete review of 0008 (from version 8). I see you have
already posted v9 which I have not looked at. Please feel free to
ignore comments which aren't applicable anymore. Attached patch has
minor edits and some code arrangement to make it more readable as
mentioned in the comments below. Please incorporate those changes as
applicable if you find them useful.
+ The <function>ShmemRequestStruct()</function> can also be called after
+ system startup, which is useful to allow small allocations in add-in
+ libraries that are not specified in
+ <xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
+ However, after startup the allocation can fail if there is not enough
+ shared memory available. The system reserves some memory for
+ allocations after startup, but that reservation is small.
Do we check the request against the reserved amount or overall memory
available? If the latter, a large request after startup can cause memory
allocated for hash tables to be taken away. This was a problem in the
previous implementation as well, since ShmemAlloc() could be called
after startup. But giving a formal API to do it might encourage more
usage, so it would be good to have some checks in place.
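To make the concern concrete, here is a minimal sketch of the kind of check being asked about. It is not taken from the patch: ToyShmemBudget, toy_shmem_charge() and the field names are hypothetical, and the real allocator state in shmem.c looks different. The idea is only that a post-startup request is charged against the reserve, not against all of the remaining free space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping; not the patch's actual allocator state. */
typedef struct ToyShmemBudget
{
	size_t		free_bytes;		/* all unallocated bytes, reserve included */
	size_t		late_reserve;	/* part of free_bytes kept for late requests */
	bool		startup_done;	/* true once postmaster startup has finished */
} ToyShmemBudget;

/*
 * Charge a request against the budget.  Returns true if it was accepted.
 *
 * Before startup completes, requests may use everything except the reserve.
 * After startup, requests may use only the reserve, so they cannot eat into
 * space that other structures expect to grow into.
 */
static bool
toy_shmem_charge(ToyShmemBudget *b, size_t size)
{
	assert(b->late_reserve <= b->free_bytes);

	if (!b->startup_done)
	{
		if (size > b->free_bytes - b->late_reserve)
			return false;
	}
	else
	{
		if (size > b->late_reserve)
			return false;
		b->late_reserve -= size;
	}
	b->free_bytes -= size;
	return true;
}
```

With such a rule, a large late request fails cleanly instead of silently consuming space that the hash tables were sized to use.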
/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ ShmemCallRequestCallbacks();
It's not clear how we keep the list of registered callbacks across the
backends and also after restart in-sync. How do we make sure that the
callbacks registered at this time are the same callbacks registered
before creating the shared memory? How do we make sure that the
callbacks registered after the startup are also registered after
restart?
- /* re-create shared memory and semaphores */
+ /*
+ * Re-initialize shared memory and semaphores. Note: We don't call
+ * RegisterShmemStructs() here, we keep the old registrations. In
There is no RegisterShmemStructs(). Probably this comment is not required.
+ * This module provides facilities to allocate fixed-size structures in shared
+ * memory, for things like variables shared between all backend processes.
+ * Each such structure has a string name to identify it, specified in the
+ * descriptor when it is requested. shmem_hash.c provides a shared hash table
+ * implementation on top of that.
This wording works well for resizable structures. Thanks.
+ * Shared memory managed by shmem.c can never be freed, once allocated. Each
+ * hash table has its own free list, so hash buckets can be reused when an
+ * item is deleted. However, if one hash table grows very large and then
+ * shrinks, its space cannot be redistributed to other tables. We could build
+ * a simple hash bucket garbage collector if need be. Right now, it seems
+ * unnecessary.
The second sentence onwards belongs in shmem_hash.c, doesn't it?
+} shmem_startup_state;
This isn't just startup state since the backend can toggle between
DONE and LATE_ATTACH_OR_INIT states after the startup. Probably
"shmem_state" would be a better name.
Also, it might be better to separate the enum and the variable
declaration. I was confused for a moment.
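A minimal sketch of the suggested layout follows. The enum value names are hypothetical (only DONE and LATE_ATTACH_OR_INIT are mentioned above, and the patch's actual values may differ); the point is just that the type and the variable are declared separately, and that the state can move back and forth after startup.

```c
#include <assert.h>

/* Hypothetical state names; the patch's actual enum values may differ. */
typedef enum ShmemState
{
	SHMEM_STATE_REQUESTING,		/* collecting requests, no segment yet */
	SHMEM_STATE_INITIALIZING,	/* segment created, init callbacks running */
	SHMEM_STATE_DONE,			/* normal operation */
	SHMEM_STATE_LATE_ATTACH_OR_INIT /* handling a request made after startup */
} ShmemState;

/* The variable is declared on its own, so the type is easy to spot. */
static ShmemState shmem_state = SHMEM_STATE_REQUESTING;

/* Enter the "late request" state; only valid from normal operation. */
static void
shmem_begin_late_request(void)
{
	assert(shmem_state == SHMEM_STATE_DONE);
	shmem_state = SHMEM_STATE_LATE_ATTACH_OR_INIT;
}

/* Return to normal operation once the late request has been handled. */
static void
shmem_end_late_request(void)
{
	assert(shmem_state == SHMEM_STATE_LATE_ATTACH_OR_INIT);
	shmem_state = SHMEM_STATE_DONE;
}
```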
What does B stand for in the enum values?
+static bool AttachOrInit(ShmemStructDesc *desc, bool init_allowed,
bool attach_allowed);
Init in the name can easily lead readers into thinking that the function is
going to invoke the init callback. I think a better name would be
AttachOrAllocate() or something which can not be confused with Init.
+/*
+ * ShmemRequestStruct() --- request a named shared memory area
+ *
+ * Subsystems call this to register their shared memory needs. This is
+ * usually done early in postmaster startup, before the shared memory segment
+ * has been created, so that the size can be included in the estimate for
+ * total amount of shared memory needed. We set aside a small amount of
+ * memory for allocations that happen later, for the benefit of non-preloaded
+ * extensions, but that should not be relied upon.
I don't think we need to reiterate the last sentence here, since it's
already mentioned in the "Usage" section of the documentation and this
API is unrelated to that.
+ * Attach to all the requested memory areas.
+ */
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+ while (!dclist_is_empty(&requested_shmem_areas))
+ {
+ requested_shmem_area *area = dlist_container(requested_shmem_area, node,
+ dclist_pop_head_node(&requested_shmem_areas));
Isn't requested_shmem_areas a List*? Why do we need to pop nodes from it?
+ ShmemStructDesc *desc = area->desc;
+
+ AttachOrInit(desc, false, true);
+ }
+ list_free(requested_shmem_areas);
+ requested_shmem_areas = NIL;
If we pop all the nodes from the list, then the list should be NIL
right? Why do we need to free it?
+ else if (!init_allowed)
+ {
For the sake of documentation and sanity, I would add
Assert(!index_entry) here, possibly with a comment. Otherwise it feels
like we might be leaving a half-initialized entry in the hash table.
What if attach_allowed is false and the entry is not found? Should we
throw an error in that case too? It would be foolish to call
AttachOrInit with both init_allowed and attach_allowed set to false,
but the API allows it and we should check for that.
It feels like we should do something about the arguments. The function
is hard to read. init_allowed is actually the action the caller wants
to take if the entry is not found, and attach_allowed is the action
the caller wants to take if the entry is found.
Also, the comment should explain what attach means here, especially for
fixed-size structures.
Restructuring the code as attached reads better to me.
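For illustration only, one way to make the two boolean arguments harder to misuse is a single mode argument. The sketch below is not the attached restructuring and not the patch's code: ToyIndex, ToyLookupMode and toy_attach_or_allocate() are hypothetical stand-ins for the shmem index and AttachOrInit(). With an enum, the "neither allowed" combination cannot be expressed, and every found/not-found outcome has an explicit result.

```c
#include <assert.h>
#include <string.h>

typedef enum ToyLookupMode
{
	TOY_ATTACH_ONLY,			/* entry must already exist */
	TOY_ALLOCATE_ONLY,			/* entry must not exist yet */
	TOY_ATTACH_OR_ALLOCATE		/* either outcome is acceptable */
} ToyLookupMode;

typedef enum ToyLookupResult
{
	TOY_ATTACHED,
	TOY_ALLOCATED,
	TOY_ERR_NOT_FOUND,
	TOY_ERR_ALREADY_EXISTS,
	TOY_ERR_INDEX_FULL
} ToyLookupResult;

#define TOY_INDEX_SLOTS 8

/* Toy stand-in for the shared memory index: a small array of names. */
typedef struct ToyIndex
{
	const char *names[TOY_INDEX_SLOTS];
	int			nused;
} ToyIndex;

static ToyLookupResult
toy_attach_or_allocate(ToyIndex *index, const char *name, ToyLookupMode mode)
{
	for (int i = 0; i < index->nused; i++)
	{
		if (strcmp(index->names[i], name) == 0)
			return (mode == TOY_ALLOCATE_ONLY) ?
				TOY_ERR_ALREADY_EXISTS : TOY_ATTACHED;
	}

	/* Not found.  Nothing has been inserted yet, so nothing to undo. */
	if (mode == TOY_ATTACH_ONLY)
		return TOY_ERR_NOT_FOUND;
	if (index->nused == TOY_INDEX_SLOTS)
		return TOY_ERR_INDEX_FULL;

	index->names[index->nused++] = name;
	return TOY_ALLOCATED;
}
```

The lookup is done before any insertion, so the attach-only path never leaves a half-initialized entry behind.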
+/*
+ * Reset state on postmaster crash restart.
+ */
+void
+ResetShmemAllocator(void)
+{
I still think this requires a different name since it's not undoing
what InitShmemAllocator() did. Maybe ResetShmemState()?
+void
+RegisterShmemCallbacks(const ShmemCallbacks *callbacks)
... snip ...
+ foreach(lc, requested_shmem_areas)
Doesn't this list contain all the areas, not just those registered in this
instance of the call? Does that mean that we need to have all the
attach functions idempotent? Why can't we deal with the newly
registered areas only?
+ * FIXME: What to do if multiple shmem areas were requested, and some
+ * of them are already initialized but not all?
*/
I doubt if we want to allow attaching to areas which are already
attached since the attach_fn may not be idempotent.
/* Initialize the hash header, plus a copy of the table name */
+ Assert(tabname != NULL);
+ Assert(CurrentDynaHashCxt != NULL);
This looks like a separate patch and separate commit.
+
+ /*
+ * Extra space to reserve in the shared memory segment, but it's not part
+ * of the struct itself. This is used for shared memory hash tables that
+ * can grow beyond the initial size when more buckets are allocated.
+ */
+ size_t extra_size;
When we introduce resizable structures (where even the hash table
directly itself could be resizable), we will introduce a new field
max_size which is easy to get confused with extra_size. Maybe we can
rename extra_size to something like "auxiliary_size" to mean size of
the auxiliary parts of the structure which are not part of the main
struct itself.
+ /*
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ */
+ size_t max_size;
+
+ /*
+ * init_size is the number of hashtable entries to preallocate. For a
+ * table whose maximum size is certain, this should be equal to max_size;
+ * that ensures that no run-time out-of-shared-memory failures can occur.
+ */
+ size_t init_size;
Every time I look at these two fields, I question whether those are the
number of entries (i.e. size of the hash table) or number of bytes
(size of the memory). I know it's the former, but it indicates that
something needs to be changed here, like changing the names to have
_entries instead of _size, or changing the type to int64 or some such.
Renaming to _entries would conflict with dynahash APIs since they use
_size, so maybe the latter?
+
+/*
+ * Shared memory is reserved and allocated in stages at postmaster startup,
+ * and in EXEC_BACKEND mode, there's some extra work done to "attach" to them
The comma after EXEC_BACKEND mode is a bit confusing. It makes me
think that the clause after the comma is detached from the
EXEC_BACKEND mode. Maybe revise as
"Shared memory is reserved and allocated to various shared memory
structures in stages at postmaster startup. In EXEC_BACKEND mode,
there's some extra work done to "attach" to them at backend startup.
ShmemCallbacks holds callback functions that are called at different
stages."
+ * at backend startup. ShmemCallbacks holds callback functions that are
+ * called at different stages.
+ */
+typedef struct ShmemCallbacks
+{
+ /* SHMEM_* flags */
+ int flags;
I think we should define the flags before this. Also, SHMEM_ looks like too
generic a prefix, maybe SHMEM_CALLBACKS_ or SHMEM_CB_. With those
changes it will look something like the attached patch.
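As a rough sketch of that arrangement (not the attached patch): the flag names below are hypothetical, since the actual flag set is not quoted here. Of the struct members, request_fn and init_fn appear in the patch, attach_fn is inferred from the discussion above, and arg is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags with a narrower prefix, defined before the struct. */
#define TOY_SHMEM_CB_ALLOW_AFTER_STARTUP	(1 << 0)
#define TOY_SHMEM_CB_SKIP_ATTACH			(1 << 1)

typedef void (*ToyShmemCallback) (void *arg);

typedef struct ToyShmemCallbacks
{
	int			flags;			/* bitmask of TOY_SHMEM_CB_* values */
	ToyShmemCallback request_fn;	/* called to request shared memory */
	ToyShmemCallback init_fn;	/* called once to initialize the memory */
	ToyShmemCallback attach_fn;	/* called when a backend attaches */
	void	   *arg;			/* passed through to the callbacks */
} ToyShmemCallbacks;

/* Test a single flag; keeps bit arithmetic out of the callers. */
static bool
toy_callbacks_allowed_after_startup(const ToyShmemCallbacks *cb)
{
	return (cb->flags & TOY_SHMEM_CB_ALLOW_AFTER_STARTUP) != 0;
}
```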
@@ -50,7 +50,6 @@ static InjIoErrorState *inj_io_error_state;
static shmem_request_hook_type prev_shmem_request_hook = NULL;
static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
-
It's good to get rid of an extra line, but maybe a separate commit.
--
Best Wishes,
Ashutosh Bapat
Attachments:
[application/octet-stream] 0008_edits.patch.nocibot (11.8K, 2-0008_edits.patch.nocibot)
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-27 08:58 Ashutosh Bapat <[email protected]>
parent: Ashutosh Bapat <[email protected]>
2 siblings, 0 replies; 75+ messages in thread
From: Ashutosh Bapat @ 2026-03-27 08:58 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
Hi Heikki,
A correction in my previous email.
On Fri, Mar 27, 2026 at 12:31 PM Ashutosh Bapat
<[email protected]> wrote:
>
>
> Here's a complete review of 0008 (from version 8).
This should be 0008 from version 7. I haven't looked at v8 yet.
--
Best Wishes,
Ashutosh Bapat
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-27 23:17 Heikki Linnakangas <[email protected]>
parent: Ashutosh Bapat <[email protected]>
2 siblings, 1 reply; 75+ messages in thread
From: Heikki Linnakangas @ 2026-03-27 23:17 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
Thanks, will incorporate your comments in next version. Replying to just
a few of them here:
On 27/03/2026 09:01, Ashutosh Bapat wrote:
> /* Restore basic shared memory pointers */
> if (UsedShmemSegAddr != NULL)
> + {
> InitShmemAllocator(UsedShmemSegAddr);
> + ShmemCallRequestCallbacks();
>
> It's not clear how we keep the list of registered callbacks across the
> backends and also after restart in-sync. How do we make sure that the
> callbacks registered at this time are the same callbacks registered
> before creating the shared memory? How do we make sure that the
> callbacks registered after the startup are also registered after
> restart?
On Unix systems, the registered callbacks are inherited by fork(), and
also survive over crash restart. With EXEC_BACKEND, the assumption is
that calling a library's _PG_init() function will register the same
callbacks every time. We make the same assumption today with the
shmem_startup hook.
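A minimal sketch of the fork() inheritance described above, assuming a made-up per-process registry (demo_register() and friends are hypothetical names, not the patch's API). Callbacks registered before fork() are visible in the child; one registered only in the child does not show up in the parent:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a per-process callback registry. */
typedef void (*demo_callback) (void);

#define DEMO_MAX_CALLBACKS 8
demo_callback demo_callbacks[DEMO_MAX_CALLBACKS];
int demo_ncallbacks = 0;

void
demo_register(demo_callback cb)
{
    assert(demo_ncallbacks < DEMO_MAX_CALLBACKS);
    demo_callbacks[demo_ncallbacks++] = cb;
}

void
demo_noop(void)
{
}

int
demo_count_in_parent(void)
{
    return demo_ncallbacks;
}

/*
 * Fork a child, optionally register one more callback in the child only,
 * and return the number of callbacks the child saw (via its exit status).
 * Returns -1 on failure.
 */
int
demo_count_in_child(int register_in_child)
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        /* Child: starts with a copy of the parent's registry. */
        if (register_in_child)
            demo_register(demo_noop);
        _exit(demo_ncallbacks);
    }
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With EXEC_BACKEND there is no such copy, which is why the same registrations have to be repeated by _PG_init() in every process.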
> +void
> +RegisterShmemCallbacks(const ShmemCallbacks *callbacks)
> ... snip ...
> + foreach(lc, requested_shmem_areas)
>
> Doesn't this list contain all the areas, not just registered in this
> instance of the call. Does that mean that we need to have all the
> attach functions idempotent? Why can't we deal with the newly
> registered areas only?
registered_shmem_areas is supposed to be empty when the function is
entered. There's an assertion for that too before the foreach().
However, it's missing this, after processing the list:
list_free_deep(requested_shmem_areas);
requested_shmem_areas = NIL;
Because of that, this will fail if you load multiple extensions that
call RegisterShmemCallbacks() in the same session. Will fix that.
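A rough sketch of why the reset matters, using a hypothetical array-based pending list in place of requested_shmem_areas (all demo_* names are made up for illustration; the real code uses a List and has an assertion on entry). Without the reset, a second registration in the same session sees the earlier requests again:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_AREAS 16

/* Hypothetical pending-request list; stands in for requested_shmem_areas. */
const char *demo_pending[DEMO_MAX_AREAS];
int demo_npending = 0;

/* Queue one area request, as a request callback would. */
void
demo_request(const char *name)
{
    assert(demo_npending < DEMO_MAX_AREAS);
    demo_pending[demo_npending++] = name;
}

/*
 * Process every pending request and return how many were handled.
 * The list is emptied afterwards only if reset_afterwards is nonzero.
 */
int
demo_process_pending(int reset_afterwards)
{
    int handled = 0;

    for (int i = 0; i < demo_npending; i++)
        handled++;              /* attach or init would happen here */

    if (reset_afterwards)
        demo_npending = 0;      /* the step that was missing */

    return handled;
}
```

In this sketch, skipping the reset means the first extension's area is handled twice once a second extension registers; with the reset, each call only sees its own requests.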
> + /*
> + * Extra space to reserve in the shared memory segment, but it's not part
> + * of the struct itself. This is used for shared memory hash tables that
> + * can grow beyond the initial size when more buckets are allocated.
> + */
> + size_t extra_size;
>
> When we introduce resizable structures (where even the hash table
> directly itself could be resizable), we will introduce a new field
> max_size which is easy to get confused with extra_size. Maybe we can
> rename extra_size to something like "auxilliary_size" to mean size of
> the auxiliary parts of the structure which are not part of the main
> struct itself.
>
> + /*
> + * max_size is the estimated maximum number of hashtable entries. This is
> + * not a hard limit, but the access efficiency will degrade if it is
> + * exceeded substantially (since it's used to compute directory size and
> + * the hash table buckets will get overfull).
> + */
> + size_t max_size;
> +
> + /*
> + * init_size is the number of hashtable entries to preallocate. For a
> + * table whose maximum size is certain, this should be equal to max_size;
> + * that ensures that no run-time out-of-shared-memory failures can occur.
> + */
> + size_t init_size;
>
> Every time I look at these two fields, I question whether those are the
> number of entries (i.e. size of the hash table) or number of bytes
> (size of the memory). I know it's the former, but it indicates that
> something needs to be changed here, like changing the names to have
> _entries instead of _size, or changing the type to int64 or some such.
> Renaming to _entries would conflict with dynahash APIs since they use
> _size, so maybe the latter?
Agreed.
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-30 04:50 Ashutosh Bapat <[email protected]>
parent: Heikki Linnakangas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-03-30 04:50 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sat, Mar 28, 2026 at 4:47 AM Heikki Linnakangas <[email protected]> wrote:
>
> Thanks, will incorporate your comments in next version. Replying to just
> a few of them here:
>
> On 27/03/2026 09:01, Ashutosh Bapat wrote:
> > /* Restore basic shared memory pointers */
> > if (UsedShmemSegAddr != NULL)
> > + {
> > InitShmemAllocator(UsedShmemSegAddr);
> > + ShmemCallRequestCallbacks();
> >
> > It's not clear how we keep the list of registered callbacks across the
> > backends and also after restart in-sync. How do we make sure that the
> > callbacks registered at this time are the same callbacks registered
> > before creating the shared memory? How do we make sure that the
> > callbacks registered after the startup are also registered after
> > restart?
>
> On Unix systems, the registered callbacks are inherited by fork(), and
> also survive over crash restart. With EXEC_BACKEND, the assumption is
> that calling a library's _PG_init() function will register the same
> callbacks every time. We make the same assumption today with the
> shmem_startup hook.
>
RegisterShmemCallbacks() may be called after the startup, and it will
add new areas to the shared memory. How are those registries synced
across the backends? From your answer below, those registries are not
synced across backends. They will be wiped out by the restart and
won't be registered again. Is that right? I think we need to document
this fact and also the need to call RegisterShmemCallbacks() from all
the backends where the new areas are required after the startup.
Sorry, my question was not complete.
> > +void
> > +RegisterShmemCallbacks(const ShmemCallbacks *callbacks)
> > ... snip ...
> > + foreach(lc, requested_shmem_areas)
> >
> > Doesn't this list contain all the areas, not just registered in this
> > instance of the call. Does that mean that we need to have all the
> > attach functions idempotent? Why can't we deal with the newly
> > registered areas only?
>
> registered_shmem_areas is supposed to be empty when the function is
> entered. There's an assertion for that too before the foreach().
>
> However, it's missing this, after processing the list:
>
> list_free_deep(requested_shmem_areas);
> requested_shmem_areas = NIL;
>
> Because of that, this will fail if you load multiple extensions that
> call RegisterShmemCallbacks() in the same session. Will fix that.
--
Best Wishes,
Ashutosh Bapat
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-30 12:20 Ashutosh Bapat <[email protected]>
parent: Ashutosh Bapat <[email protected]>
2 siblings, 0 replies; 75+ messages in thread
From: Ashutosh Bapat @ 2026-03-30 12:20 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Fri, Mar 27, 2026 at 12:31 PM Ashutosh Bapat
<[email protected]> wrote:
>
> On Wed, Mar 25, 2026 at 9:35 PM Ashutosh Bapat
> <[email protected]> wrote:
> >
> > On Tue, Mar 24, 2026 at 9:02 PM Ashutosh Bapat
> > <[email protected]> wrote:
> > >
> > >
> > > I will continue from 0008 tomorrow.
> > >
> >
> > I reviewed the documentation part of 0008. I have a few edits attached.
> >
> > I have just one comment that's not covered in the edits
> >
> > @@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
> > <para>
> > Anonymous allocations are allocations that have been made
> > with <literal>ShmemAlloc()</literal> directly, rather than via
> > - <literal>ShmemInitStruct()</literal> or
> > - <literal>ShmemInitHash()</literal>.
> > + <literal>ShmemRequestStruct()</literal> or
> > + <literal>ShmemRequestHash()</literal>.
> > </para>
> >
> > ShmemInitStruct() and ShmemInitHash() are still the functions to
> > allocate named structures. If we are going to keep ShmemInitStruct()
> > and ShmemInitHash() around for a while, I think it is more accurate to
> > mention them in this sentence along with the new functions.
> >
> > Will continue reviewing the patch tomorrow.
> >
>
> Here's a complete review of 0008 (from version 7). I see you have
> already posted v8 which I have not looked at.
I rebased my resizable shared memory structures patch on top of your
v8 patch to check if newer APIs are still useful for resizable
structures. Attached is the resultant WIP patch. I will rebase and
finalize it once these patches are committed.
Here are some review comments coming out of that exercise.
The patch subtly changes what allocated_size means in ShmemIndexEntry.
Without this patch, the next structure starts at entry->location +
allocated_size. With it, a structure starts at a type-aligned address,
and the next structure may start anywhere after size bytes from that
address. I think these patches get it right (the code in current HEAD is
also correct, since it uses the same alignment for all structures). But
the comments for allocated_size should make it clear that it is no
longer the size available to the structure, as it used to be. The
current comments and code in HEAD may make one think so.
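To make the distinction concrete, here is a small sketch of a bump allocator with per-request alignment. It is not the shmem.c code: DemoEntry, demo_alloc() and the choice to count leading padding in allocated_size are assumptions made only for illustration. It shows how the space consumed by an allocation can differ from the size usable by the structure:

```c
#include <assert.h>
#include <stddef.h>

/* Round 'offset' up to the next multiple of 'alignment' (a power of two). */
size_t
demo_align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical index entry; field names mirror the discussion only. */
typedef struct DemoEntry
{
    size_t location;        /* aligned start of the structure */
    size_t size;            /* bytes usable by the structure */
    size_t allocated_size;  /* bytes consumed, including padding (assumed) */
} DemoEntry;

/* Carve 'size' bytes at the requested alignment out of a bump allocator. */
DemoEntry
demo_alloc(size_t *free_offset, size_t size, size_t alignment)
{
    DemoEntry entry;

    entry.location = demo_align_up(*free_offset, alignment);
    entry.size = size;
    entry.allocated_size = (entry.location - *free_offset) + size;
    *free_offset = entry.location + size;
    return entry;
}
```

Starting from free offset 10, a 24-byte request aligned to 64 lands at offset 64 and consumes 78 bytes in this sketch, so location + allocated_size is not where the next structure begins.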
What if the caller changes the ShmemStructDesc, which seems to be the
handle we are talking about upthread? Should we make it opaque so that
the callers can not play with it?
--
Best Wishes,
Ashutosh Bapat
Attachments:
[application/octet-stream] resizable_shmem_struct.patch.nocibot (41.1K, 2-resizable_shmem_struct.patch.nocibot)
* Re: Better shared data structure management and resizable shared data structures
@ 2026-03-30 20:15 Heikki Linnakangas <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Heikki Linnakangas @ 2026-03-30 20:15 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 30/03/2026 07:50, Ashutosh Bapat wrote:
> On Sat, Mar 28, 2026 at 4:47 AM Heikki Linnakangas <[email protected]> wrote:
>> On 27/03/2026 09:01, Ashutosh Bapat wrote:
>>> /* Restore basic shared memory pointers */
>>> if (UsedShmemSegAddr != NULL)
>>> + {
>>> InitShmemAllocator(UsedShmemSegAddr);
>>> + ShmemCallRequestCallbacks();
>>>
>>> It's not clear how we keep the list of registered callbacks across the
>>> backends and also after restart in-sync. How do we make sure that the
>>> callbacks registered at this time are the same callbacks registered
>>> before creating the shared memory? How do we make sure that the
>>> callbacks registered after the startup are also registered after
>>> restart?
>>
>> On Unix systems, the registered callbacks are inherited by fork(), and
>> also survive over crash restart. With EXEC_BACKEND, the assumption is
>> that calling a library's _PG_init() function will register the same
>> callbacks every time. We make the same assumption today with the
>> shmem_startup hook.
>
> RegisterShmemCallbacks() may be called after the startup, and it will
> add new areas to the shared memory. How are those registries synced
> across the backends? From your answer below, those registries are not
> synced across backends. They will be wiped out by the restart and
> won't be registered again. Is that right? I think we need to document
> this fact and also the need to call RegisterShmemCallbacks() from all
> the backends where the new areas are required after the startup.
Correct. Ok, I'll add a note to the comment on RegisterShmemCallbacks() to
call that out more explicitly. Hope it helps.
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-01 11:59 Ashutosh Bapat <[email protected]>
parent: Heikki Linnakangas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-01 11:59 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Tue, Mar 31, 2026 at 1:45 AM Heikki Linnakangas <[email protected]> wrote:
>
> On 30/03/2026 07:50, Ashutosh Bapat wrote:
> > On Sat, Mar 28, 2026 at 4:47 AM Heikki Linnakangas <[email protected]> wrote:
> >> On 27/03/2026 09:01, Ashutosh Bapat wrote:
> >>> /* Restore basic shared memory pointers */
> >>> if (UsedShmemSegAddr != NULL)
> >>> + {
> >>> InitShmemAllocator(UsedShmemSegAddr);
> >>> + ShmemCallRequestCallbacks();
> >>>
> >>> It's not clear how we keep the list of registered callbacks across the
> >>> backends and also after restart in-sync. How do we make sure that the
> >>> callbacks registered at this time are the same callbacks registered
> >>> before creating the shared memory? How do we make sure that the
> >>> callbacks registered after the startup are also registered after
> >>> restart?
> >>
> >> On Unix systems, the registered callbacks are inherited by fork(), and
> >> also survive over crash restart. With EXEC_BACKEND, the assumption is
> >> that calling a library's _PG_init() function will register the same
> >> callbacks every time. We make the same assumption today with the
> >> shmem_startup hook.
> >
> > RegisterShmemCallbacks() may be called after the startup, and it will
> > add new areas to the shared memory. How are those registries synced
> > across the backends? From your answer below, those registries are not
> > synced across backends. They will be wiped out by the restart and
> > won't be registered again. Is that right? I think we need to document
> > this fact and also the need to call RegisterShmemCallbacks() from all
> > the backends where the new areas are required after the startup.
>
> Correct. Ok, I'll add a note to comment on RegisterShmemCallbacks() to
> call that out more explicitly, hope it helps.
>
> - Heikki
>
Continuing the review, starting with 0007.
0007
-------
Subject: [PATCH v8 07/16] Add test module to test after-startup shmem
allocations
I like the idea.
+ *
+ * XXX This module provides interface functions for C functionality to SQL, to
+ * make it possible to test AIO related behavior in a targeted way from SQL.
+ * It'd not generally be safe to export these functions to SQL, but for a test
+ * that's fine.
This mentions test_aio - needs to be rewritten for test_shmem.
+
+#include "access/relation.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
I don't think we need access/relation.h. Others seem ok, but I haven't checked.
In order to better test the difference between EXEC_BACKEND and
non-EXEC_BACKEND builds, please consider incorporating the attached
patch v8-0009-edits.diff
0008
------
- LWLockRelease(AddinShmemInitLock);
+ /* The hash table must be initialized already */
+ Assert(pgss_hash != NULL);
Does it make sense to also Assert(pgss)? A broader question: do we
want to make it a pattern that every user of ShmemRequest*() also
Assert()s that the pointer is non-NULL in the init callback? It tests
that the ShmemRequest*() call, which is far from init_fn, is working
correctly.
/*
- * If we're in the postmaster (or a standalone backend...), set up a shmem
- * exit hook to dump the statistics to disk.
+ * Set up a shmem exit hook to dump the statistics to disk on postmaster
+ * (or standalone backend) exit.
*/
- if (!IsUnderPostmaster)
- on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
-
- /*
- * Done if some other process already completed our initialization.
- */
- if (found)
- return;
+ on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
Given that the structures are registered only at startup, this
function will be called only from the postmaster. But since structures
can also be registered and initialized after startup in any backend,
it's better to at least Assert(!IsUnderPostmaster) at the beginning of
this function. The code below is not expected to run in any backend
either, so the assertion would be a good safety catch for that too.
/*
+ * Load any pre-existing statistics from file.
+ *
* Note: we don't bother with locks here, because there should be no other
* processes running when this code is reached.
*/
I was a bit worried that the code next to read stat files is being
crammed in init_fn, but given that the contents of the files are used
to initialize the shared hash table, I think this is fine.
0009
-------
+void
+RegisterBuiltinShmemCallbacks(void)
+{
+ const ShmemCallbacks *builtin_subsystems[] = {
+#define PG_SHMEM_SUBSYSTEM(subsystem_callbacks) &subsystem_callbacks,
+#include "storage/subsystemlist.h"
+#undef PG_SHMEM_SUBSYSTEM
+ };
+
+ for (int i = 0; i < lengthof(builtin_subsystems); i++)
+ RegisterShmemCallbacks(builtin_subsystems[i]);
+}
+
I don't think we need to use a separate array here, we can just call
RegisterShmemCallbacks() directly in the macro as attached.
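A sketch of the idea, with the caveat that everything named demo_* or DEMO_* is hypothetical and that the list is written as a macro here rather than an included header, so that the example is self-contained. The per-subsystem macro expands to the registration call itself, so no intermediate array is needed:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callbacks struct; only a name, for illustration. */
typedef struct DemoCallbacks
{
    const char *name;
} DemoCallbacks;

const DemoCallbacks demo_lock_callbacks = {"lock"};
const DemoCallbacks demo_proc_callbacks = {"proc"};
const DemoCallbacks demo_buffer_callbacks = {"buffer"};

/* Stand-in for the contents of a subsystem list header. */
#define DEMO_SUBSYSTEM_LIST \
    DEMO_SUBSYSTEM(demo_lock_callbacks) \
    DEMO_SUBSYSTEM(demo_proc_callbacks) \
    DEMO_SUBSYSTEM(demo_buffer_callbacks)

const DemoCallbacks *demo_registered[8];
int demo_nregistered = 0;

void
demo_register(const DemoCallbacks *callbacks)
{
    assert(demo_nregistered < 8);
    demo_registered[demo_nregistered++] = callbacks;
}

/* Each list entry expands directly into a registration call. */
void
demo_register_builtin(void)
{
#define DEMO_SUBSYSTEM(callbacks) demo_register(&callbacks);
    DEMO_SUBSYSTEM_LIST
#undef DEMO_SUBSYSTEM
}

int
demo_registered_count(void)
{
    return demo_nregistered;
}

const char *
demo_registered_name(int i)
{
    return demo_registered[i]->name;
}
```

Registration order follows the order of entries in the list, same as with the array-and-loop form.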
0011
------
+ InjectionPointAttach("aio-process-completion-before-shared",
+ "test_aio",
+ "inj_io_short_read",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-process-completion-before-shared");
+
+ InjectionPointAttach("aio-worker-after-reopen",
+ "test_aio",
+ "inj_io_reopen",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-worker-after-reopen");
Attaching and loading an injection point shouldn't be part of the
shared memory initialization. It doesn't feel like it should be part
of shmem_startup_hook either. So not a fault of this patch. I am
wondering why it can't be done in the tests themselves?
0012
------
@@ -663,6 +663,8 @@ SubPostmasterMain(int argc, char *argv[])
*/
LocalProcessControlFile(false);
+ RegisterBuiltinShmemCallbacks();
+
Shouldn't this be part of the previous patch?
-void
-InitProcGlobal(void)
+static void
+ProcGlobalShmemInit(void *arg)
{
I have reviewed most of this patch in earlier versions of this
patchset except this part, which is better than its last version.
Will continue to review the rest of the patches tomorrow.
--
Best Wishes,
Ashutosh Bapat
Attachments:
[application/octet-stream] v8-0007-edits.diff.nocibot (1.5K, 2-v8-0007-edits.diff.nocibot)
[application/octet-stream] v8-0009-edits.diff.nocibot (697B, 3-v8-0009-edits.diff.nocibot)
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-01 18:17 Heikki Linnakangas <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 2 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-01 18:17 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
Yet another version attached (also available at:
https://github.com/hlinnaka/postgres/tree/shmem-init-refactor-9). The
main change is the shape of the ShmemRequest*() calls:
On 27/03/2026 02:51, Heikki Linnakangas wrote:
> Another idea is to use a macro to hide that from pgindent, which would
> make the calls a little less verbose anyway:
>
> #define ShmemRequestStruct(desc, ...) ShmemRequestStructWithOpts(desc,
> &(ShmemRequestStructOpts) { __VA_ARGS__ })
>
> Then the call would be simply:
>
> ShmemRequestStruct(&pgssSharedStateDesc,
> .name = "pg_stat_statements",
> .size = sizeof(pgssSharedState),
> .ptr = (void **) &pgss,
> );
I went with that approach. We're already doing something similar with
XL_ROUTINE in xlogreader.h:
#define XL_ROUTINE(...) &(XLogReaderRoutine){__VA_ARGS__}
The calls look like this:
xlogreader =
XLogReaderAllocate(wal_segment_size, NULL,
XL_ROUTINE(.page_read = &XLogPageRead,
.segment_open = NULL,
.segment_close = wal_segment_close),
private);
If we followed that example, ShmemRequestStruct() calls would look like
this:
ShmemRequestStruct(&pgssSharedStateDesc,
                   SHMEM_STRUCT_OPTS(.name = "pg_stat_statements",
                                     .size = sizeof(pgssSharedState),
                                     .ptr = (void **) &pgss));
However, I don't like the deep indentation, it feels like the important
stuff is buried to the right. And pgindent insists on that. So I went
with the proposal I quoted above, turning ShmemRequestStruct(...) itself
into a macro. If you need more complex options setup, you can set up the
struct without the macro and call ShmemRequestStructWithOpts() directly,
but so far all of the callers can use the macro.
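For readers unfamiliar with the trick, a self-contained sketch of the pattern follows. The names (DemoRequestOpts, DemoRequestWithOpts, DemoRequest) are hypothetical and only mimic the shape of the calls discussed here. The variadic arguments become the designated initializers of a C99 compound literal, and any field not mentioned is zero-initialized:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct; field names are illustrative only. */
typedef struct DemoRequestOpts
{
    const char *name;
    size_t      size;
    void      **ptr;
} DemoRequestOpts;

/* Copy of the last request, kept so the call's effect can be inspected. */
DemoRequestOpts demo_last_request;

void
DemoRequestWithOpts(const DemoRequestOpts *opts)
{
    /* The compound literal lives only for the enclosing block, so copy it. */
    demo_last_request = *opts;
}

/* The macro arguments are pasted into a compound literal's initializer. */
#define DemoRequest(...) \
    DemoRequestWithOpts(&(DemoRequestOpts){__VA_ARGS__})

typedef struct DemoState
{
    int counter;
} DemoState;

DemoState *demo_state = NULL;

void
demo_make_request(void)
{
    DemoRequest(.name = "demo_state",
                .size = sizeof(DemoState),
                .ptr = (void **) &demo_state);
}

const char *
demo_last_name(void)
{
    return demo_last_request.name;
}

size_t
demo_last_size(void)
{
    return demo_last_request.size;
}

int
demo_last_ptr_matches(void)
{
    return demo_last_request.ptr == (void **) &demo_state;
}
```

A caller needing more elaborate setup can still fill in a DemoRequestOpts variable by hand and call DemoRequestWithOpts() directly.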
Ashutosh, I think I've addressed most of your comments so far. I'm
replying to just a few of them here that might need more discussion:
>
> +} shmem_startup_state;
>
> This isn't just startup state since the backend can toggle between
> DONE and LATE_ATTACH_OR_INIT states after the startup. Probably
> "shmem_state" would be a better name.
Renamed to "shmem_request_state". And renamed "LATE_ATTACH_OR_INIT" to
"AFTER_STARTUP_ATTACH_OR_INIT" to match the terminology I used elsewhere.
I'm still not entirely happy with this state machine. It seems useful to
have it for sanity checking, but it still feels a little unclear what
state you're in at different points in the code, and as an aesthetic
thing, the whole enum feels too prominent given that it's just for
sanity checks.
> + ShmemStructDesc *desc = area->desc;
> +
> + AttachOrInit(desc, false, true);
> + }
> + list_free(requested_shmem_areas);
> + requested_shmem_areas = NIL;
>
> If we pop all the nodes from the list, then the list should be NIL
> right? Why do we need to free it?
>
> + else if (!init_allowed)
> + {
>
> For the sake of documentation and sanity, I would add
> Assert(!index_entry) here, possibly with a comment. Otherwise it feels
> like we might be leaving a half-initialized entry in the hash table.
>
> What if attach_allowed is false and the entry is not found? Should we
> throw an error in that case too? It would be foolish to call
> AttachOrInit with both init_allowed and attach_allowed set to false,
> but the API allows it and we should check for that.
>
> It feels like we should do something about the arguments. The function
> is hard to read. init_allowed is actually the action the caller wants
> to take if the entry is not found, and attach_allowed is the action
> the caller wants to take if the entry is found.
>
> Also explain in the comment what does attach mean here especially in
> case of fixed sized structures.
I renamed it to AttachOrInitShmemIndexEntry, and the args to 'may_init'
and 'may_attach'. But more importantly I added comments to explain the
different usages. Hope that helps.
On 01/04/2026 14:59, Ashutosh Bapat wrote:
> 0008
> ------
> - LWLockRelease(AddinShmemInitLock);
> + /* The hash table must be initialized already */
> + Assert(pgss_hash != NULL);
>
> Does it make sense to also Assert(pgss)? A broader question: do we
> want to make it a pattern that every user of ShmemRequest*() also
> Assert()s that the pointer is non-NULL in the init callback? It tests
> that the ShmemRequest*() call, which is far from init_fn, is working
> correctly.
The function does a lot of accesses of 'pgss' so if that's NULL you'll
get a crash pretty quickly. I'm not sure if the Assert(pgss_hash !=
NULL) is really needed either, but I'm inclined to keep it, as pgss_hash
might not otherwise be accessed in the function, and there are runtime
checks for it in the other functions, so if it's not initialized for
some reason, things might still appear to work to some extent. I don't
think I want to have that as a broader pattern though.
> + /*
> + * Extra space to reserve in the shared memory segment, but it's not part
> + * of the struct itself. This is used for shared memory hash tables that
> + * can grow beyond the initial size when more buckets are allocated.
> + */
> + size_t extra_size;
>
> When we introduce resizable structures (where even the hash table
> directly itself could be resizable), we will introduce a new field
> max_size which is easy to get confused with extra_size. Maybe we can
> rename extra_size to something like "auxilliary_size" to mean size of
> the auxiliary parts of the structure which are not part of the main
> struct itself.
>
> + /*
> + * max_size is the estimated maximum number of hashtable entries. This is
> + * not a hard limit, but the access efficiency will degrade if it is
> + * exceeded substantially (since it's used to compute directory size and
> + * the hash table buckets will get overfull).
> + */
> + size_t max_size;
> +
> + /*
> + * init_size is the number of hashtable entries to preallocate. For a
> + * table whose maximum size is certain, this should be equal to max_size;
> + * that ensures that no run-time out-of-shared-memory failures can occur.
> + */
> + size_t init_size;
>
> Every time I look at these two fields, I question whether those are the
> number of entries (i.e. size of the hash table) or number of bytes
> (size of the memory). I know it's the former, but it indicates that
> something needs to be changed here, like changing the names to have
> _entries instead of _size, or changing the type to int64 or some such.
> Renaming to _entries would conflict with dynahash APIs since they use
> _size, so maybe the latter?
I hear you, but I didn't change these yet. If we go with the patches
from the "Shared hash table allocations" thread, max_size and init_size
will be merged into one. I'll try to settle that thread before making
changes here.
> /*
> - * If we're in the postmaster (or a standalone backend...), set up a shmem
> - * exit hook to dump the statistics to disk.
> + * Set up a shmem exit hook to dump the statistics to disk on postmaster
> + * (or standalone backend) exit.
> */
> - if (!IsUnderPostmaster)
> - on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
> -
> - /*
> - * Done if some other process already completed our initialization.
> - */
> - if (found)
> - return;
> + on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
> Given that the structures are registered only at startup, this
> function will be called only from the postmaster. But since structures
> can also be registered and initialized after startup in any backend,
> it's better to at least Assert(!IsUnderPostmaster) at the beginning of
> this function. The code below is not expected to run in any backend
> either, so the assertion would be a good safety catch for that too.
Ok, added an assertion.
> /*
> + * Load any pre-existing statistics from file.
> + *
> * Note: we don't bother with locks here, because there should be no other
> * processes running when this code is reached.
> */
>
> I was a bit worried that the code next to read stat files is being
> crammed in init_fn, but given that the contents of the files are used
> to initialize the shared hash table, I think this is fine.
Yeah, I went through that train of thought too. Loading the file into
the hash table is a kind of initialization.
> 0009
> -------
> +void
> +RegisterBuiltinShmemCallbacks(void)
> +{
> + const ShmemCallbacks *builtin_subsystems[] = {
> +#define PG_SHMEM_SUBSYSTEM(subsystem_callbacks) &subsystem_callbacks,
> +#include "storage/subsystemlist.h"
> +#undef PG_SHMEM_SUBSYSTEM
> + };
> +
> + for (int i = 0; i < lengthof(builtin_subsystems); i++)
> + RegisterShmemCallbacks(builtin_subsystems[i]);
> +}
> +
>
> I don't think we need to use a separate array here, we can just call
> RegisterShmemCallbacks() directly in the macro as attached.
Ah, clever.
> 0011
> ------
> + InjectionPointAttach("aio-process-completion-before-shared",
> + "test_aio",
> + "inj_io_short_read",
> + NULL,
> + 0);
> + InjectionPointLoad("aio-process-completion-before-shared");
> +
> + InjectionPointAttach("aio-worker-after-reopen",
> + "test_aio",
> + "inj_io_reopen",
> + NULL,
> + 0);
> + InjectionPointLoad("aio-worker-after-reopen");
>
> Attaching and loading an injection point shouldn't be part of the
> shared memory initialization. It doesn't feel like it should be part
> of shmem_startup_hook either. So not a fault of this patch. I am
> wondering why it can't be done in the tests themselves?
I think it's the same reason that's explained in the comment in
test_aio_shmem_attach():
> /*
> * Pre-load the injection points now, so we can call them in a critical
> * section.
> */
> #ifdef USE_INJECTION_POINTS
> InjectionPointLoad("aio-process-completion-before-shared");
> InjectionPointLoad("aio-worker-after-reopen");
> elog(LOG, "injection point loaded");
> #endif
> -void
> -InitProcGlobal(void)
> +static void
> +ProcGlobalShmemInit(void *arg)
> {
I'm not sure what you meant to say here, but I did notice that there
were a bunch of references to InitProcGlobal() left over in comments.
Fixed those.
- Heikki
Attachments:
[text/x-patch] v9-0001-Test-pg_stat_statements-across-crash-restart.patch (2.6K, 2-v9-0001-Test-pg_stat_statements-across-crash-restart.patch)
download | inline diff:
From 56ba0a729b0807eafe7018c7807553740bff957a Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 27 Mar 2026 01:45:16 +0200
Subject: [PATCH v9 01/16] Test pg_stat_statements across crash restart
Add 'pg_stat_statements' to the crash restart test, to test that
shared memory and LWLock initialization works across crash restart in
a library listed in shared_preload_libraries.
Reviewed-by: Ashutosh Bapat <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/test/recovery/t/013_crash_restart.pl | 33 +++++++++++++++++++++---
1 file changed, 29 insertions(+), 4 deletions(-)
diff --git a/src/test/recovery/t/013_crash_restart.pl b/src/test/recovery/t/013_crash_restart.pl
index 20d648ad6af..56afb1aa6eb 100644
--- a/src/test/recovery/t/013_crash_restart.pl
+++ b/src/test/recovery/t/013_crash_restart.pl
@@ -21,14 +21,32 @@ my $psql_timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
my $node = PostgreSQL::Test::Cluster->new('primary');
$node->init(allows_streaming => 1);
+
+# Enable pg_stat_statements to test restart of shared_preload_libraries.
+$node->append_conf(
+ 'postgresql.conf',
+ qq{shared_preload_libraries = 'pg_stat_statements'
+pg_stat_statements.max = 50000
+compute_query_id = 'regress'
+});
+
$node->start();
# by default PostgreSQL::Test::Cluster doesn't restart after a crash
$node->safe_psql(
- 'postgres',
- q[ALTER SYSTEM SET restart_after_crash = 1;
- ALTER SYSTEM SET log_connections = receipt;
- SELECT pg_reload_conf();]);
+ 'postgres', q[
+ ALTER SYSTEM SET restart_after_crash = 1;
+ ALTER SYSTEM SET log_connections = receipt;
+ SELECT pg_reload_conf();
+ ]);
+
+# Remember the time that pg_stat_statements was reset. We'll use it later to
+# verify that it gets re-initialized after crash.
+my $stats_reset = $node->safe_psql(
+ 'postgres', q[
+ CREATE EXTENSION pg_stat_statements;
+ SELECT stats_reset FROM pg_stat_statements_info;
+ ]);
# Run psql, keeping session alive, so we have an alive backend to kill.
my ($killme_stdin, $killme_stdout, $killme_stderr) = ('', '', '');
@@ -141,6 +159,13 @@ $killme->run();
($monitor_stdin, $monitor_stdout, $monitor_stderr) = ('', '', '');
$monitor->run();
+# Verify that pg_stat_statements, loaded via shared_preload_libraries,
+# was re-initialized at the crash.
+my $stats_reset_after = $node->safe_psql('postgres',
+ q[SELECT stats_reset FROM pg_stat_statements_info]);
+cmp_ok($stats_reset, 'ne', $stats_reset_after,
+ "pg_stat_statements was reset by restart");
+
# Acquire pid of new backend
$killme_stdin .= q[
--
2.47.3
[text/x-patch] v9-0002-refactor-Move-ShmemInitHash-to-separate-file.patch (9.1K, 3-v9-0002-refactor-Move-ShmemInitHash-to-separate-file.patch)
download | inline diff:
From 1c61f20ae4bf45d1f918819ffd2f92efedc39448 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 13:07:28 +0200
Subject: [PATCH v9 02/16] refactor: Move ShmemInitHash to separate file
In preparation for next commits
---
src/backend/storage/ipc/Makefile | 1 +
src/backend/storage/ipc/meson.build | 1 +
src/backend/storage/ipc/shmem.c | 77 ----------------------
src/backend/storage/ipc/shmem_hash.c | 98 ++++++++++++++++++++++++++++
src/include/storage/shmem.h | 1 +
5 files changed, 101 insertions(+), 77 deletions(-)
create mode 100644 src/backend/storage/ipc/shmem_hash.c
diff --git a/src/backend/storage/ipc/Makefile b/src/backend/storage/ipc/Makefile
index 9a07f6e1d92..f71653bbe48 100644
--- a/src/backend/storage/ipc/Makefile
+++ b/src/backend/storage/ipc/Makefile
@@ -22,6 +22,7 @@ OBJS = \
shm_mq.o \
shm_toc.o \
shmem.o \
+ shmem_hash.o \
signalfuncs.o \
sinval.o \
sinvaladt.o \
diff --git a/src/backend/storage/ipc/meson.build b/src/backend/storage/ipc/meson.build
index 9c1ca954d9d..b8c31e29967 100644
--- a/src/backend/storage/ipc/meson.build
+++ b/src/backend/storage/ipc/meson.build
@@ -14,6 +14,7 @@ backend_sources += files(
'shm_mq.c',
'shm_toc.c',
'shmem.c',
+ 'shmem_hash.c',
'signalfuncs.c',
'sinval.c',
'sinvaladt.c',
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 0e49debaaac..1485f1ff38f 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -100,7 +100,6 @@ typedef struct ShmemAllocatorData
#define ShmemIndexLock (&ShmemAllocator->index_lock)
-static void *ShmemHashAlloc(Size size, void *alloc_arg);
static void *ShmemAllocRaw(Size size, Size *allocated_size);
/* shared memory global variables */
@@ -235,15 +234,6 @@ ShmemAllocNoError(Size size)
return ShmemAllocRaw(size, &allocated_size);
}
-/* Alloc callback for shared memory hash tables */
-static void *
-ShmemHashAlloc(Size size, void *alloc_arg)
-{
- Size allocated_size;
-
- return ShmemAllocRaw(size, &allocated_size);
-}
-
/*
* ShmemAllocRaw -- allocate align chunk and return allocated size
*
@@ -305,73 +295,6 @@ ShmemAddrIsValid(const void *addr)
return (addr >= ShmemBase) && (addr < ShmemEnd);
}
-/*
- * ShmemInitHash -- Create and initialize, or attach to, a
- * shared memory hash table.
- *
- * We assume caller is doing some kind of synchronization
- * so that two processes don't try to create/initialize the same
- * table at once. (In practice, all creations are done in the postmaster
- * process; child processes should always be attaching to existing tables.)
- *
- * max_size is the estimated maximum number of hashtable entries. This is
- * not a hard limit, but the access efficiency will degrade if it is
- * exceeded substantially (since it's used to compute directory size and
- * the hash table buckets will get overfull).
- *
- * init_size is the number of hashtable entries to preallocate. For a table
- * whose maximum size is certain, this should be equal to max_size; that
- * ensures that no run-time out-of-shared-memory failures can occur.
- *
- * *infoP and hash_flags must specify at least the entry sizes and key
- * comparison semantics (see hash_create()). Flag bits and values specific
- * to shared-memory hash tables are added here, except that callers may
- * choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
- *
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
- */
-HTAB *
-ShmemInitHash(const char *name, /* table string name for shmem index */
- int64 init_size, /* initial table size */
- int64 max_size, /* max size of the table */
- HASHCTL *infoP, /* info about key and bucket size */
- int hash_flags) /* info about infoP */
-{
- bool found;
- void *location;
-
- /*
- * Hash tables allocated in shared memory have a fixed directory; it can't
- * grow or other backends wouldn't be able to find it. So, make sure we
- * make it big enough to start with.
- *
- * The shared memory allocator must be specified too.
- */
- infoP->dsize = infoP->max_dsize = hash_select_dirsize(max_size);
- infoP->alloc = ShmemHashAlloc;
- infoP->alloc_arg = NULL;
- hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
-
- /* look it up in the shmem index */
- location = ShmemInitStruct(name,
- hash_get_shared_size(infoP, hash_flags),
- &found);
-
- /*
- * if it already exists, attach to it rather than allocate and initialize
- * new space
- */
- if (found)
- hash_flags |= HASH_ATTACH;
-
- /* Pass location of hashtable header to hash_create */
- infoP->hctl = (HASHHDR *) location;
-
- return hash_create(name, init_size, infoP, hash_flags);
-}
-
/*
* ShmemInitStruct -- Create/attach to a structure in shared memory.
*
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
new file mode 100644
index 00000000000..47166ff301d
--- /dev/null
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -0,0 +1,98 @@
+/*-------------------------------------------------------------------------
+ *
+ * shmem_hash.c
+ * hash table implementation in shared memory
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * A shared memory hash table implementation on top of the named, fixed-size
+ * shared memory areas managed by shmem.c. Hash tables have a fixed maximum
+ * size, but their actual size can vary dynamically. When entries are added
+ * to the table, more space is allocated. Each shared data structure and hash
+ * has a string name to identify it.
+ *
+ * IDENTIFICATION
+ * src/backend/storage/ipc/shmem_hash.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "storage/shmem.h"
+
+
+/*
+ * ShmemInitHash -- Create and initialize, or attach to, a
+ * shared memory hash table.
+ *
+ * We assume caller is doing some kind of synchronization
+ * so that two processes don't try to create/initialize the same
+ * table at once. (In practice, all creations are done in the postmaster
+ * process; child processes should always be attaching to existing tables.)
+ *
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ *
+ * init_size is the number of hashtable entries to preallocate. For a table
+ * whose maximum size is certain, this should be equal to max_size; that
+ * ensures that no run-time out-of-shared-memory failures can occur.
+ *
+ * *infoP and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values specific
+ * to shared-memory hash tables are added here, except that callers may
+ * choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
+ *
+ * Note: before Postgres 9.0, this function returned NULL for some failure
+ * cases. Now, it always throws error instead, so callers need not check
+ * for NULL.
+ */
+HTAB *
+ShmemInitHash(const char *name, /* table string name for shmem index */
+ int64 init_size, /* initial table size */
+ int64 max_size, /* max size of the table */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ bool found;
+ void *location;
+
+ /*
+ * Hash tables allocated in shared memory have a fixed directory; it can't
+ * grow or other backends wouldn't be able to find it. So, make sure we
+ * make it big enough to start with.
+ *
+ * The shared memory allocator must be specified too.
+ */
+ infoP->dsize = infoP->max_dsize = hash_select_dirsize(max_size);
+ infoP->alloc = ShmemHashAlloc;
+ infoP->alloc_arg = NULL;
+ hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+
+ /* look it up in the shmem index */
+ location = ShmemInitStruct(name,
+ hash_get_shared_size(infoP, hash_flags),
+ &found);
+
+ /*
+ * if it already exists, attach to it rather than allocate and initialize
+ * new space
+ */
+ if (found)
+ hash_flags |= HASH_ATTACH;
+
+ /* Pass location of hashtable header to hash_create */
+ infoP->hctl = (HASHHDR *) location;
+
+ return hash_create(name, init_size, infoP, hash_flags);
+}
+
+/* Alloc callback for shared memory hash tables */
+void *
+ShmemHashAlloc(Size size, void *alloc_arg)
+{
+ return ShmemAllocNoError(size);
+}
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 2a9e9becd26..d26206131d7 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -31,6 +31,7 @@ typedef struct PGShmemHeader PGShmemHeader; /* avoid including
extern void InitShmemAllocator(PGShmemHeader *seghdr);
extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
+extern void *ShmemHashAlloc(Size size, void *alloc_arg);
extern bool ShmemAddrIsValid(const void *addr);
extern HTAB *ShmemInitHash(const char *name, int64 init_size, int64 max_size,
HASHCTL *infoP, int hash_flags);
--
2.47.3
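Reader's note on the patch above: the function it moves follows a "create or attach by name" pattern. The first caller allocates and initializes the area, and later callers are handed the existing one. The following is only a minimal self-contained sketch of that pattern, not PostgreSQL code. Every identifier in it (toy_init_struct, toy_init_hash, ToyTable, and so on) is hypothetical, and calloc() stands in for the real shared memory allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_AREAS 8

/* One entry of a toy "shmem index": maps a name to an allocated area. */
typedef struct ToyIndexEntry
{
	const char *name;			/* must outlive the index (literals here) */
	size_t		size;
	void	   *location;
} ToyIndexEntry;

static ToyIndexEntry toy_index[TOY_MAX_AREAS];
static int	toy_index_used = 0;

/*
 * Hypothetical stand-in for a create-or-attach lookup.  Sets *found to true
 * if the name was already registered; otherwise allocates a zeroed area.
 * Returns NULL if the index is full or the allocation fails.
 */
static void *
toy_init_struct(const char *name, size_t size, bool *found)
{
	void	   *location;

	for (int i = 0; i < toy_index_used; i++)
	{
		if (strcmp(toy_index[i].name, name) == 0)
		{
			assert(toy_index[i].size == size);
			*found = true;
			return toy_index[i].location;
		}
	}

	*found = false;
	if (toy_index_used == TOY_MAX_AREAS)
		return NULL;
	location = calloc(1, size);
	if (location == NULL)
		return NULL;

	toy_index[toy_index_used].name = name;
	toy_index[toy_index_used].size = size;
	toy_index[toy_index_used].location = location;
	toy_index_used++;
	return location;
}

/* Toy table header; a real hash table header holds far more state. */
typedef struct ToyTable
{
	long		max_size;
	int			nattached;
} ToyTable;

/* Create the table on first use, attach to it afterwards. */
static ToyTable *
toy_init_hash(const char *name, long max_size)
{
	bool		found;
	ToyTable   *table = toy_init_struct(name, sizeof(ToyTable), &found);

	if (table == NULL)
		return NULL;
	if (!found)
		table->max_size = max_size; /* only the creator initializes */
	table->nattached++;
	return table;
}
```

Calling toy_init_hash() twice with the same name returns the same pointer, and the second call leaves max_size as the creator set it. That is the role the found flag plays in the moved function, where it selects HASH_ATTACH.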
[text/x-patch] v9-0003-refactor-predicate.c-inline-SerialInit-to-the-cal.patch (3.6K, 4-v9-0003-refactor-predicate.c-inline-SerialInit-to-the-cal.patch)
From 49eefa0a209e326652ea02402d317c85b15ed52b Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 19 Mar 2026 17:21:30 +0200
Subject: [PATCH v9 03/16] refactor predicate.c: inline SerialInit to the
caller
The ShmemInit function is very complicated currently. These
refactorings move it in a direction that is more natural with the new
shmem callbacks.
---
src/backend/storage/lmgr/predicate.c | 73 +++++++++++-----------------
1 file changed, 29 insertions(+), 44 deletions(-)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index ae0e96aee5f..40950ee3a4f 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -444,7 +444,6 @@ static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
static int serial_errdetail_for_io_error(const void *opaque_data);
-static void SerialInit(void);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
@@ -809,48 +808,6 @@ SerialPagePrecedesLogicallyUnitTests(void)
}
#endif
-/*
- * Initialize for the tracking of old serializable committed xids.
- */
-static void
-SerialInit(void)
-{
- bool found;
-
- /*
- * Set up SLRU management of the pg_serial data.
- */
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
-#ifdef USE_ASSERT_CHECKING
- SerialPagePrecedesLogicallyUnitTests();
-#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
-
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
-
- Assert(found == IsUnderPostmaster);
- if (!found)
- {
- /*
- * Set control information to reflect empty SLRU.
- */
- LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
- serialControl->headPage = -1;
- serialControl->headXid = InvalidTransactionId;
- serialControl->tailXid = InvalidTransactionId;
- LWLockRelease(SerialControlLock);
- }
-}
-
/*
* GUC check_hook for serializable_buffers
*/
@@ -1358,7 +1315,35 @@ PredicateLockShmemInit(void)
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialInit();
+ SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
+ SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
+ SimpleLruInit(SerialSlruCtl, "serializable",
+ serializable_buffers, 0, "pg_serial",
+ LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
+ SYNC_HANDLER_NONE, false);
+#ifdef USE_ASSERT_CHECKING
+ SerialPagePrecedesLogicallyUnitTests();
+#endif
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
+
+ /*
+ * Create or attach to the SerialControl structure.
+ */
+ serialControl = (SerialControl)
+ ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
+
+ Assert(found == IsUnderPostmaster);
+ if (!found)
+ {
+ /*
+ * Set control information to reflect empty SLRU.
+ */
+ LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
+ serialControl->headPage = -1;
+ serialControl->headXid = InvalidTransactionId;
+ serialControl->tailXid = InvalidTransactionId;
+ LWLockRelease(SerialControlLock);
+ }
}
/*
--
2.47.3
[text/x-patch] v9-0004-refactor-predicate.c-Move-all-the-initialization-.patch (8.3K, 5-v9-0004-refactor-predicate.c-Move-all-the-initialization-.patch)
From 9f2ef13cca83e6671756bc9f21a416c1318acd1f Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 20 Mar 2026 20:27:50 +0200
Subject: [PATCH v9 04/16] refactor predicate.c: Move all the initialization
together
The ShmemInit function is very complicated currently. These
refactorings move it in a direction that is more natural with the new
shmem callbacks.
---
src/backend/storage/lmgr/predicate.c | 164 +++++++++++++--------------
1 file changed, 79 insertions(+), 85 deletions(-)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 40950ee3a4f..4f80fc73639 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1145,19 +1145,6 @@ PredicateLockShmemInit(void)
HASH_ELEM | HASH_BLOBS |
HASH_PARTITION | HASH_FIXED_SIZE);
- /*
- * Reserve a dummy entry in the hash table; we use it to make sure there's
- * always one entry available when we need to split or combine a page,
- * because running out of space there could mean aborting a
- * non-serializable transaction.
- */
- if (!IsUnderPostmaster)
- {
- (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
- HASH_ENTER, &found);
- Assert(!found);
- }
-
/* Pre-calculate the hash and partition lock of the scratch entry */
ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
@@ -1202,49 +1189,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, both the header and the element */
- memset(PredXact, 0, requestSize);
-
- dlist_init(&PredXact->availableList);
- dlist_init(&PredXact->activeList);
- PredXact->SxactGlobalXmin = InvalidTransactionId;
- PredXact->SxactGlobalXminCount = 0;
- PredXact->WritableSxactCount = 0;
- PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
- PredXact->CanPartialClearThrough = 0;
- PredXact->HavePartialClearedThrough = 0;
- PredXact->element
- = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_serializable_xacts; i++)
- {
- LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
- LWTRANCHE_PER_XACT_PREDICATE_LIST);
- dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
- }
- PredXact->OldCommittedSxact = CreatePredXact();
- SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
- PredXact->OldCommittedSxact->prepareSeqNo = 0;
- PredXact->OldCommittedSxact->commitSeqNo = 0;
- PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
- dlist_init(&PredXact->OldCommittedSxact->outConflicts);
- dlist_init(&PredXact->OldCommittedSxact->inConflicts);
- dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
- dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
- dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
- PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
- PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
- PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
- PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
- PredXact->OldCommittedSxact->pid = 0;
- PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
- }
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
/*
* Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
@@ -1281,23 +1225,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, including the elements */
- memset(RWConflictPool, 0, requestSize);
-
- dlist_init(&RWConflictPool->availableList);
- RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
- RWConflictPoolHeaderDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_rw_conflicts; i++)
- {
- dlist_push_tail(&RWConflictPool->availableList,
- &RWConflictPool->element[i].outLink);
- }
- }
/*
* Create or attach to the header for the list of finished serializable
@@ -1308,8 +1235,6 @@ PredicateLockShmemInit(void)
sizeof(dlist_head),
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- dlist_init(FinishedSerializableTransactions);
/*
* Initialize the SLRU storage for old committed serializable
@@ -1331,19 +1256,88 @@ PredicateLockShmemInit(void)
*/
serialControl = (SerialControl)
ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
-
Assert(found == IsUnderPostmaster);
- if (!found)
+
+ /*
+ * If we just attached to existing shared memory (EXEC_BACKEND), we're all
+ * done. Otherwise, during postmaster startup proceed to initialize the
+ * shared memory.
+ */
+ if (IsUnderPostmaster)
{
- /*
- * Set control information to reflect empty SLRU.
- */
- LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
- serialControl->headPage = -1;
- serialControl->headXid = InvalidTransactionId;
- serialControl->tailXid = InvalidTransactionId;
- LWLockRelease(SerialControlLock);
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+ return;
+ }
+
+ /*
+ * Reserve a dummy entry in the hash table; we use it to make sure there's
+ * always one entry available when we need to split or combine a page,
+ * because running out of space there could mean aborting a
+ * non-serializable transaction.
+ */
+ (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
+ HASH_ENTER, &found);
+ Assert(!found);
+
+ /* Initialize PredXact list */
+ dlist_init(&PredXact->availableList);
+ dlist_init(&PredXact->activeList);
+ PredXact->SxactGlobalXmin = InvalidTransactionId;
+ PredXact->SxactGlobalXminCount = 0;
+ PredXact->WritableSxactCount = 0;
+ PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
+ PredXact->CanPartialClearThrough = 0;
+ PredXact->HavePartialClearedThrough = 0;
+ PredXact->element
+ = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_serializable_xacts; i++)
+ {
+ LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
+ LWTRANCHE_PER_XACT_PREDICATE_LIST);
+ dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
}
+ PredXact->OldCommittedSxact = CreatePredXact();
+ SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
+ PredXact->OldCommittedSxact->prepareSeqNo = 0;
+ PredXact->OldCommittedSxact->commitSeqNo = 0;
+ PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
+ dlist_init(&PredXact->OldCommittedSxact->outConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->inConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
+ dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
+ dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
+ PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
+ PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
+ PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
+ PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
+ PredXact->OldCommittedSxact->pid = 0;
+ PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
+
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+
+ /* Initialize the rw-conflict pool */
+ dlist_init(&RWConflictPool->availableList);
+ RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
+ RWConflictPoolHeaderDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_rw_conflicts; i++)
+ {
+ dlist_push_tail(&RWConflictPool->availableList,
+ &RWConflictPool->element[i].outLink);
+ }
+
+ /* Initialize the list of finished serializable transactions */
+ dlist_init(FinishedSerializableTransactions);
+
+ /* Initialize SerialControl to reflect empty SLRU. */
+ LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
+ serialControl->headPage = -1;
+ serialControl->headXid = InvalidTransactionId;
+ serialControl->tailXid = InvalidTransactionId;
+ LWLockRelease(SerialControlLock);
}
/*
--
2.47.3
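Reader's note on the patch above: it reorders the function into three phases. It first locates every area, then returns early when it is merely attaching, and otherwise performs all one-time initialization in one place. The sketch below models that control flow with a toy free-list pool. It is an illustration under assumptions, not the predicate.c code: all names (ToyPool, toy_pool_shmem_init, toy_pool_take) are hypothetical, and a calloc()'d block stands in for the shared segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_POOL_SIZE 4

typedef struct ToyElem
{
	struct ToyElem *next;
} ToyElem;

typedef struct ToyPool
{
	ToyElem    *available;		/* head of the free list */
	int			navailable;
	ToyElem		element[TOY_POOL_SIZE];
} ToyPool;

static ToyPool *toy_segment = NULL; /* stands in for the shared segment */
static ToyPool *ToyPoolPtr = NULL;	/* this "process's" pointer into it */

static void
toy_pool_shmem_init(bool attaching)
{
	/* Phase 1: locate (or, the first time, allocate) the area. */
	if (toy_segment == NULL)
	{
		assert(!attaching);
		toy_segment = calloc(1, sizeof(ToyPool));
		if (toy_segment == NULL)
			abort();
	}
	ToyPoolPtr = toy_segment;

	/* Phase 2: an attaching process must not touch the contents. */
	if (attaching)
		return;

	/* Phase 3: one-time initialization, all in one place. */
	ToyPoolPtr->available = NULL;
	ToyPoolPtr->navailable = 0;
	for (int i = 0; i < TOY_POOL_SIZE; i++)
	{
		ToyPoolPtr->element[i].next = ToyPoolPtr->available;
		ToyPoolPtr->available = &ToyPoolPtr->element[i];
		ToyPoolPtr->navailable++;
	}
}

/* Pop one element off the free list, or return NULL if it is empty. */
static ToyElem *
toy_pool_take(void)
{
	ToyElem    *elem = ToyPoolPtr->available;

	if (elem != NULL)
	{
		ToyPoolPtr->available = elem->next;
		ToyPoolPtr->navailable--;
	}
	return elem;
}
```

The design point is that a later call with attaching set to true sets the local pointer but leaves the pool as the earlier users left it, so an element taken before the attach stays taken.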
[text/x-patch] v9-0005-Introduce-a-new-mechanism-for-registering-shared-.patch (60.1K, 6-v9-0005-Introduce-a-new-mechanism-for-registering-shared-.patch)
From ba96196891beefe75842f62c397628c179d3811c Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 20:01:39 +0300
Subject: [PATCH v9 05/16] Introduce a new mechanism for registering shared
memory areas
Each shared memory area is registered with a "descriptor struct" that
contains parameters like name and size of the area. The descriptor
struct makes it easier to add optional fields in the future; the
additional fields can just be left as zeros.
This merges the separate [Subsystem]ShmemSize() and
[Subsystem]ShmemInit() phases at postmaster startup. Each subsystem is
now called into just once, before the shared memory segment has been
allocated, to register the subsystem's shared memory needs. The
registration includes the size, which replaces the
[Subsystem]ShmemSize() calls, and a pointer to an initialization
callback function, which replaces the [Subsystem]ShmemInit()
calls. This is more ergonomic, as you only need to calculate the size
once, when you register the struct.
This replaces ShmemInitStruct() and ShmemInitHash(), which become just
backwards-compatibility wrappers around the new functions. In future
commits, I plan to replace all ShmemInitStruct() and ShmemInitHash()
calls with the new functions, although we'll still need to keep them
around for extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 4 +-
doc/src/sgml/xfunc.sgml | 162 +++--
src/backend/bootstrap/bootstrap.c | 1 +
src/backend/postmaster/launch_backend.c | 4 +
src/backend/postmaster/postmaster.c | 18 +-
src/backend/storage/ipc/ipci.c | 29 +-
src/backend/storage/ipc/shmem.c | 802 ++++++++++++++++++++----
src/backend/storage/ipc/shmem_hash.c | 82 ++-
src/backend/storage/lmgr/proc.c | 3 +
src/backend/tcop/postgres.c | 9 +-
src/include/storage/shmem.h | 224 ++++++-
src/test/modules/test_aio/test_aio.c | 1 -
src/tools/pgindent/typedefs.list | 9 +-
13 files changed, 1154 insertions(+), 194 deletions(-)
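Reader's note: the flow described in the commit message above (register sizes first, allocate once, then run initialization callbacks) can be modeled in a few lines. This is a toy sketch under stated assumptions and not the API added by this patch. ToyRequest, toy_request_struct, toy_requested_size and toy_create_segment are hypothetical names, calloc() stands in for the shared memory segment, and the 16-byte alignment is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_REQUESTS 8
#define TOY_ALIGN 16

/* Descriptor for one requested area; unset optional fields stay zero. */
typedef struct ToyRequest
{
	const char *name;
	size_t		size;
	void	  **ptr;			/* where to store the area's address */
	void		(*init_fn) (void *area);	/* optional */
} ToyRequest;

static ToyRequest toy_requests[TOY_MAX_REQUESTS];
static int	toy_nrequests = 0;

static size_t
toy_align(size_t size)
{
	return (size + TOY_ALIGN - 1) & ~(size_t) (TOY_ALIGN - 1);
}

/* Phase 1: record the request; nothing is allocated yet. */
static void
toy_request_struct(ToyRequest req)
{
	assert(toy_nrequests < TOY_MAX_REQUESTS);
	toy_requests[toy_nrequests++] = req;
}

/* Total size is derived from the registrations, not computed separately. */
static size_t
toy_requested_size(void)
{
	size_t		total = 0;

	for (int i = 0; i < toy_nrequests; i++)
		total += toy_align(toy_requests[i].size);
	return total;
}

/* Phase 2: allocate one zeroed segment, hand out areas, run init_fns. */
static char *
toy_create_segment(void)
{
	size_t		total = toy_requested_size();
	size_t		offset = 0;
	char	   *segment = calloc(1, total > 0 ? total : 1);

	if (segment == NULL)
		return NULL;
	for (int i = 0; i < toy_nrequests; i++)
	{
		void	   *area = segment + offset;

		*toy_requests[i].ptr = area;
		if (toy_requests[i].init_fn != NULL)
			toy_requests[i].init_fn(area);
		offset += toy_align(toy_requests[i].size);
	}
	return segment;
}

/* Example subsystem: one counter that needs non-zero initialization. */
typedef struct ToyCounter
{
	int			value;
} ToyCounter;

static void *toy_counter_area;

static ToyCounter *
toy_counter(void)
{
	return (ToyCounter *) toy_counter_area;
}

static void
toy_counter_init(void *area)
{
	((ToyCounter *) area)->value = 42;
}

static void
toy_counter_request(void)
{
	toy_request_struct((ToyRequest) {
		.name = "toy counter",
		.size = sizeof(ToyCounter),
		.ptr = &toy_counter_area,
		.init_fn = toy_counter_init,
	});
}
```

In this model a subsystem states its size only once, in toy_counter_request(); the segment size and the pointer assignment both follow from that single registration, which is the ergonomic benefit the commit message describes.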
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9ee1a2bfc6a..2ebec6928d5 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
Anonymous allocations are allocations that have been made
with <literal>ShmemAlloc()</literal> directly, rather than via
- <literal>ShmemInitStruct()</literal> or
- <literal>ShmemInitHash()</literal>.
+ <literal>ShmemRequestStruct()</literal> or
+ <literal>ShmemRequestHash()</literal>.
</para>
<para>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 70e815b8a2c..aed3f2f0071 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3628,71 +3628,132 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
Add-ins can reserve shared memory on server startup. To do so, the
add-in's shared library must be preloaded by specifying it in
<xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
- The shared library should also register a
- <literal>shmem_request_hook</literal> in its
- <function>_PG_init</function> function. This
- <literal>shmem_request_hook</literal> can reserve shared memory by
- calling:
+ The shared library should register callbacks in
+ its <function>_PG_init</function> function, which then get called at the
+ right stages of the system startup to initialize the shared memory.
+ Here is an example:
<programlisting>
-void RequestAddinShmemSpace(Size size)
-</programlisting>
- Each backend should obtain a pointer to the reserved shared memory by
- calling:
-<programlisting>
-void *ShmemInitStruct(const char *name, Size size, bool *foundPtr)
-</programlisting>
- If this function sets <literal>foundPtr</literal> to
- <literal>false</literal>, the caller should proceed to initialize the
- contents of the reserved shared memory. If <literal>foundPtr</literal>
- is set to <literal>true</literal>, the shared memory was already
- initialized by another backend, and the caller need not initialize
- further.
- </para>
+typedef struct MyShmemData {
+ LWLock lock; /* protects the fields below */
- <para>
- To avoid race conditions, each backend should use the LWLock
- <function>AddinShmemInitLock</function> when initializing its allocation
- of shared memory, as shown here:
-<programlisting>
-static mystruct *ptr = NULL;
-bool found;
+ ... shared memory contents ...
+} MyShmemData;
+
+static MyShmemData *MyShmem; /* pointer to the struct in shared memory */
+
+static void my_shmem_request(void *arg);
+static void my_shmem_init(void *arg);
+
+const ShmemCallbacks my_shmem_callbacks = {
+ .request_fn = my_shmem_request,
+ .init_fn = my_shmem_init,
+};
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+ /*
+ * In order to create our shared memory area, we have to be loaded via
+ * shared_preload_libraries.
+ */
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ /* Register our shared memory needs */
+ RegisterShmemCallbacks(&my_shmem_callbacks);
+}
+
+/* callback to register our shared memory needs */
+static void
+my_shmem_request(void *arg)
+{
+ /* A persistent handle to the shared memory area in this backend */
+ static ShmemStructDesc MyShmemDesc;
+
+ ShmemRequestStruct(&MyShmemDesc,
+ .name = "My shmem area",
+ .size = sizeof(MyShmemData),
+ .ptr = (void **) &MyShmem,
+ );
+}
-LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-ptr = ShmemInitStruct("my struct name", size, &found);
-if (!found)
+/* callback to initialize the contents of the MyShmem area at startup */
+static void
+my_shmem_init(void *arg)
{
- ... initialize contents of shared memory ...
- ptr->locks = GetNamedLWLockTranche("my tranche name");
+ int tranche_id;
+
+ /* Initialize the lock */
+ tranche_id = LWLockNewTrancheId("my tranche name");
+ LWLockInitialize(&MyShmem->lock, tranche_id);
+
+ ... initialize the rest of MyShmem fields ...
}
-LWLockRelease(AddinShmemInitLock);
+
</programlisting>
- <literal>shmem_startup_hook</literal> provides a convenient place for the
- initialization code, but it is not strictly required that all such code
- be placed in this hook. On Windows (and anywhere else where
- <literal>EXEC_BACKEND</literal> is defined), each backend executes the
- registered <literal>shmem_startup_hook</literal> shortly after it
- attaches to shared memory, so add-ins should still acquire
- <function>AddinShmemInitLock</function> within this hook, as shown in the
- example above. On other platforms, only the postmaster process executes
- the <literal>shmem_startup_hook</literal>, and each backend automatically
- inherits the pointers to shared memory.
+ The <function>request_fn</function> callback is called during system
+ startup, before the shared memory has been allocated. It should call
+ <function>ShmemRequestStruct()</function> to register the add-in's
+ shared memory needs. Note that <function>ShmemRequestStruct()</function>
+ doesn't immediately allocate or initialize the memory; it merely
+ registers the space to be allocated later in the startup sequence. When
+ the memory is allocated, it is initialized to zero. For any more
+ complex initialization, set the <function>init_fn()</function> callback,
+ which will be called after the memory has been allocated and initialized
+ to zero, but before any other processes are running, and thus no locking
+ is required.
</para>
-
<para>
- An example of a <literal>shmem_request_hook</literal> and
- <literal>shmem_startup_hook</literal> can be found in
+ On Windows, the <function>attach_fn</function> callback, if any, is
+ additionally called at every backend startup. It can be used to
+ initialize additional per-backend state related to the shared memory
+ area that is inherited via <function>fork()</function> on other systems.
+ </para>
+ <para>
+ An example of allocating shared memory can be found in
<filename>contrib/pg_stat_statements/pg_stat_statements.c</filename> in
the <productname>PostgreSQL</productname> source tree.
</para>
</sect3>
<sect3 id="xfunc-shared-addin-after-startup">
- <title>Requesting Shared Memory After Startup</title>
+ <title>Requesting Shared Memory After Startup with <function>ShmemRequestStruct</function></title>
+
+ <para>
+ <function>ShmemRequestStruct()</function> can also be called after
+ system startup, which is useful to allow small allocations in add-in
+ libraries that are not specified in
+ <xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
+ However, after startup the allocation can fail if there is not enough
+ shared memory available. The system reserves some memory for allocations
+ after startup, but that reservation is small.
+ </para>
+ <para>
+ By default, <function>RegisterShmemCallbacks()</function> fails with an
+ error if called after system startup. To use it after startup, you must
+ set the <literal>SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP</literal> flag in
+ the argument <structname>ShmemCallbacks</structname> struct to
+ acknowledge the risk.
+ </para>
+ <para>
+ When <function>RegisterShmemCallbacks()</function> is called after
+ startup, it will immediately call the appropriate callbacks, depending
+ on whether the requested memory areas were already initialized by
+ another backend. The callbacks will be called while holding an internal
+ lock, which prevents two backends from initializing the
+ memory area concurrently.
+ </para>
+ </sect3>
+
+ <sect3 id="xfunc-shared-addin-dynamic">
+ <title>Allocating Dynamic Shared Memory After Startup</title>
<para>
There is another, more flexible method of reserving shared memory that
- can be done after server startup and outside a
- <literal>shmem_request_hook</literal>. To do so, each backend that will
+ can be done after server startup. To do so, each backend that will
use the shared memory should obtain a pointer to it by calling:
<programlisting>
void *GetNamedDSMSegment(const char *name, size_t size,
@@ -3711,10 +3772,7 @@ void *GetNamedDSMSegment(const char *name, size_t size,
</para>
<para>
- Unlike shared memory reserved at server startup, there is no need to
- acquire <function>AddinShmemInitLock</function> or otherwise take action
- to avoid race conditions when reserving shared memory with
- <function>GetNamedDSMSegment</function>. This function ensures that only
+ <function>GetNamedDSMSegment</function> ensures that only
one backend allocates and initializes the segment and that all other
backends receive a pointer to the fully allocated and initialized
segment.
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 38ef683d4c7..ca75afe8cc7 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -372,6 +372,7 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
InitializeFastPathLocks();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 434e0643022..75423104be8 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -49,6 +49,7 @@
#include "replication/walreceiver.h"
#include "storage/dsm.h"
#include "storage/io_worker.h"
+#include "storage/ipc.h"
#include "storage/pg_shmem.h"
#include "tcop/backend_startup.h"
#include "utils/memutils.h"
@@ -672,7 +673,10 @@ SubPostmasterMain(int argc, char *argv[])
/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ ShmemCallRequestCallbacks();
+ }
/*
* Run the appropriate Main function
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index abf0c97569e..91e1657bcf4 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -951,7 +951,14 @@ PostmasterMain(int argc, char *argv[])
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Ask all subsystems, including preloaded libraries, to register their
+ * shared memory needs.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might've been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
@@ -3227,7 +3234,14 @@ PostmasterStateMachine(void)
/* re-read control file into local memory */
LocalProcessControlFile(true);
- /* re-create shared memory and semaphores */
+ /*
+ * Re-initialize shared memory and semaphores. Note: We don't call
+ * RegisterBuiltinShmemCallbacks(), we keep the old registrations. In
+ * order to re-register structs in extensions, we'd need to reload
+ * shared preload libraries, and we don't want to do that.
+ */
+ ResetShmemAllocator();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
UpdatePMState(PM_STARTUP);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index d692d419846..34acaeeefe0 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -99,8 +99,9 @@ CalculateShmemSize(void)
* during the actual allocation phase.
*/
size = 100000;
- size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
- sizeof(ShmemIndexEnt)));
+ size = add_size(size, ShmemGetRequestedSize());
+
+ /* legacy subsystems */
size = add_size(size, dsm_estimate_size());
size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
@@ -174,6 +175,13 @@ AttachSharedMemoryStructs(void)
*/
InitializeFastPathLocks();
+ /*
+ * Attach to LWLocks first. They are needed by most other subsystems.
+ */
+ LWLockShmemInit();
+
+ /* Establish pointers to all shared memory areas in this backend */
+ ShmemAttachRequested();
CreateOrAttachShmemStructs();
/*
@@ -218,7 +226,17 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
- /* Initialize subsystems */
+ /*
+ * Initialize LWLocks first, in case any of the shmem init functions use
+ * LWLocks. (Nothing else can be running during startup, so they don't
+ * need to do any locking yet, but we nevertheless allow it.)
+ */
+ LWLockShmemInit();
+
+ /* Initialize all shmem areas */
+ ShmemInitRequested();
+
+ /* Initialize legacy subsystems */
CreateOrAttachShmemStructs();
/* Initialize dynamic shared memory facilities. */
@@ -249,11 +267,6 @@ CreateSharedMemoryAndSemaphores(void)
static void
CreateOrAttachShmemStructs(void)
{
- /*
- * Set up LWLocks. They are needed by most other subsystems.
- */
- LWLockShmemInit();
-
dsm_shmem_init();
DSMRegistryShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 1485f1ff38f..12d06299dc8 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,48 +19,115 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * NOTES:
- * (a) There are three kinds of shared memory data structures
- * available to POSTGRES: fixed-size structures, queues and hash
- * tables. Fixed-size structures contain things like global variables
- * for a module and should never be allocated after the shared memory
- * initialization phase. Hash tables have a fixed maximum size, but
- * their actual size can vary dynamically. When entries are added
- * to the table, more space is allocated. Queues link data structures
- * that have been allocated either within fixed-size structures or as hash
- * buckets. Each shared data structure has a string name to identify
- * it (assigned in the module that declares it).
- *
- * (b) During initialization, each module looks for its
- * shared data structures in a hash table called the "Shmem Index".
- * If the data structure is not present, the caller can allocate
- * a new one and initialize it. If the data structure is present,
- * the caller "attaches" to the structure by initializing a pointer
- * in the local address space.
- * The shmem index has two purposes: first, it gives us
- * a simple model of how the world looks when a backend process
- * initializes. If something is present in the shmem index,
- * it is initialized. If it is not, it is uninitialized. Second,
- * the shmem index allows us to allocate shared memory on demand
- * instead of trying to preallocate structures and hard-wire the
- * sizes and locations in header files. If you are using a lot
- * of shared memory in a lot of different places (and changing
- * things during development), this is important.
- *
- * (c) In standard Unix-ish environments, individual backends do not
- * need to re-establish their local pointers into shared memory, because
- * they inherit correct values of those variables via fork() from the
- * postmaster. However, this does not work in the EXEC_BACKEND case.
- * In ports using EXEC_BACKEND, new backends have to set up their local
- * pointers using the method described in (b) above.
- *
- * (d) memory allocation model: shared memory can never be
- * freed, once allocated. Each hash table has its own free list,
- * so hash buckets can be reused when an item is deleted. However,
- * if one hash table grows very large and then shrinks, its space
- * cannot be redistributed to other tables. We could build a simple
- * hash bucket garbage collector if need be. Right now, it seems
- * unnecessary.
+ * This module provides facilities to allocate fixed-size structures in shared
+ * memory, for things like variables shared between all backend processes.
+ * Each such structure has a string name to identify it, specified when it is
+ * requested. shmem_hash.c provides a shared hash table implementation on top
+ * of that.
+ *
+ * Shared memory areas should usually not be allocated after postmaster
+ * startup, although we do allow small allocations later for the benefit of
+ * extension modules that are loaded after startup. Despite that allowance,
+ * extensions that need shared memory should be added in
+ * shared_preload_libraries, because the allowance is quite small and there is
+ * no guarantee that any memory is available after startup.
+ *
+ * Nowadays, there is also a third way to allocate shared memory called
+ * Dynamic Shared Memory. See dsm.c for that facility. One big difference
+ * between traditional shared memory handled by shmem.c and dynamic shared
+ * memory is that traditional shared memory areas are mapped to the same
+ * address in all processes, so you can use normal pointers in shared memory
+ * structs. With Dynamic Shared Memory, you must use offsets or DSA pointers
+ * instead.
+ *
+ * Shared memory managed by shmem.c can never be freed, once allocated. Each
+ * hash table has its own free list, so hash buckets can be reused when an
+ * item is deleted. However, if one hash table grows very large and then
+ * shrinks, its space cannot be redistributed to other tables. We could build
+ * a simple hash bucket garbage collector if need be. Right now, it seems
+ * unnecessary.
+ *
+ * Usage
+ * -----
+ *
+ * To allocate shared memory, you need to register a set of callback functions
+ * which handle the lifecycle of the allocation. In the request_fn callback,
+ * fill in a ShmemRequestStructOpts struct with the name, size, and any other
+ * options, and call ShmemRequestStruct(). Leave any unused fields as zeros.
+ *
+ * typedef struct MyShmemData {
+ * ...
+ * } MyShmemData;
+ *
+ * static MyShmemData *MyShmem;
+ *
+ * static void my_shmem_request(void *arg);
+ * static void my_shmem_init(void *arg);
+ *
+ * const ShmemCallbacks MyShmemCallbacks = {
+ * .request_fn = my_shmem_request,
+ * .init_fn = my_shmem_init,
+ * };
+ *
+ * static void
+ * my_shmem_request(void *arg)
+ * {
+ * static ShmemStructDesc MyShmemDesc;
+ *
+ * ShmemRequestStruct(&MyShmemDesc, &(ShmemRequestStructOpts) {
+ * .name = "My shmem area",
+ * .size = sizeof(MyShmemData),
+ * .ptr = (void **) &MyShmem,
+ * });
+ * }
+ *
+ * In builtin PostgreSQL code, add the callbacks to the list in
+ * src/include/storage/subsystemlist.h. In an add-in module, you can register
+ * the callbacks by calling RegisterShmemCallbacks(&MyShmemCallbacks) in the
+ * extension's _PG_init() function.
+ *
+ * Lifecycle
+ * ---------
+ *
+ * Initializing shared memory happens in multiple phases. In the first phase,
+ * during postmaster startup, all the request_fn callbacks are called. Only
+ * after all the request_fn callbacks have been called and all the shmem areas
+ * have been requested by the ShmemRequestStruct() calls do we know how much
+ * shared memory we need in total. After that, the postmaster allocates the
+ * global shared memory segment and calls all the init_fn callbacks to
+ * initialize all the requested shmem areas.
+ *
+ * In standard Unix-ish environments, individual backends do not need to
+ * re-establish their local pointers into shared memory, because they inherit
+ * correct values of those variables via fork() from the postmaster. However,
+ * this does not work in the EXEC_BACKEND case. In ports using EXEC_BACKEND,
+ * backend startup also calls the shmem_request callbacks to re-establish the
+ * knowledge about each shared memory area, sets the pointer variables
+ * (*ShmemStructDesc->ptr), and calls the attach_fn callback, if any, for
+ * additional per-backend setup.
+ *
+ * Legacy ShmemInitStruct()/ShmemInitHash() functions
+ * --------------------------------------------------
+ *
+ * ShmemInitStruct()/ShmemInitHash() is another way of registering shmem
+ * areas. It pre-dates the ShmemRequestStruct()/ShmemRequestHash() functions,
+ * and should not be used in new code, but as of this writing it is still
+ * widely used in extensions.
+ *
+ * To allocate a shmem area with ShmemInitStruct(), you need to separately
+ * register the size needed for the area by calling RequestAddinShmemSpace()
+ * from the extension's shmem_request_hook, and allocate the area by calling
+ * ShmemInitStruct() from the extension's shmem_startup_hook. There are no
+ * init/attach callbacks. Instead, the caller of ShmemInitStruct() must check
+ * the return status of ShmemInitStruct() and initialize the struct if it was
+ * not previously initialized.
+ *
+ * Calling ShmemAlloc() directly
+ * -----------------------------
+ *
+ * There's a more low-level way of allocating shared memory too: you can call
+ * ShmemAlloc() directly. It's used to implement the higher level mechanisms,
+ * and should generally not be called directly.
*/
#include "postgres.h"
@@ -79,6 +146,76 @@
#include "utils/builtins.h"
#include "utils/tuplestore.h"
+/*
+ * Registered callbacks.
+ *
+ * During postmaster startup, we accumulate the callbacks from all subsystems
+ * in this list.
+ *
+ * This is in process private memory, although on Unix-like systems, we expect
+ * all the registrations to happen at postmaster startup time and be inherited
+ * by all the child processes via fork().
+ */
+static List *registered_shmem_callbacks;
+
+/*
+ * In the shmem request phase, all the shmem areas requested with the
+ * ShmemRequest*() functions are accumulated here.
+ */
+typedef struct
+{
+ ShmemStructDesc *desc;
+ ShmemStructOpts *options;
+ ShmemAreaKind kind;
+} ShmemRequest;
+
+static List *pending_shmem_requests;
+
+/*
+ * Per-process state machine, for sanity checking that we do things in the
+ * right order.
+ *
+ * Postmaster:
+ * INITIAL -> REQUESTING -> INITIALIZING -> DONE
+ *
+ * Backends in EXEC_BACKEND mode:
+ * INITIAL -> REQUESTING -> ATTACHING -> DONE
+ *
+ * Late request:
+ * DONE -> REQUESTING -> AFTER_STARTUP_ATTACH_OR_INIT -> DONE
+ */
+enum shmem_request_state
+{
+ /* Initial state */
+ SRS_INITIAL,
+
+ /*
+ * When we start calling the shmem_request callbacks, we enter the
+ * SRS_REQUESTING phase. All ShmemRequestStruct calls happen in this
+ * state.
+ */
+ SRS_REQUESTING,
+
+ /*
+ * Postmaster has finished all shmem requests, and is now initializing the
+ * shared memory segment. init_fn callbacks are called in this state.
+ */
+ SRS_INITIALIZING,
+
+ /*
+ * A postmaster child process is starting up. attach_fn callbacks are
+ * called in this state.
+ */
+ SRS_ATTACHING,
+
+ /* An after-startup allocation or attachment is in progress. */
+ SRS_AFTER_STARTUP_ATTACH_OR_INIT,
+
+ /* Normal state after shmem initialization / attachment */
+ SRS_DONE,
+};
+static enum shmem_request_state shmem_request_state = SRS_INITIAL;
+
/*
* This is the first data structure stored in the shared memory segment, at
* the offset that PGShmemHeader->content_offset points to. Allocations by
@@ -109,25 +246,394 @@ static void *ShmemBase; /* start address of shared memory */
static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
-static HTAB *ShmemIndex = NULL; /* primary index hashtable for shmem */
+
+/*
+ * ShmemIndex is a global directory of shmem areas, itself also stored in the
+ * shared memory.
+ */
+static HTAB *ShmemIndex;
+
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE (48)
+
+/*
+ * # of additional entries to reserve in the shmem index table, for
+ * allocations after postmaster startup. (This is not a hard limit, the hash
+ * table can grow larger than that if there is shared memory available)
+ */
+#define SHMEM_INDEX_ADDITIONAL_SIZE (64)
+
+/* this is a hash bucket in the shmem index table */
+typedef struct
+{
+ char key[SHMEM_INDEX_KEYSIZE]; /* string name */
+ void *location; /* location in shared mem */
+ Size size; /* # bytes requested for the structure */
+ Size allocated_size; /* # bytes actually allocated */
+} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
+static bool AttachOrInitShmemIndexEntry(ShmemRequest *request,
+ bool may_init, bool may_attach);
+
Datum pg_numa_available(PG_FUNCTION_ARGS);
+/*
+ * ShmemRequestStruct() --- request a named shared memory area
+ *
+ * Subsystems call this to register their shared memory needs. This is
+ * usually done early in postmaster startup, before the shared memory segment
+ * has been created, so that the size can be included in the estimate for
+ * total amount of shared memory needed. We set aside a small amount of
+ * memory for allocations that happen later, for the benefit of non-preloaded
+ * extensions, but that should not be relied upon.
+ *
+ * This does not yet allocate the memory, but merely registers the need for it.
+ * The actual allocation happens later in the postmaster startup sequence.
+ *
+ * This must be called from a shmem_request callback function, registered with
+ * RegisterShmemCallbacks(). This enforces a coding pattern that works the
+ * same in normal Unix systems and with EXEC_BACKEND. On Unix systems, the
+ * shmem_request callbacks are called once, early in postmaster startup, and
+ * the child processes inherit the struct descriptors and any other
+ * per-process state from the postmaster. In EXEC_BACKEND mode, shmem_request
+ * callbacks are *also* called in each backend, at backend startup, to
+ * re-establish the struct descriptors. By calling the same function in both
+ * cases, we ensure that all the shmem areas are registered the same way in
+ * all processes.
+ *
+ * 'desc' is a backend-private handle for the shared memory area.
+ *
+ * 'options' defines the name and size of the area, and any other optional
+ * features. Leave unused options as zeros. The options are copied to
+ * longer-lived memory, so 'options' doesn't need to outlive the
+ * ShmemRequestStruct() call and can point to a local variable in the calling
+ * function. The 'name' must point to a long-lived string though; only the
+ * pointer to it is copied.
+ */
+void
+ShmemRequestStructWithOpts(ShmemStructDesc *desc, const ShmemStructOpts *options)
+{
+ ShmemStructOpts *options_copy;
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemStructOpts));
+ memcpy(options_copy, options, sizeof(ShmemStructOpts));
+
+ ShmemRequestInternal(desc, options_copy, SHMEM_KIND_STRUCT);
+}
+
+/*
+ * Internal workhorse of ShmemRequestStruct() and ShmemRequestHash().
+ *
+ * Note: 'desc' and 'options' must live until the init/attach callbacks have
+ * been called. Unlike in the public ShmemRequestStruct() and
+ * ShmemRequestHash() functions, 'options' is *not* copied. This allows
+ * ShmemRequestHash() to pass a pointer to the extended ShmemRequestHashOpts
+ * struct instead.
+ */
+void
+ShmemRequestInternal(ShmemStructDesc *desc, ShmemStructOpts *options,
+ ShmemAreaKind kind)
+{
+ ShmemRequest *request;
+
+ if (options->name == NULL)
+ elog(ERROR, "shared memory request is missing 'name' option");
+
+ if (IsUnderPostmaster)
+ {
+ if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+ else
+ {
+ if (options->size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->size <= 0)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+
+ if (shmem_request_state != SRS_REQUESTING)
+ elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
+
+ /* Check that it's not already registered in this process */
+ foreach_ptr(ShmemRequest, existing, pending_shmem_requests)
+ {
+ if (strcmp(existing->options->name, options->name) == 0)
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" is already registered",
+ options->name)));
+ }
+
+ request = palloc(sizeof(ShmemRequest));
+ request->options = options;
+ request->desc = desc;
+ request->kind = kind;
+ pending_shmem_requests = lappend(pending_shmem_requests, request);
+}
+
+/*
+ * ShmemGetRequestedSize() --- estimate the total size of all registered shared
+ * memory structures.
+ *
+ * This is called once at postmaster startup, before the shared memory segment
+ * has been created.
+ */
+size_t
+ShmemGetRequestedSize(void)
+{
+ size_t size;
+
+ /* memory needed for the ShmemIndex */
+ size = hash_estimate_size(list_length(pending_shmem_requests) + SHMEM_INDEX_ADDITIONAL_SIZE,
+ sizeof(ShmemIndexEnt));
+
+ /* memory needed for all the requested areas */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ size = add_size(size, request->options->size);
+ size = add_size(size, request->options->extra_size);
+ }
+
+ return size;
+}
+
+/*
+ * ShmemInitRequested() --- allocate and initialize requested shared memory
+ * structures.
+ *
+ * This is called once at postmaster startup, after the shared memory segment
+ * has been created.
+ */
+void
+ShmemInitRequested(void)
+{
+ /* Should be called only by the postmaster or a standalone backend. */
+ Assert(!IsUnderPostmaster);
+ Assert(shmem_request_state == SRS_INITIALIZING);
+
+ /*
+ * Initialize the ShmemIndex entries and perform basic initialization of
+ * all the requested memory areas. There are no concurrent processes yet,
+ * so no need for locking.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ AttachOrInitShmemIndexEntry(request, true, false);
+ }
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
+
+ /*
+ * Call the subsystem-specific init callbacks to finish initialization of
+ * all the areas.
+ */
+ foreach_ptr(const ShmemCallbacks, callbacks, registered_shmem_callbacks)
+ {
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->init_fn_arg);
+ }
+
+ shmem_request_state = SRS_DONE;
+}
+
+/*
+ * Re-establish process private state related to shmem areas.
+ *
+ * This is called at backend startup in EXEC_BACKEND mode, in every backend.
+ */
+#ifdef EXEC_BACKEND
+void
+ShmemAttachRequested(void)
+{
+ ListCell *lc;
+
+ /* Must be initializing a (non-standalone) backend */
+ Assert(IsUnderPostmaster);
+ Assert(ShmemAllocator->index != NULL);
+ Assert(shmem_request_state == SRS_REQUESTING);
+ shmem_request_state = SRS_ATTACHING;
+
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+
+ /*
+ * Attach to all the requested memory areas.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ AttachOrInitShmemIndexEntry(request, false, true);
+ }
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
+
+ /* Call attach callbacks */
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
+
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->attach_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+
+ shmem_request_state = SRS_DONE;
+}
+#endif
+
+/*
+ * Workhorse to insert or look up a named shmem area in the shared memory
+ * index, and initialize or attach to it.
+ *
+ * Note that this only does the basic initialization depending on ShmemAreaKind,
+ * like setting the global pointer variable to the area for SHMEM_KIND_STRUCT
+ * or setting up the backend-private HTAB control struct. This does *not*
+ * call the callbacks specific to the subsystem that requested it. That's
+ * done later after all the shmem areas have been initialized or attached to.
+ *
+ * may_init == true && may_attach == false is used at postmaster startup to
+ * allocate all the areas. An error is thrown if the area already exists.
+ *
+ * may_init == false && may_attach == true is used at backend startup in
+ * EXEC_BACKEND mode to attach to all the areas. The area is expected to
+ * already be initialized, an error is thrown if not.
+ *
+ * may_init == true && may_attach == true is used when a shared memory area is
+ * requested after startup, with SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP.
+ *
+ * (may_init == false && may_attach == false is not used, it would always
+ * raise an error)
+ */
+static bool
+AttachOrInitShmemIndexEntry(ShmemRequest *request,
+ bool may_init, bool may_attach)
+{
+ /*
+ * If called after postmaster startup, we need to immediately also
+ * initialize or attach to the area.
+ */
+ ShmemStructDesc *desc = request->desc;
+ ShmemIndexEnt *index_entry;
+ bool found;
+
+ /* If both are false, we'll fail no matter what */
+ Assert(may_init || may_attach);
+
+ desc->name = request->options->name;
+ desc->ptr = NULL;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, request->options->name,
+ may_init ? HASH_ENTER_NULL : HASH_FIND, &found);
+ if (found)
+ {
+ /* Already present, just attach to it */
+ if (!may_attach)
+ elog(ERROR, "shared memory struct \"%s\" is already initialized", desc->name);
+
+ if (index_entry->size != request->options->size &&
+ request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ elog(ERROR, "shared memory struct \"%s\" is already registered with different size",
+ desc->name);
+ }
+ desc->ptr = index_entry->location;
+ desc->size = index_entry->size;
+
+ /* Attach depending on the kind of shmem area it is */
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_attach(desc, request->options);
+ break;
+ }
+ }
+ else if (!may_init)
+ {
+ /* attach was requested, but it was not found */
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not find ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+ else if (!index_entry)
+ {
+ /* tried to add it to the hash table, but there was no space */
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+ else
+ {
+ /*
+ * We inserted the entry to the shared memory index. Allocate
+ * requested amount of shared memory for it, and do basic
+ * initialization.
+ */
+ size_t allocated_size;
+ void *structPtr;
+
+ structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, desc->name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ desc->name, request->options->size)));
+ }
+ index_entry->size = request->options->size;
+ index_entry->allocated_size = allocated_size;
+ index_entry->location = structPtr;
+
+ desc->ptr = index_entry->location;
+ desc->size = index_entry->size;
+
+ /*
+ * Set the caller's pointer variable, or do other actions to
+ * initialize depending on the kind of shmem area it is.
+ */
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_init(desc, request->options);
+ break;
+ }
+ }
+
+ return found;
+}
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
* Called at postmaster or stand-alone backend startup, to initialize the
* allocator's data structure in the shared memory segment. In EXEC_BACKEND,
- * this is also called at backend startup, to set up pointers to the shared
- * memory areas.
+ * this is also called at backend startup, to set up pointers to the
+ * already-initialized data structure.
*/
void
InitShmemAllocator(PGShmemHeader *seghdr)
{
Size offset;
+ int64 hash_size;
HASHCTL info;
int hash_flags;
size_t size;
@@ -137,6 +643,16 @@ InitShmemAllocator(PGShmemHeader *seghdr)
#endif
Assert(seghdr != NULL);
+ if (IsUnderPostmaster)
+ {
+ Assert(shmem_request_state == SRS_INITIAL);
+ }
+ else
+ {
+ Assert(shmem_request_state == SRS_REQUESTING);
+ shmem_request_state = SRS_INITIALIZING;
+ }
+
/*
* We assume the pointer and offset are MAXALIGN. Not a hard requirement,
* but it's true today and keeps the math below simpler.
@@ -181,9 +697,11 @@ InitShmemAllocator(PGShmemHeader *seghdr)
* use ShmemInitHash() here because it relies on ShmemIndex being already
* initialized.
*/
+ hash_size = list_length(pending_shmem_requests) + SHMEM_INDEX_ADDITIONAL_SIZE;
+
info.keysize = SHMEM_INDEX_KEYSIZE;
info.entrysize = sizeof(ShmemIndexEnt);
- info.dsize = info.max_dsize = hash_select_dirsize(SHMEM_INDEX_SIZE);
+ info.dsize = info.max_dsize = hash_select_dirsize(hash_size);
info.alloc = ShmemHashAlloc;
info.alloc_arg = NULL;
hash_flags = HASH_ELEM | HASH_STRINGS | HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
@@ -195,10 +713,27 @@ InitShmemAllocator(PGShmemHeader *seghdr)
else
hash_flags |= HASH_ATTACH;
info.hctl = ShmemAllocator->index;
- ShmemIndex = hash_create("ShmemIndex", SHMEM_INDEX_SIZE, &info, hash_flags);
+ ShmemIndex = hash_create("ShmemIndex", hash_size, &info, hash_flags);
Assert(ShmemIndex != NULL);
}
+/*
+ * Reset state on postmaster crash restart.
+ */
+void
+ResetShmemAllocator(void)
+{
+ Assert(!IsUnderPostmaster);
+ shmem_request_state = SRS_INITIAL;
+
+ pending_shmem_requests = NIL;
+
+ /*
+ * Note that we don't clear the registered callbacks. We will need to
+ * call them again as we restart.
+ */
+}
+
/*
* ShmemAlloc -- allocate max-aligned chunk from shared memory
*
@@ -296,92 +831,141 @@ ShmemAddrIsValid(const void *addr)
}
/*
- * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ * Register callbacks that define a shared memory area (or multiple areas).
*
- * This is called during initialization to find or allocate
- * a data structure in shared memory. If no other process
- * has created the structure, this routine allocates space
- * for it. If it exists already, a pointer to the existing
- * structure is returned.
+ * The system will call the callbacks at different stages of postmaster or
+ * backend startup, to allocate and initialize the area.
*
- * Returns: pointer to the object. *foundPtr is set true if the object was
- * already in the shmem index (hence, already initialized).
+ * This is normally called early during postmaster startup, but if the
+ * SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP flag is set, this can also be used after
+ * startup, although after startup there's no guarantee that there's enough
+ * shared memory available. When called after startup, this immediately calls
+ * the right callbacks depending on whether another backend had already
+ * initialized the area.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: In EXEC_BACKEND mode, this needs to be called in every backend
+ * process. That's needed because we cannot pass down the callback function
+ * pointers from the postmaster process, because different processes may have
+ * loaded libraries to different addresses.
*/
-void *
-ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+void
+RegisterShmemCallbacks(const ShmemCallbacks *callbacks)
{
- ShmemIndexEnt *result;
- void *structPtr;
+ if (shmem_request_state == SRS_DONE && IsUnderPostmaster)
+ {
+ /* After-startup initialization */
+ bool found = false;
- Assert(ShmemIndex != NULL);
+ if ((callbacks->flags & SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP) == 0)
+ elog(ERROR, "cannot request shared memory at this time");
- LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ Assert(pending_shmem_requests == NIL);
+ Assert(shmem_request_state == SRS_DONE);
+ shmem_request_state = SRS_REQUESTING;
- /* look it up in the shmem index */
- result = (ShmemIndexEnt *)
- hash_search(ShmemIndex, name, HASH_ENTER_NULL, foundPtr);
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->request_fn_arg);
+ shmem_request_state = SRS_AFTER_STARTUP_ATTACH_OR_INIT;
- if (!result)
- {
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("could not create ShmemIndex entry for data structure \"%s\"",
- name)));
- }
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
- if (*foundPtr)
- {
/*
- * Structure is in the shmem index so someone else has allocated it
- * already. The size better be the same as the size we are trying to
- * initialize to, or there is a name conflict (or worse).
+ * Allocate or attach all the shmem areas requested by the request_fn
+ * callback.
*/
- if (result->size != size)
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
{
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errmsg("ShmemIndex entry size is wrong for data structure"
- " \"%s\": expected %zu, actual %zu",
- name, size, result->size)));
+ found = AttachOrInitShmemIndexEntry(request, true, true);
}
- structPtr = result->location;
- }
- else
- {
- Size allocated_size;
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
- /* It isn't in the table yet. allocate and initialize it */
- structPtr = ShmemAllocRaw(size, &allocated_size);
- if (structPtr == NULL)
+ /*
+ * Finish initialization or attaching to the shmem areas by calling the
+ * appropriate callback.
+ *
+ * FIXME: What to do if multiple shmem areas were requested, and some
+ * of them are already initialized but not all? We expect all shmem
+ * areas requested by a single callback to form a coherent unit.
+ */
+ if (found)
{
- /* out of memory; remove the failed ShmemIndex entry */
- hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
- " \"%s\" (%zu bytes requested)",
- name, size)));
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->attach_fn_arg);
+ }
+ else
+ {
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->init_fn_arg);
}
- result->size = size;
- result->allocated_size = allocated_size;
- result->location = structPtr;
+
+ LWLockRelease(ShmemIndexLock);
+ shmem_request_state = SRS_DONE;
+
+ return;
}
- LWLockRelease(ShmemIndexLock);
+ registered_shmem_callbacks = lappend(registered_shmem_callbacks,
+ (void *) callbacks);
+}
+
+/*
+ * Call all shmem request callbacks.
+ */
+void
+ShmemCallRequestCallbacks(void)
+{
+ ListCell *lc;
- Assert(ShmemAddrIsValid(structPtr));
+ Assert(shmem_request_state == SRS_INITIAL);
+ shmem_request_state = SRS_REQUESTING;
- Assert(structPtr == (void *) CACHELINEALIGN(structPtr));
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
- return structPtr;
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->request_fn_arg);
+ }
}
+/*
+ * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ *
+ * This is called during initialization to find or allocate
+ * a data structure in shared memory. If no other process
+ * has created the structure, this routine allocates space
+ * for it. If it exists already, a pointer to the existing
+ * structure is returned.
+ *
+ * Returns: pointer to the object. *foundPtr is set true if the object was
+ * already in the shmem index (hence, already initialized).
+ *
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestStruct() in new code!
+ */
+void *
+ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+{
+ ShmemStructDesc desc;
+ ShmemStructOpts options = {
+ .name = name,
+ .size = size,
+ };
+ ShmemRequest request = {&desc, &options, SHMEM_KIND_STRUCT};
+
+ Assert(shmem_request_state == SRS_DONE ||
+ shmem_request_state == SRS_INITIALIZING ||
+ shmem_request_state == SRS_REQUESTING);
+
+ /* look it up immediately */
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ *foundPtr = AttachOrInitShmemIndexEntry(&request, true, true);
+ LWLockRelease(ShmemIndexLock);
+
+ Assert(desc.ptr != NULL);
+ return desc.ptr;
+}
/*
* Add two Size values, checking for overflow
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
index 47166ff301d..c0646b75994 100644
--- a/src/backend/storage/ipc/shmem_hash.c
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -21,6 +21,83 @@
#include "postgres.h"
#include "storage/shmem.h"
+#include "utils/memutils.h"
+
+/*
+ * ShmemRequestHash -- Request a shared memory hash table.
+ *
+ * Similar to ShmemRequestStruct(), but requests a hash table instead of an
+ * opaque area.
+ */
+void
+ShmemRequestHashWithOpts(ShmemHashDesc *desc, const ShmemHashOpts *options)
+{
+ ShmemHashOpts *options_copy;
+ int64 dirsize;
+
+ Assert(options->name != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemHashOpts));
+ memcpy(options_copy, options, sizeof(ShmemHashOpts));
+
+ /*
+ * Hash tables allocated in shared memory have a fixed directory; it can't
+ * grow or other backends wouldn't be able to find it. So, make sure we
+ * make it big enough to start with.
+ *
+ * The shared memory allocator must be specified too.
+ */
+ dirsize = hash_select_dirsize(options->max_size);
+ options_copy->hash_info.dsize = dirsize;
+ options_copy->hash_info.max_dsize = dirsize;
+ options_copy->hash_info.alloc = ShmemHashAlloc;
+ options_copy->hash_info.alloc_arg = NULL;
+ options_copy->hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+
+ /* Set options for the fixed-size area holding the hash table */
+ options_copy->base.name = options->name;
+ options_copy->base.size = hash_get_shared_size(&options_copy->hash_info,
+ options_copy->hash_flags);
+
+ /* Reserve extra space for the buckets */
+ options_copy->base.extra_size =
+ hash_estimate_size(options->max_size, options_copy->hash_info.entrysize) - options_copy->base.size;
+
+ ShmemRequestInternal(&desc->base, &options_copy->base, SHMEM_KIND_HASH);
+}
+
+void
+shmem_hash_init(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) base_desc;
+ ShmemHashOpts *options = (ShmemHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+
+ options->hash_info.hctl = desc->base.ptr;
+ Assert(options->hash_info.hctl != NULL);
+ desc->ptr = hash_create(desc->base.name, options->init_size, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = desc->ptr;
+}
+
+void
+shmem_hash_attach(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) base_desc;
+ ShmemHashOpts *options = (ShmemHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+
+ /* attach to it rather than allocate and initialize new space */
+ hash_flags |= HASH_ATTACH;
+ options->hash_info.hctl = desc->base.ptr;
+ Assert(options->hash_info.hctl != NULL);
+ desc->ptr = hash_create(desc->base.name, options->init_size, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = desc->ptr;
+}
/*
@@ -46,9 +123,8 @@
* to shared-memory hash tables are added here, except that callers may
* choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestHash() in new code!
*/
HTAB *
ShmemInitHash(const char *name, /* table string name for shmem index */
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 5c47cf13473..9b880a6af65 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -121,6 +121,9 @@ FastPathLockShmemSize(void)
size = add_size(size, mul_size(TotalProcs, (fpLockBitsSize + fpRelIdSize)));
+ Assert(TotalProcs > 0);
+ Assert(size > 0);
+
return size;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 10be60011ad..af7cc86d80a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4155,7 +4155,14 @@ PostgresSingleUserMain(int argc, char *argv[],
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Before computing the total size needed, give all subsystems, including
+ * add-ins, a chance to adjust their requested shmem sizes.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might have been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index d26206131d7..f3850bc7b08 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -24,18 +24,227 @@
#include "storage/spin.h"
#include "utils/hsearch.h"
+/* Different kinds of shmem areas. */
+typedef enum
+{
+ SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
+ SHMEM_KIND_HASH, /* a hash table */
+} ShmemAreaKind;
+
+/*
+ * ShmemStructDesc is a backend-private handle for a shared memory area
+ * requested with ShmemRequestStruct().
+ */
+typedef struct ShmemStructDesc
+{
+ /* Name and size of the shared memory area. */
+ const char *name;
+
+ void *ptr;
+ size_t size;
+} ShmemStructDesc;
+
+#define SHMEM_ATTACH_UNKNOWN_SIZE (-1)
+
+/*
+ * Options for ShmemRequestStruct()
+ *
+ * 'name' and 'size' are required. Initialize any optional fields that you
+ * don't use to zeros.
+ *
+ * After registration, the shmem machinery reserves memory for the area, sets
+ * '*ptr' to point to the allocation, and calls the callbacks at the right
+ * moments.
+ */
+typedef struct ShmemStructOpts
+{
+ const char *name;
+
+ ssize_t size;
+
+ /*
+ * Extra space to reserve in the shared memory segment, but it's not part
+ * of the struct itself. This is used for shared memory hash tables that
+ * can grow beyond the initial size when more buckets are allocated.
+ */
+ size_t extra_size;
+
+ /*
+ * When the shmem area is initialized or attached to, a pointer to it is
+ * stored in *ptr. It usually points to a global variable, used to access
+ * the shared memory area later. *ptr is set before the init_fn or
+ * attach_fn callback is called.
+ */
+ void **ptr;
+} ShmemStructOpts;
+
+/*
+ * Backend-private handle for a named shared memory hash table, similar to
+ * ShmemStructDesc.
+ */
+typedef struct ShmemHashDesc
+{
+ /*
+ * Descriptor of the underlying fixed-size allocated area where the hash
+ * table lives.
+ */
+ ShmemStructDesc base;
+
+ HTAB *ptr;
+} ShmemHashDesc;
+
+/*
+ * Options for ShmemRequestHash()
+ *
+ * Each hash table is backed by an allocated area, but if 'max_size' is
+ * greater than 'init_size', it can also grow beyond the initial allocated
+ * area by allocating more hash entries from the global unreserved space.
+ */
+typedef struct ShmemHashOpts
+{
+ ShmemStructOpts base;
+
+ /*
+ * Name of the shared memory area. Required. Must be unique across the
+ * system.
+ */
+ const char *name;
+
+ /*
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ */
+ size_t max_size;
+
+ /*
+ * init_size is the number of hashtable entries to preallocate. For a
+ * table whose maximum size is certain, this should be equal to max_size;
+ * that ensures that no run-time out-of-shared-memory failures can occur.
+ */
+ size_t init_size;
+
+ /*
+ * Hash table options passed to hash_create()
+ *
+ * hash_info and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values
+ * specific to shared-memory hash tables are added implicitly in
+ * ShmemRequestHash(), except that callers may choose to specify
+ * HASH_PARTITION and/or HASH_FIXED_SIZE.
+ */
+ HASHCTL hash_info;
+ int hash_flags;
+
+ /*
+ * When the hash table is initialized or attached to, a pointer to its
+ * backend-private handle is stored in *ptr. It usually points to a
+ * global variable, used to access the hash table later.
+ */
+ HTAB **ptr;
+} ShmemHashOpts;
+
+typedef void (*ShmemRequestCallback) (void *arg);
+typedef void (*ShmemInitCallback) (void *arg);
+typedef void (*ShmemAttachCallback) (void *arg);
+
+/*
+ * Shared memory is reserved and allocated in stages at postmaster startup,
+ * and in EXEC_BACKEND mode, there's some extra work done to "attach" to them
+ * at backend startup. ShmemCallbacks holds callback functions that are
+ * called at different stages.
+ */
+typedef struct ShmemCallbacks
+{
+ /* SHMEM_CALLBACKS_* flags */
+ int flags;
+
+ /*
+ * 'request_fn' is called during postmaster startup, before the shared
+ * memory has been allocated. The function should call
+ * ShmemRequestStruct() and ShmemRequestHash() to register the subsystem's
+ * shared memory needs.
+ */
+ ShmemRequestCallback request_fn;
+ void *request_fn_arg;
+
+ /*
+ * Initialization callback function. This is called when the shared
+ * memory area is allocated, usually at postmaster startup.
+ */
+ ShmemInitCallback init_fn;
+ void *init_fn_arg;
+
+ /*
+ * Attachment callback function. In EXEC_BACKEND mode, this is called at
+ * startup of each backend. In !EXEC_BACKEND mode, this is only called if
+ * the shared memory area is registered after postmaster startup (see
+ * SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP).
+ */
+ ShmemAttachCallback attach_fn;
+ void *attach_fn_arg;
+} ShmemCallbacks;
+
+/*
+ * Flags to control the behavior of RegisterShmemCallbacks().
+ *
+ * ALLOW_AFTER_STARTUP: Allow these shared memory usages to be registered
+ * after postmaster startup. Normally, registering a shared memory system
+ * after postmaster startup is not allowed e.g. in an add-in library loaded
+ * on-demand in a backend. If a subsystem sets this flag, the callbacks are
+ * called immediately after registration, to initialize or attach to the
+ * requested shared memory areas. This is not used by any built-in
+ * subsystems, but extensions may find it useful.
+ */
+#define SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP 0x00000001
/* shmem.c */
typedef struct PGShmemHeader PGShmemHeader; /* avoid including
* storage/pg_shmem.h here */
+extern void ResetShmemAllocator(void);
extern void InitShmemAllocator(PGShmemHeader *seghdr);
+#ifdef EXEC_BACKEND
+extern void AttachShmemAllocator(PGShmemHeader *seghdr);
+#endif
extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
extern void *ShmemHashAlloc(Size size, void *alloc_arg);
extern bool ShmemAddrIsValid(const void *addr);
+
+extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
+
+extern void ShmemRequestInternal(ShmemStructDesc *desc, ShmemStructOpts *options,
+ ShmemAreaKind kind);
+
+/*
+ * These macros provide syntactic sugar for calling the underlying functions
+ * with named-argument-like syntax.
+ */
+#define ShmemRequestStruct(desc, ...) \
+ ShmemRequestStructWithOpts(desc, &(ShmemStructOpts){__VA_ARGS__})
+
+#define ShmemRequestHash(desc, ...) \
+ ShmemRequestHashWithOpts(desc, &(ShmemHashOpts){__VA_ARGS__})
+
+extern void ShmemRequestStructWithOpts(ShmemStructDesc *desc, const ShmemStructOpts *options);
+extern void ShmemRequestHashWithOpts(ShmemHashDesc *desc, const ShmemHashOpts *options);
+extern void ShmemCallRequestCallbacks(void);
+
+/* legacy shmem allocation functions */
extern HTAB *ShmemInitHash(const char *name, int64 init_size, int64 max_size,
HASHCTL *infoP, int hash_flags);
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
+
+extern size_t ShmemGetRequestedSize(void);
+extern void ShmemInitRequested(void);
+#ifdef EXEC_BACKEND
+extern void ShmemAttachRequested(void);
+#endif
+
+extern void shmem_hash_init(ShmemStructDesc *base_desc, ShmemStructOpts *options);
+extern void shmem_hash_attach(ShmemStructDesc *base_desc, ShmemStructOpts *options);
+
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
@@ -44,19 +253,4 @@ extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
-/* size constants for the shmem index table */
- /* max size of data structure string name */
-#define SHMEM_INDEX_KEYSIZE (48)
- /* estimated size of the shmem index table (not a hard limit) */
-#define SHMEM_INDEX_SIZE (64)
-
-/* this is a hash bucket in the shmem index table */
-typedef struct
-{
- char key[SHMEM_INDEX_KEYSIZE]; /* string name */
- void *location; /* location in shared mem */
- Size size; /* # bytes requested for the structure */
- Size allocated_size; /* # bytes actually allocated */
-} ShmemIndexEnt;
-
#endif /* SHMEM_H */
diff --git a/src/test/modules/test_aio/test_aio.c b/src/test/modules/test_aio/test_aio.c
index d7530681192..34487a05486 100644
--- a/src/test/modules/test_aio/test_aio.c
+++ b/src/test/modules/test_aio/test_aio.c
@@ -77,7 +77,6 @@ static InjIoErrorState *inj_io_error_state;
static shmem_request_hook_type prev_shmem_request_hook = NULL;
static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
-
static PgAioHandle *last_handle;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 5bc517602b1..880417df235 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2854,9 +2854,16 @@ SharedTypmodTableEntry
Sharedsort
ShellTypeInfo
ShippableCacheEntry
-ShmemAllocatorData
ShippableCacheKey
+ShmemAllocatorData
+ShmemAreaKind
+ShmemCallbacks
ShmemIndexEnt
+ShmemHashDesc
+ShmemHashOpts
+ShmemRequest
+ShmemStructDesc
+ShmemStructOpts
ShutdownForeignScan_function
ShutdownInformation
ShutdownMode
--
2.47.3
[text/x-patch] v9-0006-Add-test-module-to-test-after-startup-shmem-alloc.patch (10.2K, 7-v9-0006-Add-test-module-to-test-after-startup-shmem-alloc.patch)
From c8de885d6c10614ffa2c2096cc91192a5f70d763 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:10:31 +0300
Subject: [PATCH v9 06/16] Add test module to test after-startup shmem
allocations
None of the existing modules could make use of the lazy shmem
allocation after postmaster startup:
- pg_stat_statements needs to load and dump stats file on startup and
shutdown, which doesn't really work if the library is not loaded into
postmaster
- test_aio registers injection points, which reference the library
itself, which creates a weird initialization loop if you try to do
that directly from _PG_init() in a backend. The initialization
really needs to happen after _PG_init()
- injection_points would be a candidate, but it already knows to use
DSM when it's not loaded from shared_preload_libraries.
---
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_shmem/Makefile | 24 ++++
src/test/modules/test_shmem/meson.build | 33 ++++++
.../test_shmem/t/001_late_shmem_alloc.pl | 49 ++++++++
.../modules/test_shmem/test_shmem--1.0.sql | 9 ++
src/test/modules/test_shmem/test_shmem.c | 105 ++++++++++++++++++
.../modules/test_shmem/test_shmem.control | 3 +
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 226 insertions(+)
create mode 100644 src/test/modules/test_shmem/Makefile
create mode 100644 src/test/modules/test_shmem/meson.build
create mode 100644 src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
create mode 100644 src/test/modules/test_shmem/test_shmem--1.0.sql
create mode 100644 src/test/modules/test_shmem/test_shmem.c
create mode 100644 src/test/modules/test_shmem/test_shmem.control
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 28ce3b35eda..62fab9f3c2f 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -47,6 +47,7 @@ SUBDIRS = \
test_resowner \
test_rls_hooks \
test_saslprep \
+ test_shmem \
test_shm_mq \
test_slru \
test_tidstore \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 3ac291656c1..6799ba11e11 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -48,6 +48,7 @@ subdir('test_regex')
subdir('test_resowner')
subdir('test_rls_hooks')
subdir('test_saslprep')
+subdir('test_shmem')
subdir('test_shm_mq')
subdir('test_slru')
subdir('test_tidstore')
diff --git a/src/test/modules/test_shmem/Makefile b/src/test/modules/test_shmem/Makefile
new file mode 100644
index 00000000000..2407f7462fe
--- /dev/null
+++ b/src/test/modules/test_shmem/Makefile
@@ -0,0 +1,24 @@
+# src/test/modules/test_shmem/Makefile
+
+PGFILEDESC = "test_shmem - test code for shmem allocations"
+
+MODULE_big = test_shmem
+OBJS = \
+ $(WIN32RES) \
+ test_shmem.o
+
+EXTENSION = test_shmem
+DATA = test_shmem--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_shmem/meson.build b/src/test/modules/test_shmem/meson.build
new file mode 100644
index 00000000000..fb4bf328b8f
--- /dev/null
+++ b/src/test/modules/test_shmem/meson.build
@@ -0,0 +1,33 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+test_shmem_sources = files(
+ 'test_shmem.c',
+)
+
+if host_system == 'windows'
+ test_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_shmem',
+ '--FILEDESC', 'test_shmem - test code for shmem allocations',])
+endif
+
+test_shmem = shared_module('test_shmem',
+ test_shmem_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_shmem
+
+test_install_data += files(
+ 'test_shmem.control',
+ 'test_shmem--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_shmem',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_late_shmem_alloc.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
new file mode 100644
index 00000000000..c154f57682a
--- /dev/null
+++ b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
@@ -0,0 +1,49 @@
+# Copyright (c) 2025-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+###
+# Test allocating memory after startup, i.e. when the library is not
+# in shared_preload_libraries
+###
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+
+$node->safe_psql("postgres", "CREATE EXTENSION test_shmem;");
+
+# Check that the attach counter is incremented on a new connection
+my $attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+my $attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+cmp_ok($attach_count2, '>', $attach_count1, "attach callback is called in each backend");
+$node->stop;
+
+###
+# Test that loading via shared_preload_libraries also works
+###
+$node->append_conf('postgresql.conf', "shared_preload_libraries = 'test_shmem'");
+$node->start;
+
+# When loaded via shared_preload_libraries, the attach callback is
+# called or not, depending on whether this is an EXEC_BACKEND build.
+my $exec_backend = $node->safe_psql("postgres", "SHOW debug_exec_backend;") eq 'on';
+$attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+$attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+
+if ($exec_backend)
+{
+ cmp_ok($attach_count2, '>', $attach_count1, "attach callback is called in each backend when loaded via shared_preload_libraries");
+}
+else
+{
+ ok($attach_count1 == 0 && $attach_count2 == 0, "attach callback is not called when loaded via shared_preload_libraries");
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_shmem/test_shmem--1.0.sql b/src/test/modules/test_shmem/test_shmem--1.0.sql
new file mode 100644
index 00000000000..2d01fd9256c
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem--1.0.sql
@@ -0,0 +1,9 @@
+/* src/test/modules/test_shmem/test_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_shmem" to load this file. \quit
+
+
+CREATE FUNCTION get_test_shmem_attach_count()
+RETURNS pg_catalog.int4 STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
new file mode 100644
index 00000000000..31c735be570
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -0,0 +1,105 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_shmem.c
+ * Helpers to test shmem allocation routines
+ *
+ * Test basic memory allocation in an extension module. One notable feature
+ * that is not exercised by any other module in the repository is the
+ * allocation of (non-DSM) shared memory after postmaster startup.
+ *
+ * Copyright (c) 2020-2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_shmem/test_shmem.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+
+
+PG_MODULE_MAGIC;
+
+typedef struct TestShmemData
+{
+ int value;
+ bool initialized;
+ int attach_count;
+} TestShmemData;
+
+static TestShmemData *TestShmem;
+
+static bool attached_or_initialized = false;
+
+static void test_shmem_request(void *arg);
+static void test_shmem_init(void *arg);
+static void test_shmem_attach(void *arg);
+
+static const ShmemCallbacks TestShmemCallbacks = {
+ .flags = SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP,
+ .request_fn = test_shmem_request,
+ .init_fn = test_shmem_init,
+ .attach_fn = test_shmem_attach,
+};
+
+static void
+test_shmem_request(void *arg)
+{
+ static ShmemStructDesc TestShmemDesc;
+
+ elog(LOG, "test_shmem_request callback called");
+
+ ShmemRequestStruct(&TestShmemDesc,
+ .name = "test_shmem area",
+ .size = sizeof(TestShmemData),
+ .ptr = (void **) &TestShmem,
+ );
+}
+
+static void
+test_shmem_init(void *arg)
+{
+ elog(LOG, "test_shmem_init callback called");
+ if (TestShmem->initialized)
+ elog(ERROR, "shmem area already initialized");
+ TestShmem->initialized = true;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+static void
+test_shmem_attach(void *arg)
+{
+ elog(LOG, "test_shmem_attach callback called");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ TestShmem->attach_count++;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+void
+_PG_init(void)
+{
+ elog(LOG, "test_shmem module's _PG_init called");
+ RegisterShmemCallbacks(&TestShmemCallbacks);
+}
+
+PG_FUNCTION_INFO_V1(get_test_shmem_attach_count);
+Datum
+get_test_shmem_attach_count(PG_FUNCTION_ARGS)
+{
+ if (!attached_or_initialized)
+ elog(ERROR, "shmem area not attached or initialized in this process");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ PG_RETURN_INT32(TestShmem->attach_count);
+}
diff --git a/src/test/modules/test_shmem/test_shmem.control b/src/test/modules/test_shmem/test_shmem.control
new file mode 100644
index 00000000000..f2f26f4537a
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.control
@@ -0,0 +1,3 @@
+comment = 'Test code for shmem allocations'
+default_version = '1.0'
+module_pathname = '$libdir/test_shmem'
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 880417df235..a997d3a5f54 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3137,6 +3137,7 @@ TestDSMRegistryHashEntry
TestDSMRegistryStruct
TestDecodingData
TestDecodingTxnData
+TestShmemData
TestSpec
TestValueType
TextFreq
--
2.47.3
[text/x-patch] v9-0007-Convert-pg_stat_statements-to-use-the-new-interfa.patch (11.4K, 8-v9-0007-Convert-pg_stat_statements-to-use-the-new-interfa.patch)
From 6cfa457e7c02857ebd7b10b8904a93f5d6fde299 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:21:24 +0300
Subject: [PATCH v9 07/16] Convert pg_stat_statements to use the new interface
As part of this, embed the LWLock it needs in the shared memory struct
itself, so that we don't need to use RequestNamedLWLockTranche()
anymore. LWLockNewTrancheId+LWLockInitialize is more convenient to use
in extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
.../pg_stat_statements/pg_stat_statements.c | 179 ++++++++----------
1 file changed, 83 insertions(+), 96 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 7975476b890..124979b6dcb 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -249,7 +249,7 @@ typedef struct pgssEntry
*/
typedef struct pgssSharedState
{
- LWLock *lock; /* protects hashtable search/modification */
+ LWLockPadded lock; /* protects hashtable search/modification */
double cur_median_usage; /* current median usage in hashtable */
Size mean_query_len; /* current mean entry text length */
slock_t mutex; /* protects following fields only: */
@@ -259,14 +259,24 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+/* Links to shared memory state */
+static pgssSharedState *pgss;
+static HTAB *pgss_hash;
+
+static void pgss_shmem_request(void *arg);
+static void pgss_shmem_init(void *arg);
+
+static const ShmemCallbacks pgss_shmem_callbacks = {
+ .request_fn = pgss_shmem_request,
+ .init_fn = pgss_shmem_init,
+};
+
/*---- Local variables ----*/
/* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
static int nesting_level = 0;
/* Saved hook values */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
static planner_hook_type prev_planner_hook = NULL;
static ExecutorStart_hook_type prev_ExecutorStart = NULL;
@@ -275,10 +285,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static ProcessUtility_hook_type prev_ProcessUtility = NULL;
-/* Links to shared memory state */
-static pgssSharedState *pgss = NULL;
-static HTAB *pgss_hash = NULL;
-
/*---- GUC variables ----*/
typedef enum
@@ -331,8 +337,6 @@ PG_FUNCTION_INFO_V1(pg_stat_statements_1_13);
PG_FUNCTION_INFO_V1(pg_stat_statements);
PG_FUNCTION_INFO_V1(pg_stat_statements_info);
-static void pgss_shmem_request(void);
-static void pgss_shmem_startup(void);
static void pgss_shmem_shutdown(int code, Datum arg);
static void pgss_post_parse_analyze(ParseState *pstate, Query *query,
JumbleState *jstate);
@@ -366,7 +370,6 @@ static void pgss_store(const char *query, int64 queryId,
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
bool showtext);
-static Size pgss_memsize(void);
static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
int encoding, bool sticky);
static void entry_dealloc(void);
@@ -471,13 +474,14 @@ _PG_init(void)
MarkGUCPrefixReserved("pg_stat_statements");
+ /*
+ * Register our shared memory needs.
+ */
+ RegisterShmemCallbacks(&pgss_shmem_callbacks);
+
/*
* Install hooks.
*/
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = pgss_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = pgss_shmem_startup;
prev_post_parse_analyze_hook = post_parse_analyze_hook;
post_parse_analyze_hook = pgss_post_parse_analyze;
prev_planner_hook = planner_hook;
@@ -495,30 +499,48 @@ _PG_init(void)
}
/*
- * shmem_request hook: request additional shared resources. We'll allocate or
- * attach to the shared resources in pgss_shmem_startup().
+ * shmem request callback: Request shared memory resources.
+ *
+ * This is called at postmaster startup. Note that the shared memory isn't
+ * allocated here yet; this merely registers our needs.
+ *
+ * In EXEC_BACKEND mode, this is also called in each backend, to re-attach to
+ * the shared memory area that was already initialized.
*/
static void
-pgss_shmem_request(void)
+pgss_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- RequestAddinShmemSpace(pgss_memsize());
- RequestNamedLWLockTranche("pg_stat_statements", 1);
+ static ShmemHashDesc pgssSharedHashDesc;
+ static ShmemStructDesc pgssSharedStateDesc;
+
+ ShmemRequestHash(&pgssSharedHashDesc,
+ .name = "pg_stat_statements hash",
+ .ptr = &pgss_hash,
+ .init_size = pgss_max,
+ .max_size = pgss_max,
+ .hash_info.keysize = sizeof(pgssHashKey),
+ .hash_info.entrysize = sizeof(pgssEntry),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ );
+ ShmemRequestStruct(&pgssSharedStateDesc,
+ .name = "pg_stat_statements",
+ .size = sizeof(pgssSharedState),
+ .ptr = (void **) &pgss,
+ );
}
/*
- * shmem_startup hook: allocate or attach to shared memory,
- * then load any pre-existing statistics from file.
- * Also create and load the query-texts file, which is expected to exist
- * (even if empty) while the module is enabled.
+ * shmem init callback: Initialize our shared memory data structures at
+ * postmaster startup.
+ *
+ * Load any pre-existing statistics from file. Also create and load the
+ * query-texts file, which is expected to exist (even if empty) while the
+ * module is enabled.
*/
static void
-pgss_shmem_startup(void)
+pgss_shmem_init(void *arg)
{
- bool found;
- HASHCTL info;
+ int tranche_id;
FILE *file = NULL;
FILE *qfile = NULL;
uint32 header;
@@ -528,59 +550,38 @@ pgss_shmem_startup(void)
int buffer_size;
char *buffer = NULL;
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* reset in case this is a restart within the postmaster */
- pgss = NULL;
- pgss_hash = NULL;
-
/*
- * Create or attach to the shared memory state, including hash table
+ * We already checked that we're loaded from shared_preload_libraries in
+ * _PG_init(), so we should not get here after postmaster startup.
*/
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- pgss = ShmemInitStruct("pg_stat_statements",
- sizeof(pgssSharedState),
- &found);
-
- if (!found)
- {
- /* First time through ... */
- pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
- pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
- SpinLockInit(&pgss->mutex);
- pgss->extent = 0;
- pgss->n_writers = 0;
- pgss->gc_count = 0;
- pgss->stats.dealloc = 0;
- pgss->stats.stats_reset = GetCurrentTimestamp();
- }
-
- info.keysize = sizeof(pgssHashKey);
- info.entrysize = sizeof(pgssEntry);
- pgss_hash = ShmemInitHash("pg_stat_statements hash",
- pgss_max, pgss_max,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- LWLockRelease(AddinShmemInitLock);
+ Assert(!IsUnderPostmaster);
/*
- * If we're in the postmaster (or a standalone backend...), set up a shmem
- * exit hook to dump the statistics to disk.
+ * Initialize the shmem area with no statistics.
*/
- if (!IsUnderPostmaster)
- on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
+ tranche_id = LWLockNewTrancheId("pg_stat_statements");
+ LWLockInitialize(&pgss->lock.lock, tranche_id);
+ pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
+ pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ SpinLockInit(&pgss->mutex);
+ pgss->extent = 0;
+ pgss->n_writers = 0;
+ pgss->gc_count = 0;
+ pgss->stats.dealloc = 0;
+ pgss->stats.stats_reset = GetCurrentTimestamp();
+
+ /* The hash table must've also been initialized by now */
+ Assert(pgss_hash != NULL);
/*
- * Done if some other process already completed our initialization.
+ * Set up a shmem exit hook to dump the statistics to disk on postmaster
+ * (or standalone backend) exit.
*/
- if (found)
- return;
+ on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
/*
+ * Load any pre-existing statistics from file.
+ *
* Note: we don't bother with locks here, because there should be no other
* processes running when this code is reached.
*/
@@ -1339,7 +1340,7 @@ pgss_store(const char *query, int64 queryId,
key.toplevel = (nesting_level == 0);
/* Lookup the hash table entry with shared lock. */
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
@@ -1360,11 +1361,11 @@ pgss_store(const char *query, int64 queryId,
*/
if (jstate)
{
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
norm_query = generate_normalized_query(jstate, query,
query_location,
&query_len);
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
}
/* Append new query text to file with only shared lock held */
@@ -1379,8 +1380,8 @@ pgss_store(const char *query, int64 queryId,
do_gc = need_gc_qtexts();
/* Need exclusive lock to make a new hashtable entry - promote */
- LWLockRelease(pgss->lock);
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockRelease(&pgss->lock.lock);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
/*
* A garbage collection may have occurred while we weren't holding the
@@ -1519,7 +1520,7 @@ pgss_store(const char *query, int64 queryId,
}
done:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
/* We postpone this clean-up until we're out of the lock */
if (norm_query)
@@ -1808,7 +1809,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
* we need to partition the hash table to limit the time spent holding any
* one lock.
*/
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
if (showtext)
{
@@ -2046,7 +2047,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
}
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
if (qbuffer)
pfree(qbuffer);
@@ -2086,20 +2087,6 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
-/*
- * Estimate shared memory space needed.
- */
-static Size
-pgss_memsize(void)
-{
- Size size;
-
- size = MAXALIGN(sizeof(pgssSharedState));
- size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
-
- return size;
-}
-
/*
* Allocate a new hashtable entry.
* caller must hold an exclusive lock on pgss->lock
@@ -2730,7 +2717,7 @@ entry_reset(Oid userid, Oid dbid, int64 queryid, bool minmax_only)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("pg_stat_statements must be loaded via \"shared_preload_libraries\"")));
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
num_entries = hash_get_num_entries(pgss_hash);
stats_reset = GetCurrentTimestamp();
@@ -2824,7 +2811,7 @@ done:
record_gc_qtexts();
release_lock:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
return stats_reset;
}
--
2.47.3
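
The converted callbacks above pass their options to ShmemRequestStruct() and ShmemRequestHash() as designated initializers, with a trailing comma before the closing parenthesis. A minimal sketch of how such a call shape can be built in C99, assuming a variadic macro that wraps a compound literal. DemoStructOpts, DemoStructDesc, demo_request_struct() and DEMO_REQUEST_STRUCT are hypothetical names for illustration only; they are not the definitions from this patch series.

```c
#include <stddef.h>

/* Hypothetical option struct; fields the caller does not name are zeroed. */
typedef struct DemoStructOpts
{
	const char *name;			/* label for the requested area */
	size_t		size;			/* bytes requested */
	void	  **ptr;			/* where the resolved address would be stored */
} DemoStructOpts;

/* Hypothetical descriptor that remembers one request. */
typedef struct DemoStructDesc
{
	DemoStructOpts opts;
	int			requested;
} DemoStructDesc;

/* Running total of everything requested so far (illustration only). */
static size_t demo_total_requested = 0;

static void
demo_request_struct(DemoStructDesc *desc, const DemoStructOpts *opts)
{
	desc->opts = *opts;
	desc->requested = 1;
	demo_total_requested += opts->size;
}

/*
 * The variadic arguments become the initializer list of a compound literal.
 * A trailing comma in the call leaves an empty last macro argument, which
 * expands to a trailing comma in the initializer list; both are valid C99.
 */
#define DEMO_REQUEST_STRUCT(desc, ...) \
	demo_request_struct((desc), &(DemoStructOpts){__VA_ARGS__})

/* Usage, mirroring the shape of the calls in the patch. */
static int *demo_target;
static DemoStructDesc demo_desc;

static void
demo_request_callback(void)
{
	DEMO_REQUEST_STRUCT(&demo_desc,
						.name = "demo state",
						.size = sizeof(int),
						.ptr = (void **) &demo_target,
		);
}
```

One consequence of this shape is that new optional fields can be added to the option struct without touching existing callers, since unnamed fields default to zero.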
[text/x-patch] v9-0008-Introduce-registry-of-built-in-subsystems.patch (6.6K, 9-v9-0008-Introduce-registry-of-built-in-subsystems.patch)
From cda9fcc2b158b39e721b1c65426fedcb2a5b9aac Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:21:02 +0300
Subject: [PATCH v9 08/16] Introduce registry of built-in subsystems
To add a new built-in subsystem, add it to subsystemlist.h. That
hooks up its callbacks so that they get called at the right times
during postmaster startup. For now this is unused, but will replace
the current SubsystemShmemSize() and SubsystemShmemInit() calls in
the next commits.
---
src/backend/bootstrap/bootstrap.c | 2 ++
src/backend/postmaster/launch_backend.c | 2 ++
src/backend/postmaster/postmaster.c | 5 +++++
src/backend/storage/ipc/ipci.c | 21 +++++++++++++++++
src/backend/tcop/postgres.c | 3 +++
src/include/storage/ipc.h | 1 +
src/include/storage/subsystemlist.h | 23 +++++++++++++++++++
src/include/storage/subsystems.h | 30 +++++++++++++++++++++++++
8 files changed, 87 insertions(+)
create mode 100644 src/include/storage/subsystemlist.h
create mode 100644 src/include/storage/subsystems.h
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index ca75afe8cc7..e07ba5c0ae3 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -361,6 +361,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
SetProcessingMode(BootstrapProcessing);
IgnoreSystemIndexes = true;
+ RegisterBuiltinShmemCallbacks();
+
InitializeMaxBackends();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 75423104be8..7b81200d3c2 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -663,6 +663,8 @@ SubPostmasterMain(int argc, char *argv[])
*/
LocalProcessControlFile(false);
+ RegisterBuiltinShmemCallbacks();
+
/*
* Reload any libraries that were preloaded by the postmaster. Since we
* exec'd this process, those libraries didn't come along with us; but we
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 91e1657bcf4..4cf5774e33b 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -921,6 +921,11 @@ PostmasterMain(int argc, char *argv[])
*/
ApplyLauncherRegister();
+ /*
+ * Register the shared memory needs of all core subsystems.
+ */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 34acaeeefe0..3d70cb76ad6 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -50,6 +50,7 @@
#include "storage/procarray.h"
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
#include "utils/wait_event.h"
@@ -249,6 +250,26 @@ CreateSharedMemoryAndSemaphores(void)
shmem_startup_hook();
}
+/*
+ * Early initialization of various subsystems, giving them a chance to
+ * register their shared memory needs before the shared memory segment is
+ * allocated.
+ */
+void
+RegisterBuiltinShmemCallbacks(void)
+{
+ /*
+ * Call RegisterShmemCallbacks(...) on each subsystem listed in
+ * subsystemlist.h
+ */
+#define PG_SHMEM_SUBSYSTEM(subsystem_callbacks) \
+ RegisterShmemCallbacks(&(subsystem_callbacks));
+
+#include "storage/subsystemlist.h"
+
+#undef PG_SHMEM_SUBSYSTEM
+}
+
/*
* Initialize various subsystems, setting up their data structures in
* shared memory.
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index af7cc86d80a..6c6c2243d9e 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4137,6 +4137,9 @@ PostgresSingleUserMain(int argc, char *argv[],
/* read control file (error checking and contains config ) */
LocalProcessControlFile(false);
+ /* Register the shared memory needs of all core subsystems. */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index da32787ab51..b205b00e7a1 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
/* ipci.c */
extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
+extern void RegisterBuiltinShmemCallbacks(void);
extern Size CalculateShmemSize(void);
extern void CreateSharedMemoryAndSemaphores(void);
#ifdef EXEC_BACKEND
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
new file mode 100644
index 00000000000..ed43c90bcc3
--- /dev/null
+++ b/src/include/storage/subsystemlist.h
@@ -0,0 +1,23 @@
+/*---------------------------------------------------------------------------
+ * subsystemlist.h
+ *
+ * List of initialization callbacks of built-in subsystems. This is kept in
+ * its own source file for possible use by automatic tools.
+ * PG_SHMEM_SUBSYSTEM is defined in the callers depending on how the list is
+ * used.
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystemlist.h
+ *---------------------------------------------------------------------------
+ */
+
+/* there is deliberately not an #ifndef SUBSYSTEMLIST_H here */
+
+/*
+ * Note: there are some inter-dependencies between these, so the order of some
+ * of these matter.
+ */
+
+/* TODO: empty for now */
diff --git a/src/include/storage/subsystems.h b/src/include/storage/subsystems.h
new file mode 100644
index 00000000000..38b735bec67
--- /dev/null
+++ b/src/include/storage/subsystems.h
@@ -0,0 +1,30 @@
+/*-------------------------------------------------------------------------
+ *
+ * subsystems.h
+ * Provide extern declarations for all the built-in subsystem callbacks
+ *
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystems.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SUBSYSTEMS_H
+#define SUBSYSTEMS_H
+
+#include "storage/shmem.h"
+
+/*
+ * Extern declarations of all the built-in subsystem callbacks
+ *
+ * The actual list is in subsystemlist.h, so that the same list can be used
+ * for other purposes.
+ */
+#define PG_SHMEM_SUBSYSTEM(callbacks) \
+ extern const ShmemCallbacks callbacks;
+#include "storage/subsystemlist.h"
+#undef PG_SHMEM_SUBSYSTEM
+
+#endif /* SUBSYSTEMS_H */
--
2.47.3
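
The registry patch above includes the same list header twice with different definitions of PG_SHMEM_SUBSYSTEM: once to emit extern declarations and once to emit registration calls. A self-contained sketch of that technique follows. To fit in a single file it uses a list macro (DEMO_SUBSYSTEM_LIST) in place of a separately included header, which is a variation on what the patch does; every Demo* and demo_* name is hypothetical.

```c
#include <stddef.h>

typedef void (*demo_callback_fn) (void *arg);

typedef struct DemoCallbacks
{
	demo_callback_fn request_fn;
	demo_callback_fn init_fn;
} DemoCallbacks;

/* Stand-in for the contents of a list header such as subsystemlist.h. */
#define DEMO_SUBSYSTEM_LIST(X) \
	X(DemoFirstCallbacks) \
	X(DemoSecondCallbacks)

/* Use 1: expand the list into extern declarations. */
#define DEMO_DECLARE(callbacks) extern const DemoCallbacks callbacks;
DEMO_SUBSYSTEM_LIST(DEMO_DECLARE)
#undef DEMO_DECLARE

static int	demo_request_calls = 0;

static void
demo_request(void *arg)
{
	(void) arg;
	demo_request_calls++;
}

/* The subsystems themselves; init_fn is left NULL and must be tolerated. */
const DemoCallbacks DemoFirstCallbacks = {.request_fn = demo_request};
const DemoCallbacks DemoSecondCallbacks = {.request_fn = demo_request};

#define DEMO_MAX_SUBSYSTEMS 8
static const DemoCallbacks *demo_registered[DEMO_MAX_SUBSYSTEMS];
static int	demo_nregistered = 0;

static void
demo_register(const DemoCallbacks *cb)
{
	if (demo_nregistered < DEMO_MAX_SUBSYSTEMS)
		demo_registered[demo_nregistered++] = cb;
}

/* Use 2: expand the same list into registration calls, in list order. */
static void
demo_register_builtin(void)
{
#define DEMO_REGISTER(callbacks) demo_register(&(callbacks));
	DEMO_SUBSYSTEM_LIST(DEMO_REGISTER)
#undef DEMO_REGISTER
}

/* Invoke every registered request callback, skipping missing ones. */
static void
demo_call_request_callbacks(void)
{
	for (int i = 0; i < demo_nregistered; i++)
	{
		if (demo_registered[i]->request_fn)
			demo_registered[i]->request_fn(NULL);
	}
}
```

Because registration follows the order of the list, any ordering dependency between subsystems is expressed in exactly one place.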
[text/x-patch] v9-0009-Convert-injection-points-to-use-the-new-interface.patch (4.6K, 10-v9-0009-Convert-injection-points-to-use-the-new-interface.patch)
From 0639ce8e14074fb8bc34a7135bc9cad9086800ec Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 19:05:26 +0200
Subject: [PATCH v9 09/16] Convert injection points to use the new interface
---
src/backend/storage/ipc/ipci.c | 3 --
src/backend/utils/misc/injection_point.c | 60 ++++++++++++------------
src/include/storage/subsystemlist.h | 2 +-
src/include/utils/injection_point.h | 3 --
4 files changed, 30 insertions(+), 38 deletions(-)
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 3d70cb76ad6..2ae4b9c2ac2 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -52,7 +52,6 @@
#include "storage/sinvaladt.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
-#include "utils/injection_point.h"
#include "utils/wait_event.h"
/* GUCs */
@@ -139,7 +138,6 @@ CalculateShmemSize(void)
size = add_size(size, AsyncShmemSize());
size = add_size(size, StatsShmemSize());
size = add_size(size, WaitEventCustomShmemSize());
- size = add_size(size, InjectionPointShmemSize());
size = add_size(size, SlotSyncShmemSize());
size = add_size(size, AioShmemSize());
size = add_size(size, WaitLSNShmemSize());
@@ -353,7 +351,6 @@ CreateOrAttachShmemStructs(void)
AsyncShmemInit();
StatsShmemInit();
WaitEventCustomShmemInit();
- InjectionPointShmemInit();
AioShmemInit();
WaitLSNShmemInit();
LogicalDecodingCtlShmemInit();
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index c06b0e9b800..9981d6e212f 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -17,6 +17,7 @@
*/
#include "postgres.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
#ifdef USE_INJECTION_POINTS
@@ -109,6 +110,11 @@ typedef struct InjectionPointCacheEntry
static HTAB *InjectionPointCache = NULL;
+#ifdef USE_INJECTION_POINTS
+static void InjectionPointShmemRequest(void *arg);
+static void InjectionPointShmemInit(void *arg);
+#endif
+
/*
* injection_point_cache_add
*
@@ -226,46 +232,38 @@ injection_point_cache_get(const char *name)
}
#endif /* USE_INJECTION_POINTS */
-/*
- * Return the space for dynamic shared hash table.
- */
-Size
-InjectionPointShmemSize(void)
-{
+const ShmemCallbacks InjectionPointShmemCallbacks = {
#ifdef USE_INJECTION_POINTS
- Size sz = 0;
-
- sz = add_size(sz, sizeof(InjectionPointsCtl));
- return sz;
-#else
- return 0;
+ .request_fn = InjectionPointShmemRequest,
+ .init_fn = InjectionPointShmemInit,
#endif
-}
+};
/*
- * Allocate shmem space for dynamic shared hash.
+ * Reserve space for the dynamic shared hash table
*/
-void
-InjectionPointShmemInit(void)
-{
#ifdef USE_INJECTION_POINTS
- bool found;
+static void
+InjectionPointShmemRequest(void *arg)
+{
+ static ShmemStructDesc InjectionPointShmemDesc;
- ActiveInjectionPoints = ShmemInitStruct("InjectionPoint hash",
- sizeof(InjectionPointsCtl),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
- for (int i = 0; i < MAX_INJECTION_POINTS; i++)
- pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
- }
- else
- Assert(found);
-#endif
+ ShmemRequestStruct(&InjectionPointShmemDesc,
+ .name = "InjectionPoint hash",
+ .size = sizeof(InjectionPointsCtl),
+ .ptr = (void **) &ActiveInjectionPoints,
+ );
}
+static void
+InjectionPointShmemInit(void *arg)
+{
+ pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
+ for (int i = 0; i < MAX_INJECTION_POINTS; i++)
+ pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
+}
+#endif
+
/*
* Attach a new injection point.
*/
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index ed43c90bcc3..65da6f17c5d 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -20,4 +20,4 @@
* of these matter.
*/
-/* TODO: empty for now */
+PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index 27a2526524f..fabd1455c3c 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -46,9 +46,6 @@ typedef void (*InjectionPointCallback) (const char *name,
const void *private_data,
void *arg);
-extern Size InjectionPointShmemSize(void);
-extern void InjectionPointShmemInit(void);
-
extern void InjectionPointAttach(const char *name,
const char *library,
const char *function,
--
2.47.3
[text/x-patch] v9-0010-Convert-wait_event.c-to-new-interface.patch (6.3K, 11-v9-0010-Convert-wait_event.c-to-new-interface.patch)
From 422a6106f6aed5fa36a39a00c7be8fe447f08133 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 28 Mar 2026 01:46:46 +0200
Subject: [PATCH v9 10/16] Convert wait_event.c to new interface
---
src/backend/storage/ipc/ipci.c | 2 -
src/backend/utils/activity/wait_event.c | 96 ++++++++++++-------------
src/include/storage/subsystemlist.h | 1 +
src/include/utils/wait_event.h | 2 -
4 files changed, 48 insertions(+), 53 deletions(-)
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2ae4b9c2ac2..b90717c1f58 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -137,7 +137,6 @@ CalculateShmemSize(void)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, AsyncShmemSize());
size = add_size(size, StatsShmemSize());
- size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, SlotSyncShmemSize());
size = add_size(size, AioShmemSize());
size = add_size(size, WaitLSNShmemSize());
@@ -350,7 +349,6 @@ CreateOrAttachShmemStructs(void)
SyncScanShmemInit();
AsyncShmemInit();
StatsShmemInit();
- WaitEventCustomShmemInit();
AioShmemInit();
WaitLSNShmemInit();
LogicalDecodingCtlShmemInit();
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index e5a2289f0b0..08af61ff448 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -25,6 +25,7 @@
#include "storage/lmgr.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "storage/spin.h"
#include "utils/wait_event.h"
@@ -97,61 +98,58 @@ static WaitEventCustomCounterData *WaitEventCustomCounter;
static uint32 WaitEventCustomNew(uint32 classId, const char *wait_event_name);
static const char *GetWaitEventCustomIdentifier(uint32 wait_event_info);
+static void WaitEventCustomShmemRequest(void *arg);
+static void WaitEventCustomShmemInit(void *arg);
+
+const ShmemCallbacks WaitEventCustomShmemCallbacks = {
+ .request_fn = WaitEventCustomShmemRequest,
+ .init_fn = WaitEventCustomShmemInit,
+};
+
/*
- * Return the space for dynamic shared hash tables and dynamic allocation counter.
+ * Register shmem space for dynamic shared hash and dynamic allocation counter.
*/
-Size
-WaitEventCustomShmemSize(void)
+static void
+WaitEventCustomShmemRequest(void *arg)
{
- Size sz;
-
- sz = MAXALIGN(sizeof(WaitEventCustomCounterData));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- sizeof(WaitEventCustomEntryByInfo)));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- sizeof(WaitEventCustomEntryByName)));
- return sz;
+ static ShmemStructDesc WaitEventCustomCounterShmemDesc;
+ static ShmemHashDesc WaitEventCustomHashByInfoDesc;
+ static ShmemHashDesc WaitEventCustomHashByNameDesc;
+
+ ShmemRequestStruct(&WaitEventCustomCounterShmemDesc,
+ .name = "WaitEventCustomCounterData",
+ .size = sizeof(WaitEventCustomCounterData),
+ .ptr = (void **) &WaitEventCustomCounter,
+ );
+ ShmemRequestHash(&WaitEventCustomHashByInfoDesc,
+ .name = "WaitEventCustom hash by wait event information",
+ .ptr = &WaitEventCustomHashByInfo,
+
+ .init_size = WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
+ .max_size = WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
+ .hash_info.keysize = sizeof(uint32),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByInfo),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ );
+ ShmemRequestHash(&WaitEventCustomHashByNameDesc,
+ .name = "WaitEventCustom hash by name",
+ .ptr = &WaitEventCustomHashByName,
+
+ .init_size = WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
+ .max_size = WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
+ /* key is a NULL-terminated string */
+ .hash_info.keysize = sizeof(char[NAMEDATALEN]),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByName),
+ .hash_flags = HASH_ELEM | HASH_STRINGS,
+ );
}
-/*
- * Allocate shmem space for dynamic shared hash and dynamic allocation counter.
- */
-void
-WaitEventCustomShmemInit(void)
+static void
+WaitEventCustomShmemInit(void *arg)
{
- bool found;
- HASHCTL info;
-
- WaitEventCustomCounter = (WaitEventCustomCounterData *)
- ShmemInitStruct("WaitEventCustomCounterData",
- sizeof(WaitEventCustomCounterData), &found);
-
- if (!found)
- {
- /* initialize the allocation counter and its spinlock. */
- WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
- SpinLockInit(&WaitEventCustomCounter->mutex);
- }
-
- /* initialize or attach the hash tables to store custom wait events */
- info.keysize = sizeof(uint32);
- info.entrysize = sizeof(WaitEventCustomEntryByInfo);
- WaitEventCustomHashByInfo =
- ShmemInitHash("WaitEventCustom hash by wait event information",
- WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
- WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- /* key is a NULL-terminated string */
- info.keysize = sizeof(char[NAMEDATALEN]);
- info.entrysize = sizeof(WaitEventCustomEntryByName);
- WaitEventCustomHashByName =
- ShmemInitHash("WaitEventCustom hash by name",
- WAIT_EVENT_CUSTOM_HASH_INIT_SIZE,
- WAIT_EVENT_CUSTOM_HASH_MAX_SIZE,
- &info,
- HASH_ELEM | HASH_STRINGS);
+ /* initialize the allocation counter and its spinlock. */
+ WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
+ SpinLockInit(&WaitEventCustomCounter->mutex);
}
/*
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index 65da6f17c5d..7dfbd03d6e5 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -20,4 +20,5 @@
* of these matter.
*/
+PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 34c27cc3dc3..86ee348220d 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -42,8 +42,6 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
extern uint32 WaitEventExtensionNew(const char *wait_event_name);
extern uint32 WaitEventInjectionPointNew(const char *wait_event_name);
-extern void WaitEventCustomShmemInit(void);
-extern Size WaitEventCustomShmemSize(void);
extern char **GetWaitEventCustomNames(uint32 classId, int *nwaitevents);
/* ----------
--
2.47.3
[text/x-patch] v9-0011-Convert-test_aio-to-use-the-new-mechanism.patch (4.9K, 12-v9-0011-Convert-test_aio-to-use-the-new-mechanism.patch)
From e2668cbe66010b2d7ad326c448d0300a4ee151d7 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 20:50:04 +0300
Subject: [PATCH v9 11/16] Convert test_aio to use the new mechanism
I wanted to use this to showcase SHMEM_ALLOW_ALLOC_AFTER_STARTUP, but
unfortunately that didn't work because then we'd call
InjectionPointLoad from _PG_init().
---
src/test/modules/test_aio/test_aio.c | 108 +++++++++++++--------------
1 file changed, 50 insertions(+), 58 deletions(-)
diff --git a/src/test/modules/test_aio/test_aio.c b/src/test/modules/test_aio/test_aio.c
index 34487a05486..8fa51a7dd02 100644
--- a/src/test/modules/test_aio/test_aio.c
+++ b/src/test/modules/test_aio/test_aio.c
@@ -28,7 +28,6 @@
#include "storage/bufmgr.h"
#include "storage/checksum.h"
#include "storage/condition_variable.h"
-#include "storage/ipc.h"
#include "storage/lwlock.h"
#include "storage/proc.h"
#include "storage/procnumber.h"
@@ -44,6 +43,7 @@
PG_MODULE_MAGIC;
+/* In shared memory */
typedef struct InjIoErrorState
{
ConditionVariable cv;
@@ -74,78 +74,73 @@ typedef struct BlocksReadStreamData
static InjIoErrorState *inj_io_error_state;
/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void test_aio_shmem_request(void *arg);
+static void test_aio_shmem_init(void *arg);
+static void test_aio_shmem_attach(void *arg);
+
+static const ShmemCallbacks inj_io_shmem_callbacks = {
+ .request_fn = test_aio_shmem_request,
+ .init_fn = test_aio_shmem_init,
+ .attach_fn = test_aio_shmem_attach,
+};
static PgAioHandle *last_handle;
static void
-test_aio_shmem_request(void)
+test_aio_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
+ static ShmemStructDesc inj_io_shmem_desc;
- RequestAddinShmemSpace(sizeof(InjIoErrorState));
+ ShmemRequestStruct(&inj_io_shmem_desc,
+ .name = "test_aio injection points",
+ .size = sizeof(InjIoErrorState),
+ .ptr = (void **) &inj_io_error_state,
+ );
}
static void
-test_aio_shmem_startup(void)
+test_aio_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_io_error_state = ShmemInitStruct("injection_points",
- sizeof(InjIoErrorState),
- &found);
-
- if (!found)
- {
- /* First time through, initialize */
- inj_io_error_state->enabled_short_read = false;
- inj_io_error_state->enabled_reopen = false;
- inj_io_error_state->enabled_completion_wait = false;
+ /* First time through, initialize */
+ inj_io_error_state->enabled_short_read = false;
+ inj_io_error_state->enabled_reopen = false;
+ inj_io_error_state->enabled_completion_wait = false;
- ConditionVariableInit(&inj_io_error_state->cv);
- inj_io_error_state->completion_wait_event = WaitEventInjectionPointNew("completion_wait");
+ ConditionVariableInit(&inj_io_error_state->cv);
+ inj_io_error_state->completion_wait_event = WaitEventInjectionPointNew("completion_wait");
#ifdef USE_INJECTION_POINTS
- InjectionPointAttach("aio-process-completion-before-shared",
- "test_aio",
- "inj_io_completion_hook",
- NULL,
- 0);
- InjectionPointLoad("aio-process-completion-before-shared");
-
- InjectionPointAttach("aio-worker-after-reopen",
- "test_aio",
- "inj_io_reopen",
- NULL,
- 0);
- InjectionPointLoad("aio-worker-after-reopen");
+ InjectionPointAttach("aio-process-completion-before-shared",
+ "test_aio",
+ "inj_io_completion_hook",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-process-completion-before-shared");
+
+ InjectionPointAttach("aio-worker-after-reopen",
+ "test_aio",
+ "inj_io_reopen",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-worker-after-reopen");
#endif
- }
- else
- {
- /*
- * Pre-load the injection points now, so we can call them in a
- * critical section.
- */
+}
+
+static void
+test_aio_shmem_attach(void *arg)
+{
+ /*
+ * Pre-load the injection points now, so we can call them in a critical
+ * section.
+ */
#ifdef USE_INJECTION_POINTS
- InjectionPointLoad("aio-process-completion-before-shared");
- InjectionPointLoad("aio-worker-after-reopen");
- elog(LOG, "injection point loaded");
+ InjectionPointLoad("aio-process-completion-before-shared");
+ InjectionPointLoad("aio-worker-after-reopen");
+ elog(LOG, "injection point loaded");
#endif
- }
-
- LWLockRelease(AddinShmemInitLock);
}
void
@@ -154,10 +149,7 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_aio_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_aio_shmem_startup;
+ RegisterShmemCallbacks(&inj_io_shmem_callbacks);
}
--
2.47.3
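
The test_aio conversion above splits the old "if (!found) ... else ..." branches of the startup hook into separate init and attach callbacks. A rough sketch of that split, under the assumption that the init callback runs once in the process that creates the area and the attach callback runs in a process that connects to it later. The exact rules for when each callback fires are defined by the patch series, not by this sketch, and all names below are hypothetical.

```c
#include <stddef.h>

typedef void (*demo_shmem_fn) (void *arg);

typedef struct DemoShmemCallbacks
{
	demo_shmem_fn request_fn;	/* declare what is needed */
	demo_shmem_fn init_fn;		/* assumed: run once by the creating process */
	demo_shmem_fn attach_fn;	/* assumed: run by a process attaching later */
} DemoShmemCallbacks;

typedef struct DemoState
{
	int			enabled;
	int			attach_count;
} DemoState;

static DemoState demo_area;		/* stands in for the shared segment */
static DemoState *demo_state = NULL;

static void
demo_request(void *arg)
{
	(void) arg;
	/* In this sketch the pointer is resolved at once; a real allocator */
	/* would fill it in later, after the segment has been sized. */
	demo_state = &demo_area;
}

static void
demo_init(void *arg)
{
	(void) arg;
	/* Formerly the "if (!found)" branch: first-time initialization. */
	demo_state->enabled = 0;
	demo_state->attach_count = 0;
}

static void
demo_attach(void *arg)
{
	(void) arg;
	/* Formerly the "else" branch: per-process setup on existing state. */
	demo_state->attach_count++;
}

static const DemoShmemCallbacks demo_callbacks = {
	.request_fn = demo_request,
	.init_fn = demo_init,
	.attach_fn = demo_attach,
};

/* Hypothetical driver: 'creating' is nonzero in the creating process. */
static void
demo_run_callbacks(const DemoShmemCallbacks *cb, int creating)
{
	if (cb->request_fn)
		cb->request_fn(NULL);

	if (creating)
	{
		if (cb->init_fn)
			cb->init_fn(NULL);
	}
	else if (cb->attach_fn)
		cb->attach_fn(NULL);
}
```

With the branches separated, neither callback needs a "found" flag or AddinShmemInitLock, which matches the lines removed in the diff.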
[text/x-patch] v9-0012-Use-the-new-mechanism-in-a-few-core-subsystems.patch (39.9K, 13-v9-0012-Use-the-new-mechanism-in-a-few-core-subsystems.patch)
From e8d3509ad2c4dff5159ba1dc2b64a21820aec7e7 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 20:07:30 +0300
Subject: [PATCH v9 12/16] Use the new mechanism in a few core subsystems
I chose these subsystems specifically because they have some
complicating properties, making them slightly harder to convert than
most:
- The initialization callbacks of some of these subsystems have
dependencies, i.e. they need to be initialized in the right order.
- The ProcGlobal pointer still needs to be inherited by the
BackendParameters mechanism on EXEC_BACKEND builds, because
ProcGlobal is required by InitProcess() to get a PGPROC entry, and
the PGPROC entry is required to use LWLocks, and usually attaching
to shared memory areas requires the use of LWLocks.
- Similarly, ProcSignal pointer still needs to be handled by
BackendParameters, because query cancellation connections access it
without calling InitProcess().
I believe converting the rest of the subsystems after this will be
pretty mechanical.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/backend/access/transam/twophase.c | 2 +-
src/backend/access/transam/varsup.c | 36 ++---
src/backend/storage/ipc/dsm.c | 65 +++++----
src/backend/storage/ipc/dsm_registry.c | 39 +++---
src/backend/storage/ipc/ipci.c | 28 ----
src/backend/storage/ipc/pmsignal.c | 57 ++++----
src/backend/storage/ipc/procarray.c | 119 ++++++++--------
src/backend/storage/ipc/procsignal.c | 66 ++++-----
src/backend/storage/ipc/sinvaladt.c | 40 +++---
src/backend/storage/lmgr/proc.c | 183 +++++++++++++------------
src/include/access/transam.h | 2 -
src/include/storage/dsm.h | 3 -
src/include/storage/dsm_registry.h | 2 -
src/include/storage/pmsignal.h | 2 -
src/include/storage/proc.h | 2 -
src/include/storage/procarray.h | 2 -
src/include/storage/procsignal.h | 3 -
src/include/storage/sinvaladt.h | 2 -
src/include/storage/subsystemlist.h | 18 +++
19 files changed, 333 insertions(+), 338 deletions(-)
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index d468c9774b3..ab1cbd67bac 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -282,7 +282,7 @@ TwoPhaseShmemInit(void)
gxacts[i].next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = &gxacts[i];
- /* associate it with a PGPROC assigned by InitProcGlobal */
+ /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
}
}
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 1441a051773..2ea8d088c0e 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -23,6 +23,7 @@
#include "postmaster/autovacuum.h"
#include "storage/pmsignal.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
@@ -30,35 +31,28 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+static void VarsupShmemRequest(void *arg);
+
/* pointer to variables struct in shared memory */
TransamVariablesData *TransamVariables = NULL;
+const ShmemCallbacks VarsupShmemCallbacks = {
+ .request_fn = VarsupShmemRequest,
+};
/*
- * Initialization of shared memory for TransamVariables.
+ * Request shared memory for TransamVariables.
*/
-Size
-VarsupShmemSize(void)
-{
- return sizeof(TransamVariablesData);
-}
-
-void
-VarsupShmemInit(void)
+static void
+VarsupShmemRequest(void *arg)
{
- bool found;
+ static ShmemStructDesc TransamVariablesShmemDesc;
- /* Initialize our shared state struct */
- TransamVariables = ShmemInitStruct("TransamVariables",
- sizeof(TransamVariablesData),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- memset(TransamVariables, 0, sizeof(TransamVariablesData));
- }
- else
- Assert(found);
+ ShmemRequestStruct(&TransamVariablesShmemDesc,
+ .name = "TransamVariables",
+ .size = sizeof(TransamVariablesData),
+ .ptr = (void **) &TransamVariables,
+ );
}
/*
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 6a5b16392f7..923593d140d 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -43,6 +43,7 @@
#include "storage/lwlock.h"
#include "storage/pg_shmem.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/freepage.h"
#include "utils/memutils.h"
#include "utils/resowner.h"
@@ -110,6 +111,14 @@ static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
static void *dsm_main_space_begin = NULL;
+static void dsm_main_space_request(void *arg);
+static void dsm_main_space_init(void *arg);
+
+const ShmemCallbacks dsm_shmem_callbacks = {
+ .request_fn = dsm_main_space_request,
+ .init_fn = dsm_main_space_init,
+};
+
/*
* List of dynamic shared memory segments used by this backend.
*
@@ -463,43 +472,45 @@ dsm_set_control_handle(dsm_handle h)
}
#endif
+static ShmemStructDesc dsm_main_space_shmem_desc;
+
/*
- * Reserve some space in the main shared memory segment for DSM segments.
+ * Reserve space in the main shared memory segment for DSM segments.
*/
-size_t
-dsm_estimate_size(void)
+static void
+dsm_main_space_request(void *arg)
{
- return 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+ size_t size = 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+
+ if (size == 0)
+ return;
+
+ ShmemRequestStruct(&dsm_main_space_shmem_desc,
+ .name = "Preallocated DSM",
+ .size = size,
+ .ptr = &dsm_main_space_begin,
+ );
}
-/*
- * Initialize space in the main shared memory segment for DSM segments.
- */
-void
-dsm_shmem_init(void)
+static void
+dsm_main_space_init(void *arg)
{
- size_t size = dsm_estimate_size();
- bool found;
+ size_t size = dsm_main_space_shmem_desc.size;
+ FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
+ size_t first_page = 0;
+ size_t pages;
if (size == 0)
return;
- dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, &found);
- if (!found)
- {
- FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
- size_t first_page = 0;
- size_t pages;
-
- /* Reserve space for the FreePageManager. */
- while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
- ++first_page;
-
- /* Initialize it and give it all the rest of the space. */
- FreePageManagerInitialize(fpm, dsm_main_space_begin);
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ /* Reserve space for the FreePageManager. */
+ while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
+ ++first_page;
+
+ /* Initialize it and give it all the rest of the space. */
+ FreePageManagerInitialize(fpm, dsm_main_space_begin);
+ pages = (size / FPM_PAGE_SIZE) - first_page;
+ FreePageManagerPut(fpm, first_page, pages);
}
/*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 9bfcd616827..3e9c0ba2947 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -45,6 +45,7 @@
#include "storage/dsm_registry.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/tuplestore.h"
@@ -57,6 +58,14 @@ typedef struct DSMRegistryCtxStruct
static DSMRegistryCtxStruct *DSMRegistryCtx;
+static void DSMRegistryShmemRequest(void *arg);
+static void DSMRegistryShmemInit(void *arg);
+
+const ShmemCallbacks DSMRegistryShmemCallbacks = {
+ .request_fn = DSMRegistryShmemRequest,
+ .init_fn = DSMRegistryShmemInit,
+};
+
typedef struct NamedDSMState
{
dsm_handle handle;
@@ -114,27 +123,23 @@ static const dshash_parameters dsh_params = {
static dsa_area *dsm_registry_dsa;
static dshash_table *dsm_registry_table;
-Size
-DSMRegistryShmemSize(void)
+static void
+DSMRegistryShmemRequest(void *arg)
{
- return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+ static ShmemStructDesc DSMRegistryCtxShmemDesc;
+
+ ShmemRequestStruct(&DSMRegistryCtxShmemDesc,
+ .name = "DSM Registry Data",
+ .size = sizeof(DSMRegistryCtxStruct),
+ .ptr = (void **) &DSMRegistryCtx,
+ );
}
-void
-DSMRegistryShmemInit(void)
+static void
+DSMRegistryShmemInit(void *arg)
{
- bool found;
-
- DSMRegistryCtx = (DSMRegistryCtxStruct *)
- ShmemInitStruct("DSM Registry Data",
- DSMRegistryShmemSize(),
- &found);
-
- if (!found)
- {
- DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
- DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
- }
+ DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+ DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index b90717c1f58..0dc3a2146ec 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -20,7 +20,6 @@
#include "access/nbtree.h"
#include "access/subtrans.h"
#include "access/syncscan.h"
-#include "access/transam.h"
#include "access/twophase.h"
#include "access/xlogprefetcher.h"
#include "access/xlogrecovery.h"
@@ -41,18 +40,13 @@
#include "storage/aio_subsys.h"
#include "storage/bufmgr.h"
#include "storage/dsm.h"
-#include "storage/dsm_registry.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
#include "storage/pmsignal.h"
#include "storage/predicate.h"
#include "storage/proc.h"
-#include "storage/procarray.h"
-#include "storage/procsignal.h"
-#include "storage/sinvaladt.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
-#include "utils/wait_event.h"
/* GUCs */
int shared_memory_type = DEFAULT_SHARED_MEMORY_TYPE;
@@ -102,14 +96,10 @@ CalculateShmemSize(void)
size = add_size(size, ShmemGetRequestedSize());
/* legacy subsystems */
- size = add_size(size, dsm_estimate_size());
- size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
- size = add_size(size, ProcGlobalShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, VarsupShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
size = add_size(size, CLOGShmemSize());
@@ -119,11 +109,7 @@ CalculateShmemSize(void)
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
size = add_size(size, LWLockShmemSize());
- size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, SharedInvalShmemSize());
- size = add_size(size, PMSignalShmemSize());
- size = add_size(size, ProcSignalShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
size = add_size(size, ReplicationSlotsShmemSize());
@@ -285,13 +271,9 @@ RegisterBuiltinShmemCallbacks(void)
static void
CreateOrAttachShmemStructs(void)
{
- dsm_shmem_init();
- DSMRegistryShmemInit();
-
/*
* Set up xlog, clog, and buffers
*/
- VarsupShmemInit();
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
@@ -314,23 +296,13 @@ CreateOrAttachShmemStructs(void)
/*
* Set up process table
*/
- if (!IsUnderPostmaster)
- InitProcGlobal();
- ProcArrayShmemInit();
BackendStatusShmemInit();
TwoPhaseShmemInit();
BackgroundWorkerShmemInit();
- /*
- * Set up shared-inval messaging
- */
- SharedInvalShmemInit();
-
/*
* Set up interprocess signaling mechanisms
*/
- PMSignalShmemInit();
- ProcSignalShmemInit();
CheckpointerShmemInit();
AutoVacuumShmemInit();
ReplicationSlotsShmemInit();
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 4618820b337..00588664885 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -27,6 +27,7 @@
#include "storage/ipc.h"
#include "storage/pmsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
@@ -83,6 +84,14 @@ struct PMSignalData
/* PMSignalState pointer is valid in both postmaster and child processes */
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+static void PMSignalShmemRequest(void *arg);
+static void PMSignalShmemInit(void *arg);
+
+const ShmemCallbacks PMSignalShmemCallbacks = {
+ .request_fn = PMSignalShmemRequest,
+ .init_fn = PMSignalShmemInit,
+};
+
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
* postmaster. Postmaster keeps a local copy so that it doesn't need to
@@ -123,39 +132,31 @@ postmaster_death_handler(SIGNAL_ARGS)
static void MarkPostmasterChildInactive(int code, Datum arg);
/*
- * PMSignalShmemSize
- * Compute space needed for pmsignal.c's shared memory
+ * PMSignalShmemRequest - Register pmsignal.c's shared memory needs
*/
-Size
-PMSignalShmemSize(void)
+static void
+PMSignalShmemRequest(void *arg)
{
- Size size;
-
- size = offsetof(PMSignalData, PMChildFlags);
- size = add_size(size, mul_size(MaxLivePostmasterChildren(),
- sizeof(sig_atomic_t)));
-
- return size;
+ static ShmemStructDesc PMSignalShmemDesc;
+ size_t size;
+
+ num_child_flags = MaxLivePostmasterChildren();
+
+ size = add_size(offsetof(PMSignalData, PMChildFlags),
+ mul_size(num_child_flags, sizeof(sig_atomic_t)));
+ ShmemRequestStruct(&PMSignalShmemDesc,
+ .name = "PMSignalState",
+ .size = size,
+ .ptr = (void **) &PMSignalState,
+ );
}
-/*
- * PMSignalShmemInit - initialize during shared-memory creation
- */
-void
-PMSignalShmemInit(void)
+static void
+PMSignalShmemInit(void *arg)
{
- bool found;
-
- PMSignalState = (PMSignalData *)
- ShmemInitStruct("PMSignalState", PMSignalShmemSize(), &found);
-
- if (!found)
- {
- /* initialize all flags to zeroes */
- MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemSize());
- num_child_flags = MaxLivePostmasterChildren();
- PMSignalState->num_child_flags = num_child_flags;
- }
+ Assert(PMSignalState);
+ Assert(num_child_flags > 0);
+ PMSignalState->num_child_flags = num_child_flags;
}
/*
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index cc207cb56e3..c8e09f4f0b6 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -61,6 +61,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
@@ -103,6 +104,20 @@ typedef struct ProcArrayStruct
int pgprocnos[FLEXIBLE_ARRAY_MEMBER];
} ProcArrayStruct;
+static void ProcArrayShmemRequest(void *arg);
+static void ProcArrayShmemInit(void *arg);
+static void ProcArrayShmemAttach(void *arg);
+
+static ProcArrayStruct *procArray;
+
+const ShmemCallbacks ProcArrayShmemCallbacks = {
+ .request_fn = ProcArrayShmemRequest,
+ .init_fn = ProcArrayShmemInit,
+ .attach_fn = ProcArrayShmemAttach,
+};
+
+static ShmemStructDesc ProcArrayShmemDesc;
+
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -269,9 +284,6 @@ typedef enum KAXCompressReason
KAX_STARTUP_PROCESS_IDLE, /* startup process is about to sleep */
} KAXCompressReason;
-
-static ProcArrayStruct *procArray;
-
static PGPROC *allProcs;
/*
@@ -282,8 +294,15 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
+
static TransactionId *KnownAssignedXids;
+
+static ShmemStructDesc KnownAssignedXidsShmemDesc;
+
static bool *KnownAssignedXidsValid;
+
+static ShmemStructDesc KnownAssignedXidsValidShmemDesc;
+
static TransactionId latestObservedXid = InvalidTransactionId;
/*
@@ -374,19 +393,13 @@ static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
static void GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons);
/*
- * Report shared-memory space needed by ProcArrayShmemInit
+ * Register the shared PGPROC array during postmaster startup.
*/
-Size
-ProcArrayShmemSize(void)
+static void
+ProcArrayShmemRequest(void *arg)
{
- Size size;
-
- /* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, pgprocnos);
- size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
-
/*
* During Hot Standby processing we have a data structure called
* KnownAssignedXids, created in shared memory. Local data structures are
@@ -405,64 +418,52 @@ ProcArrayShmemSize(void)
if (EnableHotStandby)
{
- size = add_size(size,
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS));
- size = add_size(size,
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS));
+ ShmemRequestStruct(&KnownAssignedXidsShmemDesc,
+ .name = "KnownAssignedXids",
+ .size = mul_size(sizeof(TransactionId), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXids,
+ );
+
+ ShmemRequestStruct(&KnownAssignedXidsValidShmemDesc,
+ .name = "KnownAssignedXidsValid",
+ .size = mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXidsValid,
+ );
}
- return size;
+ /* Register the ProcArray shared structure */
+ ShmemRequestStruct(&ProcArrayShmemDesc,
+ .name = "Proc Array",
+ .size = add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int), PROCARRAY_MAXPROCS)),
+ .ptr = (void **) &procArray,
+ );
}
/*
* Initialize the shared PGPROC array during postmaster startup.
*/
-void
-ProcArrayShmemInit(void)
+static void
+ProcArrayShmemInit(void *arg)
{
- bool found;
-
- /* Create or attach to the ProcArray shared structure */
- procArray = (ProcArrayStruct *)
- ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, pgprocnos),
- mul_size(sizeof(int),
- PROCARRAY_MAXPROCS)),
- &found);
-
- if (!found)
- {
- /*
- * We're the first - initialize.
- */
- procArray->numProcs = 0;
- procArray->maxProcs = PROCARRAY_MAXPROCS;
- procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
- procArray->numKnownAssignedXids = 0;
- procArray->tailKnownAssignedXids = 0;
- procArray->headKnownAssignedXids = 0;
- procArray->lastOverflowedXid = InvalidTransactionId;
- procArray->replication_slot_xmin = InvalidTransactionId;
- procArray->replication_slot_catalog_xmin = InvalidTransactionId;
- TransamVariables->xactCompletionCount = 1;
- }
+ procArray->numProcs = 0;
+ procArray->maxProcs = PROCARRAY_MAXPROCS;
+ procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
+ procArray->numKnownAssignedXids = 0;
+ procArray->tailKnownAssignedXids = 0;
+ procArray->headKnownAssignedXids = 0;
+ procArray->lastOverflowedXid = InvalidTransactionId;
+ procArray->replication_slot_xmin = InvalidTransactionId;
+ procArray->replication_slot_catalog_xmin = InvalidTransactionId;
+ TransamVariables->xactCompletionCount = 1;
allProcs = ProcGlobal->allProcs;
+}
- /* Create or attach to the KnownAssignedXids arrays too, if needed */
- if (EnableHotStandby)
- {
- KnownAssignedXids = (TransactionId *)
- ShmemInitStruct("KnownAssignedXids",
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- KnownAssignedXidsValid = (bool *)
- ShmemInitStruct("KnownAssignedXidsValid",
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- }
+static void
+ProcArrayShmemAttach(void *arg)
+{
+ allProcs = ProcGlobal->allProcs;
}
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7e017c8d53b..7c79727e308 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -32,6 +32,7 @@
#include "storage/shmem.h"
#include "storage/sinval.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -105,7 +106,16 @@ struct ProcSignalHeader
#define BARRIER_CLEAR_BIT(flags, type) \
((flags) &= ~(((uint32) 1) << (uint32) (type)))
+static void ProcSignalShmemRequest(void *arg);
+static void ProcSignalShmemInit(void *arg);
+
+const ShmemCallbacks ProcSignalShmemCallbacks = {
+ .request_fn = ProcSignalShmemRequest,
+ .init_fn = ProcSignalShmemInit,
+};
+
NON_EXEC_STATIC ProcSignalHeader *ProcSignal = NULL;
+
static ProcSignalSlot *MyProcSignalSlot = NULL;
static bool CheckProcSignal(ProcSignalReason reason);
@@ -113,51 +123,41 @@ static void CleanupProcSignalState(int status, Datum arg);
static void ResetProcSignalBarrierBits(uint32 flags);
/*
- * ProcSignalShmemSize
- * Compute space needed for ProcSignal's shared memory
+ * ProcSignalShmemRequest
+ * Register ProcSignal's shared memory needs at postmaster startup
*/
-Size
-ProcSignalShmemSize(void)
+static void
+ProcSignalShmemRequest(void *arg)
{
+ static ShmemStructDesc ProcSignalShmemDesc;
Size size;
size = mul_size(NumProcSignalSlots, sizeof(ProcSignalSlot));
size = add_size(size, offsetof(ProcSignalHeader, psh_slot));
- return size;
+
+ ShmemRequestStruct(&ProcSignalShmemDesc,
+ .name = "ProcSignal",
+ .size = size,
+ .ptr = (void **) &ProcSignal,
+ );
}
-/*
- * ProcSignalShmemInit
- * Allocate and initialize ProcSignal's shared memory
- */
-void
-ProcSignalShmemInit(void)
+static void
+ProcSignalShmemInit(void *arg)
{
- Size size = ProcSignalShmemSize();
- bool found;
+ pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
- ProcSignal = (ProcSignalHeader *)
- ShmemInitStruct("ProcSignal", size, &found);
-
- /* If we're first, initialize. */
- if (!found)
+ for (int i = 0; i < NumProcSignalSlots; ++i)
{
- int i;
-
- pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
+ ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
- for (i = 0; i < NumProcSignalSlots; ++i)
- {
- ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
-
- SpinLockInit(&slot->pss_mutex);
- pg_atomic_init_u32(&slot->pss_pid, 0);
- slot->pss_cancel_key_len = 0;
- MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
- pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
- pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
- ConditionVariableInit(&slot->pss_barrierCV);
- }
+ SpinLockInit(&slot->pss_mutex);
+ pg_atomic_init_u32(&slot->pss_pid, 0);
+ slot->pss_cancel_key_len = 0;
+ MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
+ pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
+ pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
+ ConditionVariableInit(&slot->pss_barrierCV);
}
}
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index a7a7cc4f0a9..34860d474bc 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -25,6 +25,7 @@
#include "storage/shmem.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
/*
* Conceptually, the shared cache invalidation messages are stored in an
@@ -205,6 +206,14 @@ typedef struct SISeg
static SISeg *shmInvalBuffer; /* pointer to the shared inval buffer */
+static void SharedInvalShmemRequest(void *arg);
+static void SharedInvalShmemInit(void *arg);
+
+const ShmemCallbacks SharedInvalShmemCallbacks = {
+ .request_fn = SharedInvalShmemRequest,
+ .init_fn = SharedInvalShmemInit,
+};
+
static LocalTransactionId nextLocalTransactionId;
@@ -212,37 +221,32 @@ static void CleanupInvalidationState(int status, Datum arg);
/*
- * SharedInvalShmemSize --- return shared-memory space needed
+ * SharedInvalShmemRequest
+ * Register shared memory needs for the SI message buffer
*/
-Size
-SharedInvalShmemSize(void)
+static void
+SharedInvalShmemRequest(void *arg)
{
+ static ShmemStructDesc SharedInvalShmemDesc;
Size size;
size = offsetof(SISeg, procState);
size = add_size(size, mul_size(sizeof(ProcState), NumProcStateSlots)); /* procState */
size = add_size(size, mul_size(sizeof(int), NumProcStateSlots)); /* pgprocnos */
- return size;
+ ShmemRequestStruct(&SharedInvalShmemDesc,
+ .name = "shmInvalBuffer",
+ .size = size,
+ .ptr = (void **) &shmInvalBuffer,
+ );
}
-/*
- * SharedInvalShmemInit
- * Create and initialize the SI message buffer
- */
-void
-SharedInvalShmemInit(void)
+static void
+SharedInvalShmemInit(void *arg)
{
int i;
- bool found;
-
- /* Allocate space in shared memory */
- shmInvalBuffer = (SISeg *)
- ShmemInitStruct("shmInvalBuffer", SharedInvalShmemSize(), &found);
- if (found)
- return;
- /* Clear message counters, save size of procState array, init spinlock */
+ /* Clear message counters, init spinlock */
shmInvalBuffer->minMsgNum = 0;
shmInvalBuffer->maxMsgNum = 0;
shmInvalBuffer->nextThreshold = CLEANUP_MIN;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 9b880a6af65..b5532208364 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -52,6 +52,7 @@
#include "storage/procsignal.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
@@ -70,9 +71,25 @@ PGPROC *MyProc = NULL;
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
+static void *tmpAllProcs;
+static void *tmpFastPathLockArray;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
PGPROC *PreparedXactProcs = NULL;
+static void ProcGlobalShmemRequest(void *arg);
+static void ProcGlobalShmemInit(void *arg);
+
+const ShmemCallbacks ProcGlobalShmemCallbacks = {
+ .request_fn = ProcGlobalShmemRequest,
+ .init_fn = ProcGlobalShmemInit,
+};
+
+static ShmemStructDesc ProcGlobalShmemDesc;
+static ShmemStructDesc ProcGlobalAllProcsShmemDesc;
+static ShmemStructDesc FastPathLockArrayShmemDesc;
+
+static uint32 TotalProcs;
+
/* Is a deadlock check pending? */
static volatile sig_atomic_t got_deadlock_timeout;
@@ -82,24 +99,6 @@ static void AuxiliaryProcKill(int code, Datum arg);
static DeadLockState CheckDeadLock(void);
-/*
- * Report shared-memory space needed by PGPROC.
- */
-static Size
-PGProcShmemSize(void)
-{
- Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
-
- size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
-
- return size;
-}
-
/*
* Report shared-memory space needed by Fast-Path locks.
*/
@@ -107,8 +106,6 @@ static Size
FastPathLockShmemSize(void)
{
Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
Size fpLockBitsSize,
fpRelIdSize;
@@ -128,26 +125,7 @@ FastPathLockShmemSize(void)
}
/*
- * Report shared-memory space needed by InitProcGlobal.
- */
-Size
-ProcGlobalShmemSize(void)
-{
- Size size = 0;
-
- /* ProcGlobal */
- size = add_size(size, sizeof(PROC_HDR));
- size = add_size(size, sizeof(slock_t));
-
- size = add_size(size, PGSemaphoreShmemSize(ProcGlobalSemas()));
- size = add_size(size, PGProcShmemSize());
- size = add_size(size, FastPathLockShmemSize());
-
- return size;
-}
-
-/*
- * Report number of semaphores needed by InitProcGlobal.
+ * Report number of semaphores needed by ProcGlobalShmemInit.
*/
int
ProcGlobalSemas(void)
@@ -160,7 +138,63 @@ ProcGlobalSemas(void)
}
/*
- * InitProcGlobal -
+ * ProcGlobalShmemRequest
+ * Register shared memory needs.
+ *
+ * This is called during postmaster or standalone backend startup, and also
+ * during backend startup in EXEC_BACKEND mode.
+ */
+static void
+ProcGlobalShmemRequest(void *arg)
+{
+ Size size;
+
+ /*
+ * Reserve all the PGPROC structures we'll need. There are six separate
+ * consumers: (1) normal backends, (2) autovacuum workers and special
+ * workers, (3) background workers, (4) walsenders, (5) auxiliary
+ * processes, and (6) prepared transactions. (For largely-historical
+ * reasons, we combine autovacuum and special workers into one category
+ * with a single freelist.) Each PGPROC structure is dedicated to exactly
+ * one of these purposes, and they do not move between groups.
+ */
+ TotalProcs =
+ add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
+
+ size = 0;
+ size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
+ ShmemRequestStruct(&ProcGlobalAllProcsShmemDesc,
+ .name = "PGPROC structures",
+ .size = size,
+ .ptr = (void **) &tmpAllProcs,
+ );
+
+ ShmemRequestStruct(&FastPathLockArrayShmemDesc,
+ .name = "Fast-Path Lock Array",
+ .size = IsUnderPostmaster ? SHMEM_ATTACH_UNKNOWN_SIZE : FastPathLockShmemSize(),
+ .ptr = (void **) &tmpFastPathLockArray,
+ );
+
+ ShmemRequestStruct(&ProcGlobalShmemDesc,
+ .name = "Proc Header",
+ .size = sizeof(PROC_HDR),
+
+ /*
+ * ProcGlobal is registered here in .ptr as usual, but it needs to be
+ * propagated specially in EXEC_BACKEND mode, because ProcGlobal needs to
+ * be accessed early at backend startup, before ShmemAttachRequested() has
+ * been called.
+ */
+ .ptr = (void **) &ProcGlobal,
+ );
+}
+
+
+/*
+ * ProcGlobalShmemInit -
* Initialize the global process table during postmaster or standalone
* backend startup.
*
@@ -179,36 +213,23 @@ ProcGlobalSemas(void)
* Another reason for creating semaphores here is that the semaphore
* implementation typically requires us to create semaphores in the
* postmaster, not in backends.
- *
- * Note: this is NOT called by individual backends under a postmaster,
- * not even in the EXEC_BACKEND case. The ProcGlobal and AuxiliaryProcs
- * pointers must be propagated specially for EXEC_BACKEND operation.
*/
-void
-InitProcGlobal(void)
+static void
+ProcGlobalShmemInit(void *arg)
{
+ char *ptr;
+ size_t requestSize;
PGPROC *procs;
int i,
j;
- bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
/* Used for setup of per-backend fast-path slots. */
char *fpPtr,
*fpEndPtr PG_USED_FOR_ASSERTS_ONLY;
Size fpLockBitsSize,
fpRelIdSize;
- Size requestSize;
- char *ptr;
- /* Create the ProcGlobal shared structure */
- ProcGlobal = (PROC_HDR *)
- ShmemInitStruct("Proc Header", sizeof(PROC_HDR), &found);
- Assert(!found);
-
- /*
- * Initialize the data structures.
- */
+ Assert(ProcGlobal);
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
SpinLockInit(&ProcGlobal->freeProcsLock);
dlist_init(&ProcGlobal->freeProcs);
@@ -221,23 +242,12 @@ InitProcGlobal(void)
pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PROC_NUMBER);
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
- /*
- * Create and initialize all the PGPROC structures we'll need. There are
- * six separate consumers: (1) normal backends, (2) autovacuum workers and
- * special workers, (3) background workers, (4) walsenders, (5) auxiliary
- * processes, and (6) prepared transactions. (For largely-historical
- * reasons, we combine autovacuum and special workers into one category
- * with a single freelist.) Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
- */
- requestSize = PGProcShmemSize();
-
- ptr = ShmemInitStruct("PGPROC structures",
- requestSize,
- &found);
-
+ Assert(tmpAllProcs);
+ ptr = tmpAllProcs;
+ requestSize = ProcGlobalAllProcsShmemDesc.size;
MemSet(ptr, 0, requestSize);
+ /* Carve out the allProcs array from the shared memory area */
procs = (PGPROC *) ptr;
ptr = ptr + TotalProcs * sizeof(PGPROC);
@@ -246,7 +256,7 @@ InitProcGlobal(void)
ProcGlobal->allProcCount = MaxBackends + NUM_AUXILIARY_PROCS;
/*
- * Allocate arrays mirroring PGPROC fields in a dense manner. See
+ * Carve out arrays mirroring PGPROC fields in a dense manner. See
* PROC_HDR.
*
* XXX: It might make sense to increase padding for these arrays, given
@@ -261,24 +271,21 @@ InitProcGlobal(void)
ProcGlobal->statusFlags = (uint8 *) ptr;
ptr = ptr + (TotalProcs * sizeof(*ProcGlobal->statusFlags));
- /* make sure wer didn't overflow */
+ /* make sure we didn't overflow */
Assert((ptr > (char *) procs) && (ptr <= (char *) procs + requestSize));
/*
- * Allocate arrays for fast-path locks. Those are variable-length, so
+ * Initialize arrays for fast-path locks. Those are variable-length, so
* can't be included in PGPROC directly. We allocate a separate piece of
* shared memory and then divide that between backends.
*/
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- requestSize = FastPathLockShmemSize();
-
- fpPtr = ShmemInitStruct("Fast-Path Lock Array",
- requestSize,
- &found);
-
- MemSet(fpPtr, 0, requestSize);
+ Assert(tmpFastPathLockArray);
+ fpPtr = tmpFastPathLockArray;
+ requestSize = FastPathLockArrayShmemDesc.size;
+ memset(fpPtr, 0, requestSize);
/* For asserts checking we did not overflow. */
fpEndPtr = fpPtr + requestSize;
@@ -405,7 +412,7 @@ InitProcess(void)
/*
* Decide which list should supply our PGPROC. This logic must match the
- * way the freelists were constructed in InitProcGlobal().
+ * way the freelists were constructed in ProcGlobalShmemInit().
*/
if (AmAutoVacuumWorkerProcess() || AmSpecialWorkerProcess())
procgloballist = &ProcGlobal->autovacFreeProcs;
@@ -460,7 +467,7 @@ InitProcess(void)
/*
* Initialize all fields of MyProc, except for those previously
- * initialized by InitProcGlobal.
+ * initialized by ProcGlobalShmemInit.
*/
dlist_node_init(&MyProc->freeProcsLink);
MyProc->waitStatus = PROC_WAIT_STATUS_OK;
@@ -593,7 +600,7 @@ InitProcessPhase2(void)
* This is called by bgwriter and similar processes so that they will have a
* MyProc value that's real enough to let them wait for LWLocks. The PGPROC
* and sema that are assigned are one of the extra ones created during
- * InitProcGlobal.
+ * ProcGlobalShmemInit.
*
* Auxiliary processes are presently not expected to wait for real (lockmgr)
* locks, so we need not set up the deadlock checker. They are never added
@@ -662,7 +669,7 @@ InitAuxiliaryProcess(void)
/*
* Initialize all fields of MyProc, except for those previously
- * initialized by InitProcGlobal.
+ * initialized by ProcGlobalShmemInit.
*/
dlist_node_init(&MyProc->freeProcsLink);
MyProc->waitStatus = PROC_WAIT_STATUS_OK;
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6fa91bfcdc0..55a4ab26b34 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -345,8 +345,6 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
-extern Size VarsupShmemSize(void);
-extern void VarsupShmemInit(void);
extern FullTransactionId GetNewTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
diff --git a/src/include/storage/dsm.h b/src/include/storage/dsm.h
index 407657df3ff..1bde71b4406 100644
--- a/src/include/storage/dsm.h
+++ b/src/include/storage/dsm.h
@@ -26,9 +26,6 @@ extern void dsm_postmaster_startup(PGShmemHeader *);
extern void dsm_backend_shutdown(void);
extern void dsm_detach_all(void);
-extern size_t dsm_estimate_size(void);
-extern void dsm_shmem_init(void);
-
#ifdef EXEC_BACKEND
extern void dsm_set_control_handle(dsm_handle h);
#endif
diff --git a/src/include/storage/dsm_registry.h b/src/include/storage/dsm_registry.h
index 506fae2c9ca..a2269c89f01 100644
--- a/src/include/storage/dsm_registry.h
+++ b/src/include/storage/dsm_registry.h
@@ -22,7 +22,5 @@ extern dsa_area *GetNamedDSA(const char *name, bool *found);
extern dshash_table *GetNamedDSHash(const char *name,
const dshash_parameters *params,
bool *found);
-extern Size DSMRegistryShmemSize(void);
-extern void DSMRegistryShmemInit(void);
#endif /* DSM_REGISTRY_H */
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index 206fb78f8a5..001e6eea61c 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -66,8 +66,6 @@ extern PGDLLIMPORT volatile PMSignalData *PMSignalState;
/*
* prototypes for functions in pmsignal.c
*/
-extern Size PMSignalShmemSize(void);
-extern void PMSignalShmemInit(void);
extern void SendPostmasterSignal(PMSignalReason reason);
extern bool CheckPostmasterSignal(PMSignalReason reason);
extern void SetQuitSignalReason(QuitSignalReason reason);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 1dad125706e..60732ccb33a 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -551,8 +551,6 @@ extern PGDLLIMPORT PGPROC *AuxiliaryProcs;
* Function Prototypes
*/
extern int ProcGlobalSemas(void);
-extern Size ProcGlobalShmemSize(void);
-extern void InitProcGlobal(void);
extern void InitProcess(void);
extern void InitProcessPhase2(void);
extern void InitAuxiliaryProcess(void);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index abdf021e66e..d718a5b542f 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -19,8 +19,6 @@
#include "utils/snapshot.h"
-extern Size ProcArrayShmemSize(void);
-extern void ProcArrayShmemInit(void);
extern void ProcArrayAdd(PGPROC *proc);
extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 348fba53a93..031897015f4 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -63,9 +63,6 @@ typedef enum
/*
* prototypes for functions in procsignal.c
*/
-extern Size ProcSignalShmemSize(void);
-extern void ProcSignalShmemInit(void);
-
extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
ProcNumber procNumber);
diff --git a/src/include/storage/sinvaladt.h b/src/include/storage/sinvaladt.h
index 122dbcdf19f..208ea9d051e 100644
--- a/src/include/storage/sinvaladt.h
+++ b/src/include/storage/sinvaladt.h
@@ -27,8 +27,6 @@
/*
* prototypes for functions in sinvaladt.c
*/
-extern Size SharedInvalShmemSize(void);
-extern void SharedInvalShmemInit(void);
extern void SharedInvalBackendInit(bool sendOnly);
extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index 7dfbd03d6e5..5c11b2b3499 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -20,5 +20,23 @@
* of these matter.
*/
+PG_SHMEM_SUBSYSTEM(dsm_shmem_callbacks)
+PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
+
+/* xlog, clog, and buffers */
+PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+
+/* process table */
+PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+
+/* shared-inval messaging */
+PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
+
+/* interprocess signaling mechanisms */
+PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+
+/* other modules that need some shared memory space */
PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
--
2.47.3
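
Note on the list-header idiom used by subsystemlist.h above: the sketch below shows, in a self-contained form, how a list of PG_SHMEM_SUBSYSTEM(...) style entries can be expanded into a table of callback structs that is then walked in order. Every name in it (DemoCallbacks, DEMO_SUBSYSTEM_LIST, demo_run_all and the two demo subsystems) is hypothetical and not taken from the patch, and the list is passed as a macro parameter here only to keep the example in one file. It is an illustration of the general technique under those assumptions, not the patch's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a callbacks struct with two optional hooks. */
typedef struct DemoCallbacks
{
    void        (*request_fn) (void *arg);
    void        (*init_fn) (void *arg);
} DemoCallbacks;

/* Stand-in for the list header: one entry per subsystem, in call order. */
#define DEMO_SUBSYSTEM_LIST(X) \
    X(DemoProcCallbacks) \
    X(DemoSlruCallbacks)

static int  demo_requests = 0;
static int  demo_inits = 0;

static void demo_request(void *arg) { (void) arg; demo_requests++; }
static void demo_init(void *arg) { (void) arg; demo_inits++; }

/* A subsystem that provides both hooks. */
static const DemoCallbacks DemoProcCallbacks = {
    .request_fn = demo_request,
    .init_fn = demo_init,
};

/* A subsystem with no init hook; the unset member is NULL. */
static const DemoCallbacks DemoSlruCallbacks = {
    .request_fn = demo_request,
};

/* Expand the list into an array of pointers to the callback structs. */
#define DEMO_AS_POINTER(name) &name,
static const DemoCallbacks *const demo_subsystems[] = {
    DEMO_SUBSYSTEM_LIST(DEMO_AS_POINTER)
};
#undef DEMO_AS_POINTER

#define DEMO_NUM_SUBSYSTEMS \
    (sizeof(demo_subsystems) / sizeof(demo_subsystems[0]))

/* Run every request hook first, then every init hook, skipping unset ones. */
static void
demo_run_all(void)
{
    for (size_t i = 0; i < DEMO_NUM_SUBSYSTEMS; i++)
    {
        assert(demo_subsystems[i] != NULL);
        if (demo_subsystems[i]->request_fn)
            demo_subsystems[i]->request_fn(NULL);
    }
    for (size_t i = 0; i < DEMO_NUM_SUBSYSTEMS; i++)
    {
        if (demo_subsystems[i]->init_fn)
            demo_subsystems[i]->init_fn(NULL);
    }
}
```

Calling demo_run_all() once from a main() visits both entries in list order. Adding a subsystem in this sketch means adding one line to the list and defining its struct.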
[text/x-patch] v9-0013-Convert-SLRUs-to-use-the-new-interface.patch (85.0K, 14-v9-0013-Convert-SLRUs-to-use-the-new-interface.patch)
From c2c12933ef400efd729fb4b9d3f0c86d7c4c79d5 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 26 Mar 2026 20:39:46 +0200
Subject: [PATCH v9 13/16] Convert SLRUs to use the new interface
I replaced the old SimpleLruInit() function without a backwards
compatibility wrapper, because few extensions define their own SLRUs.
---
src/backend/access/transam/clog.c | 55 ++--
src/backend/access/transam/commit_ts.c | 92 +++---
src/backend/access/transam/multixact.c | 140 +++++----
src/backend/access/transam/slru.c | 364 ++++++++++++-----------
src/backend/access/transam/subtrans.c | 57 ++--
src/backend/commands/async.c | 117 ++++----
src/backend/storage/ipc/ipci.c | 16 -
src/backend/storage/ipc/shmem.c | 7 +
src/backend/storage/lmgr/predicate.c | 299 +++++++++----------
src/backend/utils/activity/pgstat_slru.c | 1 +
src/include/access/clog.h | 2 -
src/include/access/commit_ts.h | 2 -
src/include/access/multixact.h | 2 -
src/include/access/slru.h | 108 ++++---
src/include/access/subtrans.h | 2 -
src/include/commands/async.h | 3 -
src/include/storage/predicate.h | 5 -
src/include/storage/shmem.h | 1 +
src/include/storage/subsystemlist.h | 8 +
src/test/modules/test_slru/test_slru.c | 110 +++----
src/tools/pgindent/typedefs.list | 4 +-
21 files changed, 720 insertions(+), 675 deletions(-)
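
Note on the call syntax: the call sites in the diff below pass options to SimpleLruRequest() as `.field = value` pairs, while slru.c receives them in SimpleLruRequestWithOpts() as a `const SlruOpts *`. The slru.h hunk that connects the two is not part of this excerpt. One plausible way to get that syntax is a variadic macro wrapping a C99 compound literal, and the sketch below shows that idiom. All names (DemoOpts, DemoDesc, demo_request_with_opts, DEMO_REQUEST, demo_register) are hypothetical, the bank size of 16 is a placeholder, and the sketch is an assumption about the technique rather than a copy of the patch's macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct; fields left out by a caller default to zero. */
typedef struct DemoOpts
{
    const char *name;
    const char *dir;
    int         nslots;
    int         nlsns;
    bool        long_segment_names;
} DemoOpts;

/* Hypothetical descriptor that keeps its own copy of the options. */
typedef struct DemoDesc
{
    DemoOpts    options;
    int         nbanks;
} DemoDesc;

#define DEMO_BANK_SIZE 16       /* placeholder value for this sketch */

static void
demo_request_with_opts(DemoDesc *desc, const DemoOpts *options)
{
    assert(options->name != NULL);
    assert(options->nslots > 0);

    /*
     * The compound literal built by DEMO_REQUEST only lives until the end
     * of the caller's enclosing block, so the options are copied here.
     */
    memcpy(&desc->options, options, sizeof(DemoOpts));
    desc->nbanks = options->nslots / DEMO_BANK_SIZE;
}

/* Turn the trailing ".field = value" arguments into a temporary DemoOpts. */
#define DEMO_REQUEST(desc, ...) \
    demo_request_with_opts((desc), &(DemoOpts) { __VA_ARGS__ })

static DemoDesc demo_desc;

static void
demo_register(void)
{
    DEMO_REQUEST(&demo_desc,
                 .name = "demo",
                 .dir = "pg_demo",
                 .nslots = 64,
        );
}
```

After demo_register() runs, the unnamed fields (nlsns, long_segment_names) are zero, which is why the commit_ts and multixact call sites can leave out `.nlsns` where the old code passed 0 explicitly. Copying the options before returning matters for the same reason that SimpleLruRequestWithOpts() copies them into TopMemoryContext: the caller's temporary does not outlive the call site.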
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index c654e0929b3..87f7f5707de 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -43,6 +43,7 @@
#include "pg_trace.h"
#include "pgstat.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/wait_event.h"
@@ -106,13 +107,21 @@ TransactionIdToPage(TransactionId xid)
/*
* Link to shared-memory data structures for CLOG control
*/
-static SlruCtlData XactCtlData;
+static void CLOGShmemRequest(void *arg);
+static void CLOGShmemInit(void *arg);
+static bool CLOGPagePrecedes(int64 page1, int64 page2);
+static int clog_errdetail_for_io_error(const void *opaque_data);
-#define XactCtl (&XactCtlData)
+const ShmemCallbacks CLOGShmemCallbacks = {
+ .request_fn = CLOGShmemRequest,
+ .init_fn = CLOGShmemInit,
+};
+
+static SlruDesc XactSlruDesc;
+
+#define XactCtl (&XactSlruDesc)
-static bool CLOGPagePrecedes(int64 page1, int64 page2);
-static int clog_errdetail_for_io_error(const void *opaque_data);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXact,
Oid oldestXactDb);
static void TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
@@ -775,16 +784,10 @@ CLOGShmemBuffers(void)
}
/*
- * Initialization of shared memory for CLOG
+ * Register shared memory for CLOG
*/
-Size
-CLOGShmemSize(void)
-{
- return SimpleLruShmemSize(CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE);
-}
-
-void
-CLOGShmemInit(void)
+static void
+CLOGShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (transaction_buffers == 0)
@@ -806,12 +809,26 @@ CLOGShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(transaction_buffers != 0);
+ SimpleLruRequest(&XactSlruDesc,
+ .name = "transaction",
+ .Dir = "pg_xact",
+ .long_segment_names = false,
+
+ .nslots = CLOGShmemBuffers(),
+ .nlsns = CLOG_LSNS_PER_PAGE,
+
+ .sync_handler = SYNC_HANDLER_CLOG,
+ .PagePrecedes = CLOGPagePrecedes,
+ .errdetail_for_io_error = clog_errdetail_for_io_error,
- XactCtl->PagePrecedes = CLOGPagePrecedes;
- XactCtl->errdetail_for_io_error = clog_errdetail_for_io_error;
- SimpleLruInit(XactCtl, "transaction", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
- "pg_xact", LWTRANCHE_XACT_BUFFER,
- LWTRANCHE_XACT_SLRU, SYNC_HANDLER_CLOG, false);
+ .buffer_tranche_id = LWTRANCHE_XACT_BUFFER,
+ .bank_tranche_id = LWTRANCHE_XACT_SLRU,
+ );
+}
+
+static void
+CLOGShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(XactCtl, CLOG_XACTS_PER_PAGE);
}
@@ -827,7 +844,7 @@ check_transaction_buffers(int *newval, void **extra, GucSource source)
/*
* This func must be called ONCE on system install. It creates
* the initial CLOG segment. (The CLOG directory is assumed to
- * have been created by initdb, and CLOGShmemInit must have been
+ * have been created by initdb, and CLOGShmemInit must have been XXX
* called already.)
*/
void
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 36219dd13cc..236d8fb4baa 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -30,6 +30,7 @@
#include "funcapi.h"
#include "miscadmin.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/timestamp.h"
@@ -80,9 +81,19 @@ TransactionIdToCTsPage(TransactionId xid)
/*
* Link to shared-memory data structures for CommitTs control
*/
-static SlruCtlData CommitTsCtlData;
+static void CommitTsShmemRequest(void *arg);
+static void CommitTsShmemInit(void *arg);
+static bool CommitTsPagePrecedes(int64 page1, int64 page2);
+static int commit_ts_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks CommitTsShmemCallbacks = {
+ .request_fn = CommitTsShmemRequest,
+ .init_fn = CommitTsShmemInit,
+};
+
+static SlruDesc CommitTsSlruDesc;
-#define CommitTsCtl (&CommitTsCtlData)
+#define CommitTsCtl (&CommitTsSlruDesc)
/*
* We keep a cache of the last value set in shared memory.
@@ -104,6 +115,9 @@ typedef struct CommitTimestampShared
static CommitTimestampShared *commitTsShared;
+static void CommitTsShmemInit(void *arg);
+
+static ShmemStructDesc CommitTsShmemDesc;
/* GUC variable */
bool track_commit_timestamp;
@@ -114,8 +128,6 @@ static void SetXidCommitTsInPage(TransactionId xid, int nsubxids,
static void TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
ReplOriginId nodeid, int slotno);
static void error_commit_ts_disabled(void);
-static bool CommitTsPagePrecedes(int64 page1, int64 page2);
-static int commit_ts_errdetail_for_io_error(const void *opaque_data);
static void ActivateCommitTs(void);
static void DeactivateCommitTs(void);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXid);
@@ -512,24 +524,12 @@ CommitTsShmemBuffers(void)
}
/*
- * Shared memory sizing for CommitTs
+ * Register CommitTs shared memory needs at system startup (postmaster start
+ * or standalone backend)
*/
-Size
-CommitTsShmemSize(void)
-{
- return SimpleLruShmemSize(CommitTsShmemBuffers(), 0) +
- sizeof(CommitTimestampShared);
-}
-
-/*
- * Initialize CommitTs at system startup (postmaster start or standalone
- * backend)
- */
-void
-CommitTsShmemInit(void)
+static void
+CommitTsShmemRequest(void *arg)
{
- bool found;
-
/* If auto-tuning is requested, now is the time to do it */
if (commit_timestamp_buffers == 0)
{
@@ -550,31 +550,37 @@ CommitTsShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(commit_timestamp_buffers != 0);
+ SimpleLruRequest(&CommitTsSlruDesc,
+ .name = "commit_timestamp",
+ .Dir = "pg_commit_ts",
+ .long_segment_names = false,
+
+ .nslots = CommitTsShmemBuffers(),
+
+ .PagePrecedes = CommitTsPagePrecedes,
+ .errdetail_for_io_error = commit_ts_errdetail_for_io_error,
+
+ .sync_handler = SYNC_HANDLER_COMMIT_TS,
+ .buffer_tranche_id = LWTRANCHE_COMMITTS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_COMMITTS_SLRU,
+ );
+
+ ShmemRequestStruct(&CommitTsShmemDesc,
+ .name = "CommitTs shared",
+ .size = sizeof(CommitTimestampShared),
+ .ptr = (void **) &commitTsShared,
+ );
+}
- CommitTsCtl->PagePrecedes = CommitTsPagePrecedes;
- CommitTsCtl->errdetail_for_io_error = commit_ts_errdetail_for_io_error;
- SimpleLruInit(CommitTsCtl, "commit_timestamp", CommitTsShmemBuffers(), 0,
- "pg_commit_ts", LWTRANCHE_COMMITTS_BUFFER,
- LWTRANCHE_COMMITTS_SLRU,
- SYNC_HANDLER_COMMIT_TS,
- false);
- SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
-
- commitTsShared = ShmemInitStruct("CommitTs shared",
- sizeof(CommitTimestampShared),
- &found);
-
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+static void
+CommitTsShmemInit(void *arg)
+{
+ commitTsShared->xidLastCommit = InvalidTransactionId;
+ TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
+ commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
+ commitTsShared->commitTsActive = false;
- commitTsShared->xidLastCommit = InvalidTransactionId;
- TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
- commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
- commitTsShared->commitTsActive = false;
- }
- else
- Assert(found);
+ SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
}
/*
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 9f8d542c098..940ac5a78d6 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -83,6 +83,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
#include "utils/lsyscache.h"
@@ -113,11 +114,16 @@ PreviousMultiXactId(MultiXactId multi)
/*
* Links to shared-memory data structures for MultiXact control
*/
-static SlruCtlData MultiXactOffsetCtlData;
-static SlruCtlData MultiXactMemberCtlData;
+static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
+static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
+static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
+static int MultiXactMemberIoErrorDetail(const void *opaque_data);
+
+static SlruDesc MultiXactOffsetSlruDesc;
+static SlruDesc MultiXactMemberSlruDesc;
-#define MultiXactOffsetCtl (&MultiXactOffsetCtlData)
-#define MultiXactMemberCtl (&MultiXactMemberCtlData)
+#define MultiXactOffsetCtl (&MultiXactOffsetSlruDesc)
+#define MultiXactMemberCtl (&MultiXactMemberSlruDesc)
/*
* MultiXact state shared across all backends. All this state is protected
@@ -220,6 +226,15 @@ static MultiXactStateData *MultiXactState;
static MultiXactId *OldestMemberMXactId;
static MultiXactId *OldestVisibleMXactId;
+static void MultiXactShmemRequest(void *arg);
+static void MultiXactShmemInit(void *arg);
+static void MultiXactShmemAttach(void *arg);
+
+const ShmemCallbacks MultiXactShmemCallbacks = {
+ .request_fn = MultiXactShmemRequest,
+ .init_fn = MultiXactShmemInit,
+ .attach_fn = MultiXactShmemAttach,
+};
static inline MultiXactId *
MyOldestMemberMXactIdSlot(void)
@@ -321,10 +336,6 @@ typedef struct MultiXactMemberSlruReadContext
MultiXactOffset offset;
} MultiXactMemberSlruReadContext;
-static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
-static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
-static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
-static int MultiXactMemberIoErrorDetail(const void *opaque_data);
static void ExtendMultiXactOffset(MultiXactId multi);
static void ExtendMultiXactMember(MultiXactOffset offset, int nmembers);
static void SetOldestOffset(void);
@@ -1747,80 +1758,83 @@ multixact_twophase_postabort(FullTransactionId fxid, uint16 info,
multixact_twophase_postcommit(fxid, info, recdata, len);
}
+
/*
- * Initialization of shared memory for MultiXact.
- *
- * MultiXactSharedStateShmemSize() calculates the size of the MultiXactState
- * struct, and the two per-backend MultiXactId arrays. They are carved out of
- * the same allocation. MultiXactShmemSize() additionally includes the memory
- * needed for the two SLRU areas.
+ * Register shared memory needs for MultiXact.
*/
-static Size
-MultiXactSharedStateShmemSize(void)
+static void
+MultiXactShmemRequest(void *arg)
{
+ static ShmemStructDesc MultiXactShmemDesc;
Size size;
+ /*
+ * Calculate the size of the MultiXactState struct, and the two
+ * per-backend MultiXactId arrays. They are carved out of the same
+ * allocation.
+ */
size = offsetof(MultiXactStateData, perBackendXactIds);
size = add_size(size,
mul_size(sizeof(MultiXactId), NumMemberSlots));
size = add_size(size,
mul_size(sizeof(MultiXactId), NumVisibleSlots));
- return size;
-}
+ ShmemRequestStruct(&MultiXactShmemDesc,
+ .name = "Shared MultiXact State",
+ .size = size,
+ .ptr = (void **) &MultiXactState,
+ );
-Size
-MultiXactShmemSize(void)
-{
- Size size;
+ SimpleLruRequest(&MultiXactOffsetSlruDesc,
+ .name = "multixact_offset",
+ .Dir = "pg_multixact/offsets",
+ .long_segment_names = false,
- size = MultiXactSharedStateShmemSize();
- size = add_size(size, SimpleLruShmemSize(multixact_offset_buffers, 0));
- size = add_size(size, SimpleLruShmemSize(multixact_member_buffers, 0));
+ .nslots = multixact_offset_buffers,
- return size;
-}
+ .sync_handler = SYNC_HANDLER_MULTIXACT_OFFSET,
+ .PagePrecedes = MultiXactOffsetPagePrecedes,
+ .errdetail_for_io_error = MultiXactOffsetIoErrorDetail,
-void
-MultiXactShmemInit(void)
-{
- bool found;
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTOFFSET_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTOFFSET_SLRU,
+ );
- debug_elog2(DEBUG2, "Shared Memory Init for MultiXact");
+ SimpleLruRequest(&MultiXactMemberSlruDesc,
+ .name = "multixact_member",
+ .Dir = "pg_multixact/members",
+ .long_segment_names = true,
- MultiXactOffsetCtl->PagePrecedes = MultiXactOffsetPagePrecedes;
- MultiXactMemberCtl->PagePrecedes = MultiXactMemberPagePrecedes;
- MultiXactOffsetCtl->errdetail_for_io_error = MultiXactOffsetIoErrorDetail;
- MultiXactMemberCtl->errdetail_for_io_error = MultiXactMemberIoErrorDetail;
+ .nslots = multixact_member_buffers,
- SimpleLruInit(MultiXactOffsetCtl,
- "multixact_offset", multixact_offset_buffers, 0,
- "pg_multixact/offsets", LWTRANCHE_MULTIXACTOFFSET_BUFFER,
- LWTRANCHE_MULTIXACTOFFSET_SLRU,
- SYNC_HANDLER_MULTIXACT_OFFSET,
- false);
- SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
- SimpleLruInit(MultiXactMemberCtl,
- "multixact_member", multixact_member_buffers, 0,
- "pg_multixact/members", LWTRANCHE_MULTIXACTMEMBER_BUFFER,
- LWTRANCHE_MULTIXACTMEMBER_SLRU,
- SYNC_HANDLER_MULTIXACT_MEMBER,
- true);
- /* doesn't call SimpleLruTruncate() or meet criteria for unit tests */
-
- /* Initialize our shared state struct */
- MultiXactState = ShmemInitStruct("Shared MultiXact State",
- MultiXactSharedStateShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+ .sync_handler = SYNC_HANDLER_MULTIXACT_MEMBER,
+ .PagePrecedes = MultiXactMemberPagePrecedes,
+ .errdetail_for_io_error = MultiXactMemberIoErrorDetail,
- /* Make sure we zero out the per-backend state */
- MemSet(MultiXactState, 0, MultiXactSharedStateShmemSize());
- }
- else
- Assert(found);
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTMEMBER_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTMEMBER_SLRU,
+ );
+ /*
+ * members SLRU doesn't call SimpleLruTruncate() or meet criteria for unit
+ * tests
+ */
+}
+
+static void
+MultiXactShmemInit(void *arg)
+{
+ /*
+ * Set up array pointers.
+ */
+ OldestMemberMXactId = MultiXactState->perBackendXactIds;
+ OldestVisibleMXactId = OldestMemberMXactId + NumMemberSlots;
+
+ SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
+}
+
+static void
+MultiXactShmemAttach(void *arg)
+{
/*
* Set up array pointers.
*/
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index a2bb8fa8033..3fe60c5804b 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -71,6 +71,7 @@
#include "storage/fd.h"
#include "storage/shmem.h"
#include "utils/guc.h"
+#include "utils/memutils.h"
#include "utils/wait_event.h"
/*
@@ -89,9 +90,9 @@
* dir/123456 for [2^20, 2^24-1]
*/
static inline int
-SlruFileName(SlruCtl ctl, char *path, int64 segno)
+SlruFileName(SlruDesc *ctl, char *path, int64 segno)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
{
/*
* We could use 16 characters here but the disadvantage would be that
@@ -101,7 +102,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* that in the future we can't decrease SLRU_PAGES_PER_SEGMENT easily.
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFFFFFFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->Dir, segno);
+ return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->options.Dir, segno);
}
else
{
@@ -110,7 +111,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* integers are allowed. See SlruCorrectSegmentFilenameLength()
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
+ return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->options.Dir,
(unsigned int) segno);
}
}
@@ -176,19 +177,19 @@ static SlruErrorCause slru_errcause;
static int slru_errno;
-static void SimpleLruZeroLSNs(SlruCtl ctl, int slotno);
-static void SimpleLruWaitIO(SlruCtl ctl, int slotno);
-static void SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata);
-static bool SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno);
-static bool SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno,
+static void SimpleLruZeroLSNs(SlruDesc *ctl, int slotno);
+static void SimpleLruWaitIO(SlruDesc *ctl, int slotno);
+static void SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata);
+static bool SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno);
+static bool SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno,
SlruWriteAll fdata);
-static void SlruReportIOError(SlruCtl ctl, int64 pageno,
+static void SlruReportIOError(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-static int SlruSelectLRUPage(SlruCtl ctl, int64 pageno);
+static int SlruSelectLRUPage(SlruDesc *ctl, int64 pageno);
-static bool SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename,
+static bool SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-static void SlruInternalDeleteSegment(SlruCtl ctl, int64 segno);
+static void SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno);
static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
@@ -196,7 +197,7 @@ static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
* Initialization of shared memory
*/
-Size
+static Size
SimpleLruShmemSize(int nslots, int nlsns)
{
int nbanks = nslots / SLRU_BANK_SIZE;
@@ -238,120 +239,134 @@ SimpleLruAutotuneBuffers(int divisor, int max)
}
/*
- * Initialize, or attach to, a simple LRU cache in shared memory.
- *
- * ctl: address of local (unshared) control structure.
- * name: name of SLRU. (This is user-visible, pick with care!)
- * nslots: number of page slots to use.
- * nlsns: number of LSN groups per page (set to zero if not relevant).
- * subdir: PGDATA-relative subdirectory that will contain the files.
- * buffer_tranche_id: tranche ID to use for the SLRU's per-buffer LWLocks.
- * bank_tranche_id: tranche ID to use for the bank LWLocks.
- * sync_handler: which set of functions to use to handle sync requests
- * long_segment_names: use short or long segment names
+ * Register a simple LRU cache in shared memory.
*/
void
-SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id, int bank_tranche_id,
- SyncRequestHandler sync_handler, bool long_segment_names)
+SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts *options)
{
+ SlruOpts *options_copy;
+
+ Assert(options->name != NULL);
+ Assert(options->nslots > 0);
+ Assert(options->PagePrecedes != NULL);
+ Assert(options->errdetail_for_io_error != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(SlruOpts));
+ memcpy(options_copy, options, sizeof(SlruOpts));
+
+ options_copy->base.name = options->name;
+ options_copy->base.size = SimpleLruShmemSize(options_copy->nslots, options_copy->nlsns);
+
+ ShmemRequestInternal(&desc->base, &options_copy->base, SHMEM_KIND_SLRU);
+}
+
+/* Initialize locks and shared memory area */
+void
+shmem_slru_init(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
+{
+ SlruOpts *options = (SlruOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) base_desc;
+ char namebuf[NAMEDATALEN];
SlruShared shared;
- bool found;
+ int nslots = options->nslots;
int nbanks = nslots / SLRU_BANK_SIZE;
+ int nlsns = options->nlsns;
+ char *ptr;
+ Size offset;
+
+ shared = desc->shared = (SlruShared) desc->base.ptr;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruOpts));
+
+ /* assign new tranche IDs, if not given */
+ if (desc->options.buffer_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s buffer", desc->options.name);
+ desc->options.buffer_tranche_id = LWLockNewTrancheId(namebuf);
+ }
+ if (desc->options.bank_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s bank", desc->options.name);
+ desc->options.bank_tranche_id = LWLockNewTrancheId(namebuf);
+ }
Assert(nslots <= SLRU_MAX_ALLOWED_BUFFERS);
- Assert(ctl->PagePrecedes != NULL);
- Assert(ctl->errdetail_for_io_error != NULL);
+ memset(shared, 0, sizeof(SlruSharedData));
- shared = (SlruShared) ShmemInitStruct(name,
- SimpleLruShmemSize(nslots, nlsns),
- &found);
+ shared->num_slots = nslots;
+ shared->lsn_groups_per_page = nlsns;
- if (!IsUnderPostmaster)
- {
- /* Initialize locks and shared memory area */
- char *ptr;
- Size offset;
-
- Assert(!found);
-
- memset(shared, 0, sizeof(SlruSharedData));
-
- shared->num_slots = nslots;
- shared->lsn_groups_per_page = nlsns;
-
- pg_atomic_init_u64(&shared->latest_page_number, 0);
-
- shared->slru_stats_idx = pgstat_get_slru_index(name);
-
- ptr = (char *) shared;
- offset = MAXALIGN(sizeof(SlruSharedData));
- shared->page_buffer = (char **) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(char *));
- shared->page_status = (SlruPageStatus *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
- shared->page_dirty = (bool *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(bool));
- shared->page_number = (int64 *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int64));
- shared->page_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int));
-
- /* Initialize LWLocks */
- shared->buffer_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(LWLockPadded));
- shared->bank_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
- shared->bank_cur_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(int));
-
- if (nlsns > 0)
- {
- shared->group_lsn = (XLogRecPtr *) (ptr + offset);
- offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
- }
+ pg_atomic_init_u64(&shared->latest_page_number, 0);
- ptr += BUFFERALIGN(offset);
- for (int slotno = 0; slotno < nslots; slotno++)
- {
- LWLockInitialize(&shared->buffer_locks[slotno].lock,
- buffer_tranche_id);
+ shared->slru_stats_idx = pgstat_get_slru_index(desc->options.name);
- shared->page_buffer[slotno] = ptr;
- shared->page_status[slotno] = SLRU_PAGE_EMPTY;
- shared->page_dirty[slotno] = false;
- shared->page_lru_count[slotno] = 0;
- ptr += BLCKSZ;
- }
+ ptr = (char *) shared;
+ offset = MAXALIGN(sizeof(SlruSharedData));
+ shared->page_buffer = (char **) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(char *));
+ shared->page_status = (SlruPageStatus *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
+ shared->page_dirty = (bool *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(bool));
+ shared->page_number = (int64 *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int64));
+ shared->page_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int));
- /* Initialize the slot banks. */
- for (int bankno = 0; bankno < nbanks; bankno++)
- {
- LWLockInitialize(&shared->bank_locks[bankno].lock, bank_tranche_id);
- shared->bank_cur_lru_count[bankno] = 0;
- }
+ /* Initialize LWLocks */
+ shared->buffer_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(LWLockPadded));
+ shared->bank_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
+ shared->bank_cur_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(int));
- /* Should fit to estimated shmem size */
- Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+ if (nlsns > 0)
+ {
+ shared->group_lsn = (XLogRecPtr *) (ptr + offset);
+ offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
}
- else
+
+ ptr += BUFFERALIGN(offset);
+ for (int slotno = 0; slotno < nslots; slotno++)
{
- Assert(found);
- Assert(shared->num_slots == nslots);
+ LWLockInitialize(&shared->buffer_locks[slotno].lock,
+ desc->options.buffer_tranche_id);
+
+ shared->page_buffer[slotno] = ptr;
+ shared->page_status[slotno] = SLRU_PAGE_EMPTY;
+ shared->page_dirty[slotno] = false;
+ shared->page_lru_count[slotno] = 0;
+ ptr += BLCKSZ;
}
- /*
- * Initialize the unshared control struct, including directory path. We
- * assume caller set PagePrecedes.
- */
- ctl->shared = shared;
- ctl->sync_handler = sync_handler;
- ctl->long_segment_names = long_segment_names;
- ctl->nbanks = nbanks;
- strlcpy(ctl->Dir, subdir, sizeof(ctl->Dir));
+ /* Initialize the slot banks. */
+ for (int bankno = 0; bankno < nbanks; bankno++)
+ {
+ LWLockInitialize(&shared->bank_locks[bankno].lock, desc->options.bank_tranche_id);
+ shared->bank_cur_lru_count[bankno] = 0;
+ }
+
+ /* Should fit to estimated shmem size */
+ Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+}
+
+void
+shmem_slru_attach(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
+{
+ SlruOpts *options = (SlruOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) base_desc;
+ int nslots = options->nslots;
+ int nbanks = nslots / SLRU_BANK_SIZE;
+
+ desc->shared = (SlruShared) desc->base.ptr;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruOpts));
}
+
/*
* Helper function for GUC check_hook to check whether slru buffers are in
* multiples of SLRU_BANK_SIZE.
@@ -377,7 +392,7 @@ check_slru_buffers(const char *name, int *newval)
* Bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
int slotno;
@@ -430,7 +445,7 @@ SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
* This assumes that InvalidXLogRecPtr is bitwise-all-0.
*/
static void
-SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
+SimpleLruZeroLSNs(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
@@ -446,7 +461,7 @@ SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
* SLRU bank lock is acquired and released here.
*/
void
-SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno)
{
int slotno;
LWLock *lock;
@@ -472,7 +487,7 @@ SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SimpleLruWaitIO(SlruCtl ctl, int slotno)
+SimpleLruWaitIO(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
int bankno = SlotGetBankNumber(slotno);
@@ -530,7 +545,7 @@ SimpleLruWaitIO(SlruCtl ctl, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data)
{
SlruShared shared = ctl->shared;
@@ -634,7 +649,7 @@ SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
* It is unspecified whether the lock will be shared or exclusive.
*/
int
-SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
SlruShared shared = ctl->shared;
LWLock *banklock = SimpleLruGetBankLock(ctl, pageno);
@@ -681,7 +696,7 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
+SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 pageno = shared->page_number[slotno];
@@ -761,7 +776,7 @@ SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
* fdata is always passed a NULL here.
*/
void
-SimpleLruWritePage(SlruCtl ctl, int slotno)
+SimpleLruWritePage(SlruDesc *ctl, int slotno)
{
Assert(ctl->shared->page_status[slotno] != SLRU_PAGE_EMPTY);
@@ -775,7 +790,7 @@ SimpleLruWritePage(SlruCtl ctl, int slotno)
* large enough to contain the given page.
*/
bool
-SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
+SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -833,7 +848,7 @@ SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
* read/write operations. We could cache one virtual file pointer ...
*/
static bool
-SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
+SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -905,7 +920,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
* SimpleLruWriteAll.
*/
static bool
-SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
+SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -1037,11 +1052,11 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
pgstat_report_wait_end();
/* Queue up a sync request for the checkpointer. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
if (!RegisterSyncRequest(&tag, SYNC_REQUEST, false))
{
/* No space to enqueue sync request. Do it synchronously. */
@@ -1077,7 +1092,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
* SlruPhysicalWritePage. Call this after cleaning up shared-memory state.
*/
static void
-SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SlruReportIOError(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -1092,14 +1107,14 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open file \"%s\": %m", path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_SEEK_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not seek in file \"%s\" to offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_READ_FAILED:
if (errno)
@@ -1107,12 +1122,12 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("could not read from file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("could not read from file \"%s\" at offset %d: read too few bytes",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_WRITE_FAILED:
if (errno)
@@ -1120,26 +1135,26 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("Could not write to file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("Could not write to file \"%s\" at offset %d: wrote too few bytes.",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_FSYNC_FAILED:
ereport(data_sync_elevel(ERROR),
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_CLOSE_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not close file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
default:
/* can't get here, we trust */
@@ -1199,7 +1214,7 @@ SlruRecentlyUsed(SlruShared shared, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
static int
-SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
+SlruSelectLRUPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
@@ -1291,8 +1306,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_valid_delta ||
(this_delta == best_valid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_valid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_valid_page_number)))
{
bestvalidslot = slotno;
best_valid_delta = this_delta;
@@ -1303,8 +1318,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_invalid_delta ||
(this_delta == best_invalid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_invalid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_invalid_page_number)))
{
bestinvalidslot = slotno;
best_invalid_delta = this_delta;
@@ -1352,7 +1367,7 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
* entries are on disk.
*/
void
-SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
+SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied)
{
SlruShared shared = ctl->shared;
SlruWriteAllData fdata;
@@ -1422,8 +1437,8 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
SlruReportIOError(ctl, pageno, NULL);
/* Ensure that directory entries for new files are on disk. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
- fsync_fname(ctl->Dir, true);
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
+ fsync_fname(ctl->options.Dir, true);
}
/*
@@ -1438,7 +1453,7 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
* after it has accrued freshly-written data.
*/
void
-SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage)
+SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage)
{
SlruShared shared = ctl->shared;
int prevbank;
@@ -1460,12 +1475,12 @@ restart:
* bugs elsewhere in SLRU handling, so we don't care if we read a slightly
* outdated value; therefore we don't add a memory barrier.
*/
- if (ctl->PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
- cutoffPage))
+ if (ctl->options.PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
+ cutoffPage))
{
ereport(LOG,
(errmsg("could not truncate directory \"%s\": apparent wraparound",
- ctl->Dir)));
+ ctl->options.Dir)));
return;
}
@@ -1488,7 +1503,7 @@ restart:
if (shared->page_status[slotno] == SLRU_PAGE_EMPTY)
continue;
- if (!ctl->PagePrecedes(shared->page_number[slotno], cutoffPage))
+ if (!ctl->options.PagePrecedes(shared->page_number[slotno], cutoffPage))
continue;
/*
@@ -1533,16 +1548,16 @@ restart:
* they either can't yet contain anything, or have already been cleaned out.
*/
static void
-SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
+SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno)
{
char path[MAXPGPATH];
/* Forget any fsync requests queued for this segment. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true);
}
@@ -1556,7 +1571,7 @@ SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
* Delete an individual SLRU segment, identified by the segment number.
*/
void
-SlruDeleteSegment(SlruCtl ctl, int64 segno)
+SlruDeleteSegment(SlruDesc *ctl, int64 segno)
{
SlruShared shared = ctl->shared;
int prevbank = SlotGetBankNumber(0);
@@ -1633,19 +1648,19 @@ restart:
* first>=cutoff && last>=cutoff: no; every page of this segment is too young
*/
static bool
-SlruMayDeleteSegment(SlruCtl ctl, int64 segpage, int64 cutoffPage)
+SlruMayDeleteSegment(SlruDesc *ctl, int64 segpage, int64 cutoffPage)
{
int64 seg_last_page = segpage + SLRU_PAGES_PER_SEGMENT - 1;
Assert(segpage % SLRU_PAGES_PER_SEGMENT == 0);
- return (ctl->PagePrecedes(segpage, cutoffPage) &&
- ctl->PagePrecedes(seg_last_page, cutoffPage));
+ return (ctl->options.PagePrecedes(segpage, cutoffPage) &&
+ ctl->options.PagePrecedes(seg_last_page, cutoffPage));
}
#ifdef USE_ASSERT_CHECKING
static void
-SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
+SlruPagePrecedesTestOffset(SlruDesc *ctl, int per_page, uint32 offset)
{
TransactionId lhs,
rhs;
@@ -1654,6 +1669,9 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
TransactionId newestXact,
oldestXact;
+ /* This must be called after the SLRU has been initialized */
+ Assert(ctl->options.PagePrecedes);
+
/*
* Compare an XID pair having undefined order (see RFC 1982), a pair at
* "opposite ends" of the XID space. TransactionIdPrecedes() treats each
@@ -1670,19 +1688,19 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
Assert(!TransactionIdPrecedes(rhs, lhs + 1));
Assert(!TransactionIdFollowsOrEquals(lhs, rhs));
Assert(!TransactionIdFollowsOrEquals(rhs, lhs));
- Assert(!ctl->PagePrecedes(lhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes(lhs / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
|| (1U << 31) % per_page != 0); /* See CommitTsPagePrecedes() */
- Assert(ctl->PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
+ Assert(ctl->options.PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
|| (1U << 31) % per_page != 0);
- Assert(ctl->PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
/*
* GetNewTransactionId() has assigned the last XID it can safely use, and
@@ -1727,7 +1745,7 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
* do not apply to them.)
*/
void
-SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
+SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page)
{
/* Test first, middle and last entries of a page. */
SlruPagePrecedesTestOffset(ctl, per_page, 0);
@@ -1742,7 +1760,7 @@ SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
* one containing the page passed as "data".
*/
bool
-SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1758,7 +1776,7 @@ SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes segments prior to the one passed in as "data".
*/
static bool
-SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1774,7 +1792,7 @@ SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes all segments.
*/
bool
-SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
+SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
SlruInternalDeleteSegment(ctl, segpage / SLRU_PAGES_PER_SEGMENT);
@@ -1788,9 +1806,9 @@ SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
* SLRU segment.
*/
static inline bool
-SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
+SlruCorrectSegmentFilenameLength(SlruDesc *ctl, size_t len)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
return (len == 15); /* see SlruFileName() */
else
@@ -1821,7 +1839,7 @@ SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
* Note that no locking is applied.
*/
bool
-SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
+SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data)
{
bool retval = false;
DIR *cldir;
@@ -1829,8 +1847,8 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
int64 segno;
int64 segpage;
- cldir = AllocateDir(ctl->Dir);
- while ((clde = ReadDir(cldir, ctl->Dir)) != NULL)
+ cldir = AllocateDir(ctl->options.Dir);
+ while ((clde = ReadDir(cldir, ctl->options.Dir)) != NULL)
{
size_t len;
@@ -1843,7 +1861,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
segpage = segno * SLRU_PAGES_PER_SEGMENT;
elog(DEBUG2, "SlruScanDirectory invoking callback on %s/%s",
- ctl->Dir, clde->d_name);
+ ctl->options.Dir, clde->d_name);
retval = callback(ctl, clde->d_name, segpage, data);
if (retval)
break;
@@ -1861,7 +1879,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
* performs the fsync.
*/
int
-SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path)
+SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path)
{
int fd;
int save_errno;
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index c6ce71fc703..ca273fb4680 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -33,6 +33,7 @@
#include "access/transam.h"
#include "miscadmin.h"
#include "pg_trace.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/snapmgr.h"
@@ -66,16 +67,22 @@ TransactionIdToPage(TransactionId xid)
#define TransactionIdToEntry(xid) ((xid) % (TransactionId) SUBTRANS_XACTS_PER_PAGE)
+static void SUBTRANSShmemRequest(void *arg);
+static void SUBTRANSShmemInit(void *arg);
+static bool SubTransPagePrecedes(int64 page1, int64 page2);
+static int subtrans_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks SUBTRANSShmemCallbacks = {
+ .request_fn = SUBTRANSShmemRequest,
+ .init_fn = SUBTRANSShmemInit,
+};
+
/*
* Link to shared-memory data structures for SUBTRANS control
*/
-static SlruCtlData SubTransCtlData;
-
-#define SubTransCtl (&SubTransCtlData)
+static SlruDesc SubTransSlruDesc;
-
-static bool SubTransPagePrecedes(int64 page1, int64 page2);
-static int subtrans_errdetail_for_io_error(const void *opaque_data);
+#define SubTransCtl (&SubTransSlruDesc)
/*
@@ -207,17 +214,13 @@ SUBTRANSShmemBuffers(void)
return Min(Max(16, subtransaction_buffers), SLRU_MAX_ALLOWED_BUFFERS);
}
+
+
/*
- * Initialization of shared memory for SUBTRANS
+ * Register shared memory for SUBTRANS
*/
-Size
-SUBTRANSShmemSize(void)
-{
- return SimpleLruShmemSize(SUBTRANSShmemBuffers(), 0);
-}
-
-void
-SUBTRANSShmemInit(void)
+static void
+SUBTRANSShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (subtransaction_buffers == 0)
@@ -240,11 +243,25 @@ SUBTRANSShmemInit(void)
}
Assert(subtransaction_buffers != 0);
- SubTransCtl->PagePrecedes = SubTransPagePrecedes;
- SubTransCtl->errdetail_for_io_error = subtrans_errdetail_for_io_error;
- SimpleLruInit(SubTransCtl, "subtransaction", SUBTRANSShmemBuffers(), 0,
- "pg_subtrans", LWTRANCHE_SUBTRANS_BUFFER,
- LWTRANCHE_SUBTRANS_SLRU, SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(&SubTransSlruDesc,
+ .name = "subtransaction",
+ .Dir = "pg_subtrans",
+ .long_segment_names = false,
+
+ .nslots = SUBTRANSShmemBuffers(),
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SubTransPagePrecedes,
+ .errdetail_for_io_error = subtrans_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SUBTRANS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SUBTRANS_SLRU,
+ );
+}
+
+static void
+SUBTRANSShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(SubTransCtl, SUBTRANS_XACTS_PER_PAGE);
}
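
A note on the call style used in SUBTRANSShmemRequest() above: SimpleLruRequest(&desc, .name = ..., .Dir = ..., ) takes named fields and a trailing comma, which suggests a variadic macro that wraps its arguments in a compound literal with designated initializers. The macro definition is not part of this hunk, so the following is only a minimal standalone sketch of that pattern. All names in it (DemoSlruOpts, DemoSlruDesc, DemoSlruRequest, demo_slru_request_with_opts, demo_register) are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical options struct; field names only loosely mirror the patch. */
typedef struct DemoSlruOpts
{
	const char *name;
	const char *dir;
	bool		long_segment_names;
	int			nslots;
} DemoSlruOpts;

/* Hypothetical descriptor that keeps its own copy of the options. */
typedef struct DemoSlruDesc
{
	DemoSlruOpts options;
	bool		requested;
} DemoSlruDesc;

/* Ordinary function doing the work; the macro below only builds 'opts'. */
static void
demo_slru_request_with_opts(DemoSlruDesc *desc, const DemoSlruOpts *opts)
{
	assert(opts->name != NULL);
	desc->options = *opts;		/* copy, since the literal is short-lived */
	desc->requested = true;
}

/*
 * The variadic arguments become the initializer list of a compound literal.
 * Fields that are not named are zero-initialized, and a trailing comma in
 * the call is accepted because it is legal in an initializer list.
 */
#define DemoSlruRequest(desc, ...) \
	demo_slru_request_with_opts((desc), &(DemoSlruOpts){__VA_ARGS__})

static DemoSlruDesc demo_desc;

static void
demo_register(void)
{
	DemoSlruRequest(&demo_desc,
					.name = "demo",
					.dir = "pg_demo",
					.nslots = 16,
		);
}
```

With this shape, callers can omit fields whose zero value is the right default (long_segment_names is false here without being mentioned), and adding a new option does not require touching every call site. Whether the real SimpleLruRequest() copies the options or keeps a pointer is not visible in this hunk.
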
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index 5c9a56c3d40..9be98afc1d9 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -179,6 +179,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/dsa.h"
@@ -345,6 +346,15 @@ typedef struct AsyncQueueControl
static AsyncQueueControl *asyncQueueControl;
+static void AsyncShmemRequest(void *arg);
+static void AsyncShmemInit(void *arg);
+
+const ShmemCallbacks AsyncShmemCallbacks = {
+ .request_fn = AsyncShmemRequest,
+ .init_fn = AsyncShmemInit,
+};
+
+
#define QUEUE_HEAD (asyncQueueControl->head)
#define QUEUE_TAIL (asyncQueueControl->tail)
#define QUEUE_STOP_PAGE (asyncQueueControl->stopPage)
@@ -359,9 +369,13 @@ static AsyncQueueControl *asyncQueueControl;
/*
* The SLRU buffer area through which we access the notification queue
*/
-static SlruCtlData NotifyCtlData;
+static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
+static int asyncQueueErrdetailForIoError(const void *opaque_data);
+
+static SlruDesc NotifySlruDesc;
-#define NotifyCtl (&NotifyCtlData)
+
+#define NotifyCtl (&NotifySlruDesc)
#define QUEUE_PAGESIZE BLCKSZ
#define QUEUE_FULL_WARN_INTERVAL 5000 /* warn at most once every 5s */
@@ -570,9 +584,7 @@ bool Trace_notify = false;
int max_notify_queue_pages = 1048576;
/* local function prototypes */
-static int asyncQueueErrdetailForIoError(const void *opaque_data);
static inline int64 asyncQueuePageDiff(int64 p, int64 q);
-static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
static inline void GlobalChannelKeyInit(GlobalChannelKey *key, Oid dboid,
const char *channel);
static dshash_hash globalChannelTableHash(const void *key, size_t size,
@@ -780,78 +792,65 @@ initPendingListenActions(void)
}
/*
- * Report space needed for our shared memory area
+ * Register our shared memory needs
*/
-Size
-AsyncShmemSize(void)
+static void
+AsyncShmemRequest(void *arg)
{
+ static ShmemStructDesc AsyncQueueControlShmemDesc;
Size size;
- /* This had better match AsyncShmemInit */
size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
size = add_size(size, offsetof(AsyncQueueControl, backend));
- size = add_size(size, SimpleLruShmemSize(notify_buffers, 0));
+ ShmemRequestStruct(&AsyncQueueControlShmemDesc,
+ .name = "Async Queue Control",
+ .size = size,
+ .ptr = (void **) &asyncQueueControl,
+ );
- return size;
-}
+ SimpleLruRequest(&NotifySlruDesc,
+ .name = "notify",
+ .Dir = "pg_notify",
-/*
- * Initialize our shared memory area
- */
-void
-AsyncShmemInit(void)
-{
- bool found;
- Size size;
+ /* long segment names are used in order to avoid wraparound */
+ .long_segment_names = true,
- /*
- * Create or attach to the AsyncQueueControl structure.
- */
- size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
- size = add_size(size, offsetof(AsyncQueueControl, backend));
+ .nslots = notify_buffers,
- asyncQueueControl = (AsyncQueueControl *)
- ShmemInitStruct("Async Queue Control", size, &found);
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = asyncQueuePagePrecedes,
+ .errdetail_for_io_error = asyncQueueErrdetailForIoError,
- if (!found)
+ .buffer_tranche_id = LWTRANCHE_NOTIFY_BUFFER,
+ .bank_tranche_id = LWTRANCHE_NOTIFY_SLRU,
+ );
+}
+
+static void
+AsyncShmemInit(void *arg)
+{
+ SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
+ SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
+ QUEUE_STOP_PAGE = 0;
+ QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
+ asyncQueueControl->lastQueueFillWarn = 0;
+ asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
+ asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
+ for (int i = 0; i < MaxBackends; i++)
{
- /* First time through, so initialize it */
- SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
- SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
- QUEUE_STOP_PAGE = 0;
- QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
- asyncQueueControl->lastQueueFillWarn = 0;
- asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
- asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
- for (int i = 0; i < MaxBackends; i++)
- {
- QUEUE_BACKEND_PID(i) = InvalidPid;
- QUEUE_BACKEND_DBOID(i) = InvalidOid;
- QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
- SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
- QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
- QUEUE_BACKEND_IS_ADVANCING(i) = false;
- }
+ QUEUE_BACKEND_PID(i) = InvalidPid;
+ QUEUE_BACKEND_DBOID(i) = InvalidOid;
+ QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
+ SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
+ QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
+ QUEUE_BACKEND_IS_ADVANCING(i) = false;
}
/*
- * Set up SLRU management of the pg_notify data. Note that long segment
- * names are used in order to avoid wraparound.
+ * During start or reboot, clean out the pg_notify directory.
*/
- NotifyCtl->PagePrecedes = asyncQueuePagePrecedes;
- NotifyCtl->errdetail_for_io_error = asyncQueueErrdetailForIoError;
- SimpleLruInit(NotifyCtl, "notify", notify_buffers, 0,
- "pg_notify", LWTRANCHE_NOTIFY_BUFFER, LWTRANCHE_NOTIFY_SLRU,
- SYNC_HANDLER_NONE, true);
-
- if (!found)
- {
- /*
- * During start or reboot, clean out the pg_notify directory.
- */
- (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
- }
+ (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 0dc3a2146ec..1925d3deff9 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -98,16 +98,11 @@ CalculateShmemSize(void)
/* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
- size = add_size(size, PredicateLockShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, CLOGShmemSize());
- size = add_size(size, CommitTsShmemSize());
- size = add_size(size, SUBTRANSShmemSize());
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, MultiXactShmemSize());
size = add_size(size, LWLockShmemSize());
size = add_size(size, BackendStatusShmemSize());
size = add_size(size, CheckpointerShmemSize());
@@ -121,7 +116,6 @@ CalculateShmemSize(void)
size = add_size(size, ApplyLauncherShmemSize());
size = add_size(size, BTreeShmemSize());
size = add_size(size, SyncScanShmemSize());
- size = add_size(size, AsyncShmemSize());
size = add_size(size, StatsShmemSize());
size = add_size(size, SlotSyncShmemSize());
size = add_size(size, AioShmemSize());
@@ -277,10 +271,6 @@ CreateOrAttachShmemStructs(void)
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
- CLOGShmemInit();
- CommitTsShmemInit();
- SUBTRANSShmemInit();
- MultiXactShmemInit();
BufferManagerShmemInit();
/*
@@ -288,11 +278,6 @@ CreateOrAttachShmemStructs(void)
*/
LockManagerShmemInit();
- /*
- * Set up predicate lock manager
- */
- PredicateLockShmemInit();
-
/*
* Set up process table
*/
@@ -319,7 +304,6 @@ CreateOrAttachShmemStructs(void)
*/
BTreeShmemInit();
SyncScanShmemInit();
- AsyncShmemInit();
StatsShmemInit();
AioShmemInit();
WaitLSNShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 12d06299dc8..73297e82265 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -134,6 +134,7 @@
#include <unistd.h>
+#include "access/slru.h"
#include "common/int.h"
#include "fmgr.h"
#include "funcapi.h"
@@ -556,6 +557,9 @@ AttachOrInitShmemIndexEntry(ShmemRequest *request,
case SHMEM_KIND_HASH:
shmem_hash_attach(desc, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_attach(desc, request->options);
+ break;
}
}
else if (!may_init)
@@ -615,6 +619,9 @@ AttachOrInitShmemIndexEntry(ShmemRequest *request,
case SHMEM_KIND_HASH:
shmem_hash_init(desc, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_init(desc, request->options);
+ break;
}
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 4f80fc73639..718e6b07f5c 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -152,10 +152,6 @@
/*
* INTERFACE ROUTINES
*
- * housekeeping for setting up shared memory predicate lock structures
- * PredicateLockShmemInit(void)
- * PredicateLockShmemSize(void)
- *
* predicate lock reporting
* GetPredicateLockStatusData(void)
* PageIsPredicateLocked(Relation relation, BlockNumber blkno)
@@ -211,6 +207,8 @@
#include "storage/predicate_internals.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -322,9 +320,12 @@
/*
* The SLRU buffer area through which we access the old xids.
*/
-static SlruCtlData SerialSlruCtlData;
+static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
+static int serial_errdetail_for_io_error(const void *opaque_data);
-#define SerialSlruCtl (&SerialSlruCtlData)
+static SlruDesc SerialSlruDesc;
+
+#define SerialSlruCtl (&SerialSlruDesc)
#define SERIAL_PAGESIZE BLCKSZ
#define SERIAL_ENTRYSIZE sizeof(SerCommitSeqNo)
@@ -384,6 +385,17 @@ int max_predicate_locks_per_page; /* in guc_tables.c */
*/
static PredXactList PredXact;
+static void PredicateLockShmemRequest(void *arg);
+static void PredicateLockShmemInit(void *arg);
+static void PredicateLockShmemAttach(void *arg);
+
+const ShmemCallbacks PredicateLockShmemCallbacks = {
+ .request_fn = PredicateLockShmemRequest,
+ .init_fn = PredicateLockShmemInit,
+ .attach_fn = PredicateLockShmemAttach,
+};
+
+
/*
* This provides a pool of RWConflict data elements to use in conflict lists
* between transactions.
@@ -431,6 +443,16 @@ static bool MyXactDidWrite = false;
*/
static SERIALIZABLEXACT *SavedSerializableXact = InvalidSerializableXact;
+static ShmemStructDesc PredXactListShmemDesc;
+
+static int64 max_serializable_xacts;
+
+static ShmemStructDesc RWConflictPoolShmemDesc;
+
+static ShmemStructDesc FinishedSerializableShmemDesc;
+
+static ShmemStructDesc SerialControlShmemDesc;
+
/* local functions */
static SERIALIZABLEXACT *CreatePredXact(void);
@@ -442,13 +464,18 @@ static void SetPossibleUnsafeConflict(SERIALIZABLEXACT *roXact, SERIALIZABLEXACT
static void ReleaseRWConflict(RWConflict conflict);
static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
-static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
-static int serial_errdetail_for_io_error(const void *opaque_data);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
static uint32 predicatelock_hash(const void *key, Size keysize);
+
+static ShmemHashDesc SerializableXidHashDesc;
+
+static ShmemHashDesc PredicateLockTargetHashDesc;
+
+static ShmemHashDesc PredicateLockHashDesc;
+
static void SummarizeOldestCommittedSxact(void);
static Snapshot GetSafeSnapshot(Snapshot origSnapshot);
static Snapshot GetSerializableTransactionSnapshotInt(Snapshot snapshot,
@@ -1100,73 +1127,61 @@ CheckPointPredicate(void)
/*------------------------------------------------------------------------*/
/*
- * PredicateLockShmemInit -- Initialize the predicate locking data structures.
- *
- * This is called from CreateSharedMemoryAndSemaphores(), which see for
- * more comments. In the normal postmaster case, the shared hash tables
- * are created here. Backends inherit the pointers
- * to the shared tables via fork(). In the EXEC_BACKEND case, each
- * backend re-executes this code to obtain pointers to the already existing
- * shared hash tables.
+ * PredicateLockShmemRequest -- Register the predicate locking data structures.
*/
-void
-PredicateLockShmemInit(void)
+static void
+PredicateLockShmemRequest(void *arg)
{
- HASHCTL info;
int64 max_predicate_lock_targets;
int64 max_predicate_locks;
- int64 max_serializable_xacts;
int64 max_rw_conflicts;
- Size requestSize;
- bool found;
-
-#ifndef EXEC_BACKEND
- Assert(!IsUnderPostmaster);
-#endif
/*
- * Compute size of predicate lock target hashtable. Note these
- * calculations must agree with PredicateLockShmemSize!
+ * Register the hash tables and other structs. The shmem machinery
+ * allocates or attaches them later from the descriptors requested here;
+ * setup that can't be expressed in a descriptor is left to
+ * PredicateLockShmemInit() and PredicateLockShmemAttach().
*/
max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
/*
- * Allocate hash table for PREDICATELOCKTARGET structs. This stores
+ * Register hash table for PREDICATELOCKTARGET structs. This stores
* per-predicate-lock-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTARGETTAG);
- info.entrysize = sizeof(PREDICATELOCKTARGET);
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
+ ShmemRequestHash(&PredicateLockTargetHashDesc,
+ .name = "PREDICATELOCKTARGET hash",
- PredicateLockTargetHash = ShmemInitHash("PREDICATELOCKTARGET hash",
- max_predicate_lock_targets,
- max_predicate_lock_targets,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ .init_size = max_predicate_lock_targets,
+ .max_size = max_predicate_lock_targets,
- /* Pre-calculate the hash and partition lock of the scratch entry */
- ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
- ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
+ .ptr = &PredicateLockTargetHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTARGETTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCKTARGET),
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Allocate hash table for PREDICATELOCK structs. This stores per
* xact-lock-of-a-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTAG);
- info.entrysize = sizeof(PREDICATELOCK);
- info.hash = predicatelock_hash;
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
/* Assume an average of 2 xacts per target */
max_predicate_locks = max_predicate_lock_targets * 2;
- PredicateLockHash = ShmemInitHash("PREDICATELOCK hash",
- max_predicate_locks,
- max_predicate_locks,
- &info,
- HASH_ELEM | HASH_FUNCTION |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ ShmemRequestHash(&PredicateLockHashDesc,
+ .name = "PREDICATELOCK hash",
+
+ .init_size = max_predicate_locks,
+ .max_size = max_predicate_locks,
+
+ .ptr = &PredicateLockHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCK),
+ .hash_info.hash = predicatelock_hash,
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Compute size for serializable transaction hashtable. Note these
@@ -1179,30 +1194,32 @@ PredicateLockShmemInit(void)
max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
/*
- * Allocate a list to hold information on transactions participating in
+ * Register a list to hold information on transactions participating in
* predicate locking.
*/
- requestSize = add_size(PredXactListDataSize,
- (mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT))));
- PredXact = ShmemInitStruct("PredXactList",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(&PredXactListShmemDesc,
+ .name = "PredXactList",
+ .size = add_size(PredXactListDataSize,
+ (mul_size((Size) max_serializable_xacts,
+ sizeof(SERIALIZABLEXACT)))),
+ .ptr = (void **) &PredXact,
+ );
/*
- * Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
+ * Register hash table for SERIALIZABLEXID structs. This stores per-xid
* information for serializable transactions which have accessed data.
*/
- info.keysize = sizeof(SERIALIZABLEXIDTAG);
- info.entrysize = sizeof(SERIALIZABLEXID);
+ ShmemRequestHash(&SerializableXidHashDesc,
+ .name = "SERIALIZABLEXID hash",
+
+ .init_size = max_serializable_xacts,
+ .max_size = max_serializable_xacts,
- SerializableXidHash = ShmemInitHash("SERIALIZABLEXID hash",
- max_serializable_xacts,
- max_serializable_xacts,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_FIXED_SIZE);
+ .ptr = &SerializableXidHash,
+ .hash_info.keysize = sizeof(SERIALIZABLEXIDTAG),
+ .hash_info.entrysize = sizeof(SERIALIZABLEXID),
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_FIXED_SIZE,
+ );
/*
* Allocate space for tracking rw-conflicts in lists attached to the
@@ -1217,58 +1234,53 @@ PredicateLockShmemInit(void)
*/
max_rw_conflicts = max_serializable_xacts * 5;
- requestSize = RWConflictPoolHeaderDataSize +
- mul_size((Size) max_rw_conflicts,
- RWConflictDataSize);
+ ShmemRequestStruct(&RWConflictPoolShmemDesc,
+ .name = "RWConflictPool",
+ .size = RWConflictPoolHeaderDataSize + mul_size((Size) max_rw_conflicts,
+ RWConflictDataSize),
+ .ptr = (void **) &RWConflictPool,
+ );
- RWConflictPool = ShmemInitStruct("RWConflictPool",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
-
- /*
- * Create or attach to the header for the list of finished serializable
- * transactions.
- */
- FinishedSerializableTransactions = (dlist_head *)
- ShmemInitStruct("FinishedSerializableTransactions",
- sizeof(dlist_head),
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(&FinishedSerializableShmemDesc,
+ .name = "FinishedSerializableTransactions",
+ .size = sizeof(dlist_head),
+ .ptr = (void **) &FinishedSerializableTransactions,
+ );
/*
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(&SerialSlruDesc,
+ .name = "serializable",
+ .Dir = "pg_serial",
+ .long_segment_names = false,
+
+ .nslots = serializable_buffers,
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SerialPagePrecedesLogically,
+ .errdetail_for_io_error = serial_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SERIAL_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SERIAL_SLRU,
+ );
#ifdef USE_ASSERT_CHECKING
SerialPagePrecedesLogicallyUnitTests();
#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(&SerialControlShmemDesc,
+ .name = "SerialControlData",
+ .size = sizeof(SerialControlData),
+ .ptr = (void **) &serialControl,
+ );
+}
- /*
- * If we just attached to existing shared memory (EXEC_BACKEND), we're all
- * done. Otherwise, during postmaster startup proceed to initialize the
- * shared memory.
- */
- if (IsUnderPostmaster)
- {
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
- return;
- }
+static void
+PredicateLockShmemInit(void *arg)
+{
+ int max_rw_conflicts;
+ bool found;
/*
* Reserve a dummy entry in the hash table; we use it to make sure there's
@@ -1280,7 +1292,6 @@ PredicateLockShmemInit(void)
HASH_ENTER, &found);
Assert(!found);
- /* Initialize PredXact list */
dlist_init(&PredXact->availableList);
dlist_init(&PredXact->activeList);
PredXact->SxactGlobalXmin = InvalidTransactionId;
@@ -1322,6 +1333,9 @@ PredicateLockShmemInit(void)
dlist_init(&RWConflictPool->availableList);
RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
RWConflictPoolHeaderDataSize);
+
+ max_rw_conflicts = max_serializable_xacts * 5;
+
/* Add all elements to available list, clean. */
for (int i = 0; i < max_rw_conflicts; i++)
{
@@ -1338,63 +1352,28 @@ PredicateLockShmemInit(void)
serialControl->headXid = InvalidTransactionId;
serialControl->tailXid = InvalidTransactionId;
LWLockRelease(SerialControlLock);
-}
-
-/*
- * Estimate shared-memory space used for predicate lock table
- */
-Size
-PredicateLockShmemSize(void)
-{
- Size size = 0;
- int64 max_predicate_lock_targets;
- int64 max_predicate_locks;
- int64 max_serializable_xacts;
- int64 max_rw_conflicts;
-
- /* predicate lock target hash table */
- max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
- size = add_size(size, hash_estimate_size(max_predicate_lock_targets,
- sizeof(PREDICATELOCKTARGET)));
-
- /* predicate lock hash table */
- max_predicate_locks = max_predicate_lock_targets * 2;
- size = add_size(size, hash_estimate_size(max_predicate_locks,
- sizeof(PREDICATELOCK)));
- /*
- * Since NPREDICATELOCKTARGETENTS is only an estimate, add 10% safety
- * margin.
- */
- size = add_size(size, size / 10);
-
- /* transaction list */
- max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
- size = add_size(size, PredXactListDataSize);
- size = add_size(size, mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT)));
-
- /* transaction xid table */
- size = add_size(size, hash_estimate_size(max_serializable_xacts,
- sizeof(SERIALIZABLEXID)));
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- /* rw-conflict pool */
- max_rw_conflicts = max_serializable_xacts * 5;
- size = add_size(size, RWConflictPoolHeaderDataSize);
- size = add_size(size, mul_size((Size) max_rw_conflicts,
- RWConflictDataSize));
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
- /* Head for list of finished serializable transactions. */
- size = add_size(size, sizeof(dlist_head));
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
+}
- /* Shared memory structures for SLRU tracking of old committed xids. */
- size = add_size(size, sizeof(SerialControlData));
- size = add_size(size, SimpleLruShmemSize(serializable_buffers, 0));
+static void
+PredicateLockShmemAttach(void *arg)
+{
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- return size;
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
}
-
/*
* Compute the hash code associated with a PREDICATELOCKTAG.
*
diff --git a/src/backend/utils/activity/pgstat_slru.c b/src/backend/utils/activity/pgstat_slru.c
index 2190f388eae..f4dfe8697d7 100644
--- a/src/backend/utils/activity/pgstat_slru.c
+++ b/src/backend/utils/activity/pgstat_slru.c
@@ -119,6 +119,7 @@ pgstat_get_slru_index(const char *name)
{
int i;
+ Assert(name);
for (i = 0; i < SLRU_NUM_ELEMENTS; i++)
{
if (strcmp(slru_names[i], name) == 0)
diff --git a/src/include/access/clog.h b/src/include/access/clog.h
index a1cfed5f43c..7894998c763 100644
--- a/src/include/access/clog.h
+++ b/src/include/access/clog.h
@@ -40,8 +40,6 @@ extern void TransactionIdSetTreeStatus(TransactionId xid, int nsubxids,
TransactionId *subxids, XidStatus status, XLogRecPtr lsn);
extern XidStatus TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn);
-extern Size CLOGShmemSize(void);
-extern void CLOGShmemInit(void);
extern void BootStrapCLOG(void);
extern void StartupCLOG(void);
extern void TrimCLOG(void);
diff --git a/src/include/access/commit_ts.h b/src/include/access/commit_ts.h
index 49ee21cd5d2..825ccda90ed 100644
--- a/src/include/access/commit_ts.h
+++ b/src/include/access/commit_ts.h
@@ -27,8 +27,6 @@ extern bool TransactionIdGetCommitTsData(TransactionId xid,
extern TransactionId GetLatestCommitTsData(TimestampTz *ts,
ReplOriginId *nodeid);
-extern Size CommitTsShmemSize(void);
-extern void CommitTsShmemInit(void);
extern void BootStrapCommitTs(void);
extern void StartupCommitTs(void);
extern void CommitTsParameterChange(bool newvalue, bool oldvalue);
diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h
index 2ae8b571dcc..6be5299ab68 100644
--- a/src/include/access/multixact.h
+++ b/src/include/access/multixact.h
@@ -121,8 +121,6 @@ extern void AtEOXact_MultiXact(void);
extern void AtPrepare_MultiXact(void);
extern void PostPrepare_MultiXact(FullTransactionId fxid);
-extern Size MultiXactShmemSize(void);
-extern void MultiXactShmemInit(void);
extern void BootStrapMultiXact(void);
extern void StartupMultiXact(void);
extern void TrimMultiXact(void);
diff --git a/src/include/access/slru.h b/src/include/access/slru.h
index f966d0d9fe7..820c7986854 100644
--- a/src/include/access/slru.h
+++ b/src/include/access/slru.h
@@ -16,6 +16,7 @@
#include "access/transam.h"
#include "access/xlogdefs.h"
#include "storage/lwlock.h"
+#include "storage/shmem.h"
#include "storage/sync.h"
/*
@@ -106,23 +107,20 @@ typedef struct SlruSharedData
typedef SlruSharedData *SlruShared;
-/*
- * SlruCtlData is an unshared structure that points to the active information
- * in shared memory.
- */
-typedef struct SlruCtlData
+typedef struct SlruOpts
{
- SlruShared shared;
-
- /* Number of banks in this SLRU. */
- uint16 nbanks;
+ ShmemStructOpts base;
/*
- * If true, use long segment file names. Otherwise, use short file names.
- *
- * For details about the file name format, see SlruFileName().
+ * name of SLRU. (This is user-visible, pick with care!)
*/
- bool long_segment_names;
+ const char *name;
+
+ /* number of page slots to use. */
+ int nslots;
+
+ /* number of LSN groups per page (set to zero if not relevant). */
+ int nlsns;
/*
* Which sync handler function to use when handing sync requests over to
@@ -130,6 +128,19 @@ typedef struct SlruCtlData
*/
SyncRequestHandler sync_handler;
+ /*
+ * PGDATA-relative subdirectory that will contain the files.
+ */
+ const char *Dir;
+
+ /*
+ * If true, use long segment file names. Otherwise, use short file names.
+ *
+ * For details about the file name format, see SlruFileName().
+ */
+ bool long_segment_names;
+
+
/*
* Decide whether a page is "older" for truncation and as a hint for
* evicting pages in LRU order. Return true if every entry of the first
@@ -153,13 +164,28 @@ typedef struct SlruCtlData
int (*errdetail_for_io_error) (const void *opaque_data);
/*
- * Dir is set during SimpleLruInit and does not change thereafter. Since
- * it's always the same, it doesn't need to be in shared memory.
+ * Tranche IDs to use for the SLRU's per-buffer and per-bank LWLocks. If
+ * these are left as zeros, new tranches will be assigned dynamically.
*/
- char Dir[64];
-} SlruCtlData;
+ int buffer_tranche_id;
+ int bank_tranche_id;
+} SlruOpts;
+
+/*
+ * SlruDesc is an unshared structure that points to the active information
+ * in shared memory.
+ */
+typedef struct SlruDesc
+{
+ ShmemStructDesc base;
+
+ SlruOpts options;
-typedef SlruCtlData *SlruCtl;
+ SlruShared shared;
+
+ /* Number of banks in this SLRU. */
+ uint16 nbanks;
+} SlruDesc;
/*
* Get the SLRU bank lock for given SlruCtl and the pageno.
@@ -168,48 +194,52 @@ typedef SlruCtlData *SlruCtl;
* respective bank.
*/
static inline LWLock *
-SimpleLruGetBankLock(SlruCtl ctl, int64 pageno)
+SimpleLruGetBankLock(SlruDesc *ctl, int64 pageno)
{
int bankno;
+ Assert(ctl->nbanks != 0);
bankno = pageno % ctl->nbanks;
return &(ctl->shared->bank_locks[bankno].lock);
}
-extern Size SimpleLruShmemSize(int nslots, int nlsns);
+extern void SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts *options);
+
+#define SimpleLruRequest(desc, ...) \
+ SimpleLruRequestWithOpts(desc, &(SlruOpts){__VA_ARGS__})
+
extern int SimpleLruAutotuneBuffers(int divisor, int max);
-extern void SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id,
- int bank_tranche_id, SyncRequestHandler sync_handler,
- bool long_segment_names);
-extern int SimpleLruZeroPage(SlruCtl ctl, int64 pageno);
-extern void SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno);
-extern int SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+extern int SimpleLruZeroPage(SlruDesc *ctl, int64 pageno);
+extern void SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno);
+extern int SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data);
-extern int SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno,
+extern int SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-extern void SimpleLruWritePage(SlruCtl ctl, int slotno);
-extern void SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied);
+extern void SimpleLruWritePage(SlruDesc *ctl, int slotno);
+extern void SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied);
#ifdef USE_ASSERT_CHECKING
-extern void SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page);
+extern void SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page);
#else
#define SlruPagePrecedesUnitTests(ctl, per_page) do {} while (0)
#endif
-extern void SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage);
-extern bool SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno);
+extern void SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage);
+extern bool SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno);
-typedef bool (*SlruScanCallback) (SlruCtl ctl, char *filename, int64 segpage,
+typedef bool (*SlruScanCallback) (SlruDesc *ctl, char *filename, int64 segpage,
void *data);
-extern bool SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data);
-extern void SlruDeleteSegment(SlruCtl ctl, int64 segno);
+extern bool SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data);
+extern void SlruDeleteSegment(SlruDesc *ctl, int64 segno);
-extern int SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path);
+extern int SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path);
/* SlruScanDirectory public callbacks */
-extern bool SlruScanDirCbReportPresence(SlruCtl ctl, char *filename,
+extern bool SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-extern bool SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage,
+extern bool SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage,
void *data);
extern bool check_slru_buffers(const char *name, int *newval);
+extern void shmem_slru_init(ShmemStructDesc *base_desc, ShmemStructOpts *options);
+extern void shmem_slru_attach(ShmemStructDesc *base_desc, ShmemStructOpts *options);
+
#endif /* SLRU_H */
diff --git a/src/include/access/subtrans.h b/src/include/access/subtrans.h
index 11b7355dbdf..d986cd9e802 100644
--- a/src/include/access/subtrans.h
+++ b/src/include/access/subtrans.h
@@ -15,8 +15,6 @@ extern void SubTransSetParent(TransactionId xid, TransactionId parent);
extern TransactionId SubTransGetParent(TransactionId xid);
extern TransactionId SubTransGetTopmostTransaction(TransactionId xid);
-extern Size SUBTRANSShmemSize(void);
-extern void SUBTRANSShmemInit(void);
extern void BootStrapSUBTRANS(void);
extern void StartupSUBTRANS(TransactionId oldestActiveXID);
extern void CheckPointSUBTRANS(void);
diff --git a/src/include/commands/async.h b/src/include/commands/async.h
index 3baae7cb8dc..202e4aa5e74 100644
--- a/src/include/commands/async.h
+++ b/src/include/commands/async.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT bool Trace_notify;
extern PGDLLIMPORT int max_notify_queue_pages;
extern PGDLLIMPORT volatile sig_atomic_t notifyInterruptPending;
-extern Size AsyncShmemSize(void);
-extern void AsyncShmemInit(void);
-
extern void NotifyMyFrontEnd(const char *channel,
const char *payload,
int32 srcPid);
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index a5ac55b8f7e..443bffb58fd 100644
--- a/src/include/storage/predicate.h
+++ b/src/include/storage/predicate.h
@@ -41,11 +41,6 @@ typedef void *SerializableXactHandle;
/*
* function prototypes
*/
-
-/* housekeeping for shared memory predicate lock structures */
-extern void PredicateLockShmemInit(void);
-extern Size PredicateLockShmemSize(void);
-
extern void CheckPointPredicate(void);
/* predicate lock reporting */
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index f3850bc7b08..64c1fb044c6 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -29,6 +29,7 @@ typedef enum
{
SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
SHMEM_KIND_HASH, /* a hash table */
+ SHMEM_KIND_SLRU, /* SLRU buffers and control structures */
} ShmemAreaKind;
/*
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index 5c11b2b3499..63d1d60ae36 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -25,6 +25,13 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+
+/* predicate lock manager */
+PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
@@ -38,5 +45,6 @@ PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
diff --git a/src/test/modules/test_slru/test_slru.c b/src/test/modules/test_slru/test_slru.c
index e4bd2af0bf5..3c2a143b4d5 100644
--- a/src/test/modules/test_slru/test_slru.c
+++ b/src/test/modules/test_slru/test_slru.c
@@ -40,14 +40,22 @@ PG_FUNCTION_INFO_V1(test_slru_delete_all);
/* Number of SLRU page slots */
#define NUM_TEST_BUFFERS 16
-static SlruCtlData TestSlruCtlData;
-#define TestSlruCtl (&TestSlruCtlData)
+static void test_slru_shmem_request(void *arg);
+static bool test_slru_page_precedes_logically(int64 page1, int64 page2);
+static int test_slru_errdetail_for_io_error(const void *opaque_data);
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static const char *TestSlruDir = "pg_test_slru";
+
+static SlruDesc TestSlruDesc;
+
+static const ShmemCallbacks test_slru_shmem_callbacks = {
+ .request_fn = test_slru_shmem_request
+};
+
+#define TestSlruCtl (&TestSlruDesc)
static bool
-test_slru_scan_cb(SlruCtl ctl, char *filename, int64 segpage, void *data)
+test_slru_scan_cb(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
elog(NOTICE, "Calling test_slru_scan_cb()");
return SlruScanDirCbDeleteAll(ctl, filename, segpage, data);
@@ -190,20 +198,6 @@ test_slru_delete_all(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
-/*
- * Module load callbacks and initialization.
- */
-
-static void
-test_slru_shmem_request(void)
-{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- /* reserve shared memory for the test SLRU */
- RequestAddinShmemSpace(SimpleLruShmemSize(NUM_TEST_BUFFERS, 0));
-}
-
static bool
test_slru_page_precedes_logically(int64 page1, int64 page2)
{
@@ -218,48 +212,6 @@ test_slru_errdetail_for_io_error(const void *opaque_data)
return errdetail("Could not access test_slru entry %u.", xid);
}
-static void
-test_slru_shmem_startup(void)
-{
- /*
- * Short segments names are well tested elsewhere so in this test we are
- * focusing on long names.
- */
- const bool long_segment_names = true;
- const char slru_dir_name[] = "pg_test_slru";
- int test_tranche_id = -1;
- int test_buffer_tranche_id = -1;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /*
- * Create the SLRU directory if it does not exist yet, from the root of
- * the data directory.
- */
- (void) MakePGDirectory(slru_dir_name);
-
- /*
- * Initialize the SLRU facility. In EXEC_BACKEND builds, the
- * shmem_startup_hook is called in the postmaster and in each backend, but
- * we only need to generate the LWLock tranches once. Note that these
- * tranche ID variables are not used by SimpleLruInit() when
- * IsUnderPostmaster is true.
- */
- if (!IsUnderPostmaster)
- {
- test_tranche_id = LWLockNewTrancheId("test_slru_tranche");
- test_buffer_tranche_id = LWLockNewTrancheId("test_buffer_tranche");
- }
-
- TestSlruCtl->PagePrecedes = test_slru_page_precedes_logically;
- TestSlruCtl->errdetail_for_io_error = test_slru_errdetail_for_io_error;
- SimpleLruInit(TestSlruCtl, "TestSLRU",
- NUM_TEST_BUFFERS, 0, slru_dir_name,
- test_buffer_tranche_id, test_tranche_id, SYNC_HANDLER_NONE,
- long_segment_names);
-}
-
void
_PG_init(void)
{
@@ -269,9 +221,37 @@ _PG_init(void)
errdetail("\"%s\" must be loaded with \"shared_preload_libraries\".",
"test_slru")));
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_slru_shmem_request;
+ /*
+ * Create the SLRU directory if it does not exist yet, from the root of
+ * the data directory.
+ */
+ (void) MakePGDirectory(TestSlruDir);
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_slru_shmem_startup;
+ RegisterShmemCallbacks(&test_slru_shmem_callbacks);
+}
+
+static void
+test_slru_shmem_request(void *arg)
+{
+ SimpleLruRequest(&TestSlruDesc,
+ .name = "TestSLRU",
+ .Dir = TestSlruDir,
+
+ /*
+ * Short segment names are well tested elsewhere so in this test we are
+ * focusing on long names.
+ */
+ .long_segment_names = true,
+
+ .nslots = NUM_TEST_BUFFERS,
+ .nlsns = 0,
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = test_slru_page_precedes_logically,
+ .errdetail_for_io_error = test_slru_errdetail_for_io_error,
+
+ /* let slru.c assign these */
+ .buffer_tranche_id = 0,
+ .bank_tranche_id = 0,
+ );
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a997d3a5f54..8563d6d2c97 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2892,10 +2892,10 @@ SlotInvalidationCauseMap
SlotNumber
SlotSyncCtxStruct
SlotSyncSkipReason
-SlruCtl
-SlruCtlData
+SlruDesc
SlruErrorCause
+SlruOpts
SlruPageStatus
SlruScanCallback
SlruSegState
SlruShared
--
2.47.3
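The SimpleLruRequest() macro in the slru.h hunk above wraps its trailing arguments in a compound literal, so callers can pass options as designated initializers and leave the rest zeroed. A minimal standalone sketch of that pattern follows. DemoOpts, DemoDesc, DemoRequestWithOpts and DemoRequest are hypothetical names invented for illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct, loosely modelled on SlruOpts. */
typedef struct DemoOpts
{
    const char *name;
    int         nslots;
    int         nlsns;
    int         buffer_tranche_id;
} DemoOpts;

/* Hypothetical descriptor that keeps its own copy of the options. */
typedef struct DemoDesc
{
    DemoOpts    options;
} DemoDesc;

static void
DemoRequestWithOpts(DemoDesc *desc, const DemoOpts *options)
{
    /*
     * Copy the options: the compound literal built by the macro below only
     * lives until the end of the caller's enclosing block.
     */
    desc->options = *options;
}

/* Turn the variadic arguments into a designated-initializer literal. */
#define DemoRequest(desc, ...) \
    DemoRequestWithOpts(desc, &(DemoOpts){__VA_ARGS__})

static DemoDesc demo_desc;

/* Request with only two fields named; returns the stored slot count. */
static int
demo_request_example(void)
{
    DemoRequest(&demo_desc,
                .name = "demo",
                .nslots = 16);
    return demo_desc.options.nslots;
}
```

Fields that are not named in the call (here nlsns and buffer_tranche_id) are zero-initialized, which fits the SlruOpts comment saying that tranche IDs left as zeros are assigned dynamically.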
[text/x-patch] v9-0014-Convert-AIO-to-the-new-interface.patch (32.1K, 15-v9-0014-Convert-AIO-to-the-new-interface.patch)
From 695db915d5abd7b33cee524412b539fbf418f6ed Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 12:43:16 +0200
Subject: [PATCH v9 14/16] Convert AIO to the new interface
This replaces the "shmem_size" and "shmem_init" callbacks in the IO
methods table with the same ShmemCallbacks struct that we now use in
other subsystems.
---
src/backend/access/transam/clog.c | 20 ++--
src/backend/access/transam/commit_ts.c | 24 ++---
src/backend/access/transam/multixact.c | 42 ++++----
src/backend/access/transam/slru.c | 8 +-
src/backend/access/transam/subtrans.c | 20 ++--
src/backend/commands/async.c | 30 +++---
src/backend/storage/aio/aio_init.c | 121 ++++++++++++++--------
src/backend/storage/aio/method_io_uring.c | 42 +++++---
src/backend/storage/aio/method_worker.c | 84 ++++++++-------
src/backend/storage/ipc/ipci.c | 2 -
src/backend/storage/lmgr/predicate.c | 104 +++++++++----------
src/include/access/slru.h | 6 +-
src/include/storage/aio_internal.h | 16 +--
src/include/storage/aio_subsys.h | 4 -
src/include/storage/subsystemlist.h | 3 +
src/test/modules/test_slru/test_slru.c | 40 +++----
16 files changed, 303 insertions(+), 263 deletions(-)
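As a rough illustration of the callback shape this patch moves AIO to, the sketch below models a callbacks struct with separate request, init and attach hooks. It is a toy model under stated assumptions: DemoCallbacks and demo_run_callbacks are hypothetical, and the ordering (request first, then init when the segment is created or attach when joining an existing one) is inferred from the init/attach split in predicate.c, not taken from the real dispatcher.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a ShmemCallbacks-like table. */
typedef struct DemoCallbacks
{
    void        (*request_fn) (void *arg);
    void        (*init_fn) (void *arg);
    void        (*attach_fn) (void *arg);
} DemoCallbacks;

/* Counters so the effect of each hook can be observed. */
static int  demo_requests;
static int  demo_inits;
static int  demo_attaches;

static void demo_request(void *arg) { (void) arg; demo_requests++; }
static void demo_init(void *arg) { (void) arg; demo_inits++; }
static void demo_attach(void *arg) { (void) arg; demo_attaches++; }

/* A subsystem that needs all three hooks. */
static const DemoCallbacks demo_callbacks = {
    .request_fn = demo_request,
    .init_fn = demo_init,
    .attach_fn = demo_attach,
};

/* A subsystem that only registers areas, as test_slru does in the diff. */
static const DemoCallbacks demo_request_only = {
    .request_fn = demo_request,
};

/*
 * Assumed dispatch order: run the request hook, then either init (first
 * creation) or attach (joining existing memory).  Unset hooks are skipped.
 */
static void
demo_run_callbacks(const DemoCallbacks *cb, bool attaching)
{
    if (cb->request_fn)
        cb->request_fn(NULL);
    if (attaching)
    {
        if (cb->attach_fn)
            cb->attach_fn(NULL);
    }
    else if (cb->init_fn)
        cb->init_fn(NULL);
}
```

Keeping per-process setup in the attach hook and one-time setup in the init hook mirrors how PredicateLockShmemInit() and PredicateLockShmemAttach() both recompute the scratch-entry hash, while only the init path touches the shared lists.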
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 87f7f5707de..95f160879e0 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -810,19 +810,19 @@ CLOGShmemRequest(void *arg)
}
Assert(transaction_buffers != 0);
SimpleLruRequest(&XactSlruDesc,
- .name = "transaction",
- .Dir = "pg_xact",
- .long_segment_names = false,
+ .name = "transaction",
+ .Dir = "pg_xact",
+ .long_segment_names = false,
- .nslots = CLOGShmemBuffers(),
- .nlsns = CLOG_LSNS_PER_PAGE,
+ .nslots = CLOGShmemBuffers(),
+ .nlsns = CLOG_LSNS_PER_PAGE,
- .sync_handler = SYNC_HANDLER_CLOG,
- .PagePrecedes = CLOGPagePrecedes,
- .errdetail_for_io_error = clog_errdetail_for_io_error,
+ .sync_handler = SYNC_HANDLER_CLOG,
+ .PagePrecedes = CLOGPagePrecedes,
+ .errdetail_for_io_error = clog_errdetail_for_io_error,
- .buffer_tranche_id = LWTRANCHE_XACT_BUFFER,
- .bank_tranche_id = LWTRANCHE_XACT_SLRU,
+ .buffer_tranche_id = LWTRANCHE_XACT_BUFFER,
+ .bank_tranche_id = LWTRANCHE_XACT_SLRU,
);
}
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 236d8fb4baa..675dac9e40f 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -551,24 +551,24 @@ CommitTsShmemRequest(void *arg)
}
Assert(commit_timestamp_buffers != 0);
SimpleLruRequest(&CommitTsSlruDesc,
- .name = "commit_timestamp",
- .Dir = "pg_commit_ts",
- .long_segment_names = false,
+ .name = "commit_timestamp",
+ .Dir = "pg_commit_ts",
+ .long_segment_names = false,
- .nslots = CommitTsShmemBuffers(),
+ .nslots = CommitTsShmemBuffers(),
- .PagePrecedes = CommitTsPagePrecedes,
- .errdetail_for_io_error = commit_ts_errdetail_for_io_error,
+ .PagePrecedes = CommitTsPagePrecedes,
+ .errdetail_for_io_error = commit_ts_errdetail_for_io_error,
- .sync_handler = SYNC_HANDLER_COMMIT_TS,
- .buffer_tranche_id = LWTRANCHE_COMMITTS_BUFFER,
- .bank_tranche_id = LWTRANCHE_COMMITTS_SLRU,
+ .sync_handler = SYNC_HANDLER_COMMIT_TS,
+ .buffer_tranche_id = LWTRANCHE_COMMITTS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_COMMITTS_SLRU,
);
ShmemRequestStruct(&CommitTsShmemDesc,
- .name = "CommitTs shared",
- .size = sizeof(CommitTimestampShared),
- .ptr = (void **) &commitTsShared,
+ .name = "CommitTs shared",
+ .size = sizeof(CommitTimestampShared),
+ .ptr = (void **) &commitTsShared,
);
}
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 940ac5a78d6..88e46d6868d 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -1779,39 +1779,39 @@ MultiXactShmemRequest(void *arg)
size = add_size(size,
mul_size(sizeof(MultiXactId), NumVisibleSlots));
ShmemRequestStruct(&MultiXactShmemDesc,
- .name = "Shared MultiXact State",
- .size = size,
- .ptr = (void **) &MultiXactState,
+ .name = "Shared MultiXact State",
+ .size = size,
+ .ptr = (void **) &MultiXactState,
);
SimpleLruRequest(&MultiXactOffsetSlruDesc,
- .name = "multixact_offset",
- .Dir = "pg_multixact/offsets",
- .long_segment_names = false,
+ .name = "multixact_offset",
+ .Dir = "pg_multixact/offsets",
+ .long_segment_names = false,
- .nslots = multixact_offset_buffers,
+ .nslots = multixact_offset_buffers,
- .sync_handler = SYNC_HANDLER_MULTIXACT_OFFSET,
- .PagePrecedes = MultiXactOffsetPagePrecedes,
- .errdetail_for_io_error = MultiXactOffsetIoErrorDetail,
+ .sync_handler = SYNC_HANDLER_MULTIXACT_OFFSET,
+ .PagePrecedes = MultiXactOffsetPagePrecedes,
+ .errdetail_for_io_error = MultiXactOffsetIoErrorDetail,
- .buffer_tranche_id = LWTRANCHE_MULTIXACTOFFSET_BUFFER,
- .bank_tranche_id = LWTRANCHE_MULTIXACTOFFSET_SLRU,
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTOFFSET_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTOFFSET_SLRU,
);
SimpleLruRequest(&MultiXactMemberSlruDesc,
- .name = "multixact_member",
- .Dir = "pg_multixact/members",
- .long_segment_names = true,
+ .name = "multixact_member",
+ .Dir = "pg_multixact/members",
+ .long_segment_names = true,
- .nslots = multixact_member_buffers,
+ .nslots = multixact_member_buffers,
- .sync_handler = SYNC_HANDLER_MULTIXACT_MEMBER,
- .PagePrecedes = MultiXactMemberPagePrecedes,
- .errdetail_for_io_error = MultiXactMemberIoErrorDetail,
+ .sync_handler = SYNC_HANDLER_MULTIXACT_MEMBER,
+ .PagePrecedes = MultiXactMemberPagePrecedes,
+ .errdetail_for_io_error = MultiXactMemberIoErrorDetail,
- .buffer_tranche_id = LWTRANCHE_MULTIXACTMEMBER_BUFFER,
- .bank_tranche_id = LWTRANCHE_MULTIXACTMEMBER_SLRU,
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTMEMBER_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTMEMBER_SLRU,
);
/*
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index 3fe60c5804b..6d9dda6b29b 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -242,9 +242,9 @@ SimpleLruAutotuneBuffers(int divisor, int max)
* Register a simple LRU cache in shared memory.
*/
void
-SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts *options)
+SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts * options)
{
- SlruOpts *options_copy;
+ SlruOpts *options_copy;
Assert(options->name != NULL);
Assert(options->nslots > 0);
@@ -265,7 +265,7 @@ SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts *options)
void
shmem_slru_init(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
{
- SlruOpts *options = (SlruOpts *) base_options;
+ SlruOpts *options = (SlruOpts *) base_options;
SlruDesc *desc = (SlruDesc *) base_desc;
char namebuf[NAMEDATALEN];
SlruShared shared;
@@ -356,7 +356,7 @@ shmem_slru_init(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
void
shmem_slru_attach(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
{
- SlruOpts *options = (SlruOpts *) base_options;
+ SlruOpts *options = (SlruOpts *) base_options;
SlruDesc *desc = (SlruDesc *) base_desc;
int nslots = options->nslots;
int nbanks = nslots / SLRU_BANK_SIZE;
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index ca273fb4680..5f68c3f0cca 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -244,19 +244,19 @@ SUBTRANSShmemRequest(void *arg)
Assert(subtransaction_buffers != 0);
SimpleLruRequest(&SubTransSlruDesc,
- .name = "subtransaction",
- .Dir = "pg_subtrans",
- .long_segment_names = false,
+ .name = "subtransaction",
+ .Dir = "pg_subtrans",
+ .long_segment_names = false,
- .nslots = SUBTRANSShmemBuffers(),
+ .nslots = SUBTRANSShmemBuffers(),
- .sync_handler = SYNC_HANDLER_NONE,
- .PagePrecedes = SubTransPagePrecedes,
- .errdetail_for_io_error = subtrans_errdetail_for_io_error,
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SubTransPagePrecedes,
+ .errdetail_for_io_error = subtrans_errdetail_for_io_error,
- .buffer_tranche_id = LWTRANCHE_SUBTRANS_BUFFER,
- .bank_tranche_id = LWTRANCHE_SUBTRANS_SLRU,
- );
+ .buffer_tranche_id = LWTRANCHE_SUBTRANS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SUBTRANS_SLRU,
+ );
}
static void
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index 9be98afc1d9..e4e2ae2f6ff 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -804,27 +804,27 @@ AsyncShmemRequest(void *arg)
size = add_size(size, offsetof(AsyncQueueControl, backend));
ShmemRequestStruct(&AsyncQueueControlShmemDesc,
- .name = "Async Queue Control",
- .size = size,
- .ptr = (void **) &asyncQueueControl,
- );
+ .name = "Async Queue Control",
+ .size = size,
+ .ptr = (void **) &asyncQueueControl,
+ );
SimpleLruRequest(&NotifySlruDesc,
- .name = "notify",
- .Dir = "pg_notify",
+ .name = "notify",
+ .Dir = "pg_notify",
- /* long segment names are used in order to avoid wraparound */
- .long_segment_names = true,
+ /* long segment names are used in order to avoid wraparound */
+ .long_segment_names = true,
- .nslots = notify_buffers,
+ .nslots = notify_buffers,
- .sync_handler = SYNC_HANDLER_NONE,
- .PagePrecedes = asyncQueuePagePrecedes,
- .errdetail_for_io_error = asyncQueueErrdetailForIoError,
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = asyncQueuePagePrecedes,
+ .errdetail_for_io_error = asyncQueueErrdetailForIoError,
- .buffer_tranche_id = LWTRANCHE_NOTIFY_BUFFER,
- .bank_tranche_id = LWTRANCHE_NOTIFY_SLRU,
- );
+ .buffer_tranche_id = LWTRANCHE_NOTIFY_BUFFER,
+ .bank_tranche_id = LWTRANCHE_NOTIFY_SLRU,
+ );
}
static void
diff --git a/src/backend/storage/aio/aio_init.c b/src/backend/storage/aio/aio_init.c
index d3c68d8b04c..26cb824035d 100644
--- a/src/backend/storage/aio/aio_init.c
+++ b/src/backend/storage/aio/aio_init.c
@@ -23,16 +23,24 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
+static void AioShmemRequest(void *arg);
+static void AioShmemInit(void *arg);
+static void AioShmemAttach(void *arg);
-static Size
-AioCtlShmemSize(void)
-{
- /* pgaio_ctl itself */
- return sizeof(PgAioCtl);
-}
+const ShmemCallbacks AioShmemCallbacks = {
+ .request_fn = AioShmemRequest,
+ .init_fn = AioShmemInit,
+ .attach_fn = AioShmemAttach,
+};
+
+static PgAioBackend *AioBackendShmemPtr;
+static PgAioHandle *AioHandleShmemPtr;
+static struct iovec *AioHandleIOVShmemPtr;
+static uint64 *AioHandleDataShmemPtr;
static uint32
AioProcs(void)
@@ -109,10 +117,19 @@ AioChooseMaxConcurrency(void)
return Min(max_proportional_pins, 64);
}
-Size
-AioShmemSize(void)
+/*
+ * Register shared memory area for AIO subsystem.
+ */
+static void
+AioShmemRequest(void *arg)
{
- Size sz = 0;
+ static ShmemStructDesc AioCtlShmemDesc;
+ static ShmemStructDesc AioBackendShmemDesc;
+ static ShmemStructDesc AioHandleShmemDesc;
+ static ShmemStructDesc AioHandleIOVShmemDesc;
+ static ShmemStructDesc AioHandleDataShmemDesc;
+
+ /* Resolve io_max_concurrency if not already done. */
/*
* We prefer to report this value's source as PGC_S_DYNAMIC_DEFAULT.
@@ -132,48 +149,57 @@ AioShmemSize(void)
PGC_S_OVERRIDE);
}
- sz = add_size(sz, AioCtlShmemSize());
- sz = add_size(sz, AioBackendShmemSize());
- sz = add_size(sz, AioHandleShmemSize());
- sz = add_size(sz, AioHandleIOVShmemSize());
- sz = add_size(sz, AioHandleDataShmemSize());
-
- /* Reserve space for method specific resources. */
- if (pgaio_method_ops->shmem_size)
- sz = add_size(sz, pgaio_method_ops->shmem_size());
-
- return sz;
+ ShmemRequestStruct(&AioCtlShmemDesc,
+ .name = "AioCtl",
+ .size = sizeof(PgAioCtl),
+ .ptr = (void **) &pgaio_ctl,
+ );
+
+ ShmemRequestStruct(&AioBackendShmemDesc,
+ .name = "AioBackend",
+ .size = AioBackendShmemSize(),
+ .ptr = (void **) &AioBackendShmemPtr,
+ );
+
+ ShmemRequestStruct(&AioHandleShmemDesc,
+ .name = "AioHandle",
+ .size = AioHandleShmemSize(),
+ .ptr = (void **) &AioHandleShmemPtr,
+ );
+
+ ShmemRequestStruct(&AioHandleIOVShmemDesc,
+ .name = "AioHandleIOV",
+ .size = AioHandleIOVShmemSize(),
+ .ptr = (void **) &AioHandleIOVShmemPtr,
+ );
+
+ ShmemRequestStruct(&AioHandleDataShmemDesc,
+ .name = "AioHandleData",
+ .size = AioHandleDataShmemSize(),
+ .ptr = (void **) &AioHandleDataShmemPtr,
+ );
+
+ if (pgaio_method_ops->shmem_callbacks.request_fn)
+ pgaio_method_ops->shmem_callbacks.request_fn(pgaio_method_ops->shmem_callbacks.request_fn_arg);
}
-void
-AioShmemInit(void)
+/*
+ * Initialize AIO shared memory during postmaster startup.
+ */
+static void
+AioShmemInit(void *arg)
{
- bool found;
uint32 io_handle_off = 0;
uint32 iovec_off = 0;
uint32 per_backend_iovecs = io_max_concurrency * io_max_combine_limit;
- pgaio_ctl = (PgAioCtl *)
- ShmemInitStruct("AioCtl", AioCtlShmemSize(), &found);
-
- if (found)
- goto out;
-
- memset(pgaio_ctl, 0, AioCtlShmemSize());
-
pgaio_ctl->io_handle_count = AioProcs() * io_max_concurrency;
pgaio_ctl->iovec_count = AioProcs() * per_backend_iovecs;
- pgaio_ctl->backend_state = (PgAioBackend *)
- ShmemInitStruct("AioBackend", AioBackendShmemSize(), &found);
-
- pgaio_ctl->io_handles = (PgAioHandle *)
- ShmemInitStruct("AioHandle", AioHandleShmemSize(), &found);
-
- pgaio_ctl->iovecs = (struct iovec *)
- ShmemInitStruct("AioHandleIOV", AioHandleIOVShmemSize(), &found);
- pgaio_ctl->handle_data = (uint64 *)
- ShmemInitStruct("AioHandleData", AioHandleDataShmemSize(), &found);
+ pgaio_ctl->backend_state = AioBackendShmemPtr;
+ pgaio_ctl->io_handles = AioHandleShmemPtr;
+ pgaio_ctl->iovecs = AioHandleIOVShmemPtr;
+ pgaio_ctl->handle_data = AioHandleDataShmemPtr;
for (int procno = 0; procno < AioProcs(); procno++)
{
@@ -208,10 +234,15 @@ AioShmemInit(void)
}
}
-out:
- /* Initialize IO method specific resources. */
- if (pgaio_method_ops->shmem_init)
- pgaio_method_ops->shmem_init(!found);
+ if (pgaio_method_ops->shmem_callbacks.init_fn)
+ pgaio_method_ops->shmem_callbacks.init_fn(pgaio_method_ops->shmem_callbacks.init_fn_arg);
+}
+
+static void
+AioShmemAttach(void *arg)
+{
+ if (pgaio_method_ops->shmem_callbacks.attach_fn)
+ pgaio_method_ops->shmem_callbacks.attach_fn(pgaio_method_ops->shmem_callbacks.attach_fn_arg);
}
void
diff --git a/src/backend/storage/aio/method_io_uring.c b/src/backend/storage/aio/method_io_uring.c
index 39984df31b4..fb75a208b65 100644
--- a/src/backend/storage/aio/method_io_uring.c
+++ b/src/backend/storage/aio/method_io_uring.c
@@ -49,8 +49,8 @@
/* Entry points for IoMethodOps. */
-static size_t pgaio_uring_shmem_size(void);
-static void pgaio_uring_shmem_init(bool first_time);
+static void pgaio_uring_shmem_request(void *arg);
+static void pgaio_uring_shmem_init(void *arg);
static void pgaio_uring_init_backend(void);
static int pgaio_uring_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
static void pgaio_uring_wait_one(PgAioHandle *ioh, uint64 ref_generation);
@@ -59,7 +59,6 @@ static void pgaio_uring_check_one(PgAioHandle *ioh, uint64 ref_generation);
/* helper functions */
static void pgaio_uring_sq_from_io(PgAioHandle *ioh, struct io_uring_sqe *sqe);
-
const IoMethodOps pgaio_uring_ops = {
/*
* While io_uring mostly is OK with FDs getting closed while the IO is in
@@ -70,8 +69,8 @@ const IoMethodOps pgaio_uring_ops = {
*/
.wait_on_fd_before_close = true,
- .shmem_size = pgaio_uring_shmem_size,
- .shmem_init = pgaio_uring_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_uring_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_uring_shmem_init,
.init_backend = pgaio_uring_init_backend,
.submit = pgaio_uring_submit,
@@ -267,23 +266,34 @@ pgaio_uring_shmem_size(void)
{
size_t sz;
+ sz = pgaio_uring_context_shmem_size();
+ sz = add_size(sz, pgaio_uring_ring_shmem_size());
+
+ return sz;
+}
+
+static void
+pgaio_uring_shmem_request(void *arg)
+{
+ static ShmemStructDesc AioUringShmemDesc = {
+ .name = "AioUringContext",
+ .ptr = (void **) &pgaio_uring_contexts,
+ };
+
/*
* Kernel and liburing support for various features influences how much
* shmem we need, perform the necessary checks.
*/
pgaio_uring_check_capabilities();
- sz = pgaio_uring_context_shmem_size();
- sz = add_size(sz, pgaio_uring_ring_shmem_size());
-
- return sz;
+ AioUringShmemDesc.size = pgaio_uring_shmem_size();
+ ShmemRequestStruct(&AioUringShmemDesc);
}
static void
-pgaio_uring_shmem_init(bool first_time)
+pgaio_uring_shmem_init(void *arg)
{
int TotalProcs = pgaio_uring_procs();
- bool found;
char *shmem;
size_t ring_mem_remain = 0;
char *ring_mem_next = 0;
@@ -291,13 +301,11 @@ pgaio_uring_shmem_init(bool first_time)
/*
* We allocate memory for all PgAioUringContext instances and, if
* supported, the memory required for each of the io_uring instances, in
- * one ShmemInitStruct().
+ * one combined allocation.
+ *
+ * pgaio_uring_contexts is already set to the base of the allocation.
*/
- shmem = ShmemInitStruct("AioUringContext", pgaio_uring_shmem_size(), &found);
- if (found)
- return;
-
- pgaio_uring_contexts = (PgAioUringContext *) shmem;
+ shmem = (char *) pgaio_uring_contexts;
shmem += pgaio_uring_context_shmem_size();
/* if supported, handle memory alignment / sizing for io_uring memory */
diff --git a/src/backend/storage/aio/method_worker.c b/src/backend/storage/aio/method_worker.c
index efe38e9f113..82c8b098a9e 100644
--- a/src/backend/storage/aio/method_worker.c
+++ b/src/backend/storage/aio/method_worker.c
@@ -41,6 +41,7 @@
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
#include "tcop/tcopprot.h"
#include "utils/injection_point.h"
#include "utils/memdebug.h"
@@ -73,16 +74,20 @@ typedef struct PgAioWorkerControl
} PgAioWorkerControl;
-static size_t pgaio_worker_shmem_size(void);
-static void pgaio_worker_shmem_init(bool first_time);
+static void pgaio_worker_shmem_request(void *arg);
+static void pgaio_worker_shmem_init(void *arg);
+static void pgaio_worker_shmem_attach(void *arg);
+
+static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static bool pgaio_worker_needs_synchronous_execution(PgAioHandle *ioh);
static int pgaio_worker_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
const IoMethodOps pgaio_worker_ops = {
- .shmem_size = pgaio_worker_shmem_size,
- .shmem_init = pgaio_worker_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_worker_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_worker_shmem_init,
+ .shmem_callbacks.attach_fn = pgaio_worker_shmem_attach,
.needs_synchronous_execution = pgaio_worker_needs_synchronous_execution,
.submit = pgaio_worker_submit,
@@ -95,7 +100,6 @@ int io_workers = 3;
static int io_worker_queue_size = 64;
static int MyIoWorkerId;
-static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static PgAioWorkerControl *io_worker_control;
@@ -116,50 +120,60 @@ pgaio_worker_control_shmem_size(void)
sizeof(PgAioWorkerSlot) * MAX_IO_WORKERS;
}
-static size_t
-pgaio_worker_shmem_size(void)
+/*
+ * Set secondary AIO worker pointer from the combined allocation.
+ */
+static void
+pgaio_worker_set_secondary_ptr(void)
{
- size_t sz;
int queue_size;
+ Size queue_sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = add_size(sz, pgaio_worker_control_shmem_size());
-
- return sz;
+ io_worker_control = (PgAioWorkerControl *)
+ ((char *) io_worker_submission_queue + MAXALIGN(queue_sz));
}
static void
-pgaio_worker_shmem_init(bool first_time)
+pgaio_worker_shmem_init(void *arg)
{
- bool found;
int queue_size;
- io_worker_submission_queue =
- ShmemInitStruct("AioWorkerSubmissionQueue",
- pgaio_worker_queue_shmem_size(&queue_size),
- &found);
- if (!found)
- {
- io_worker_submission_queue->size = queue_size;
- io_worker_submission_queue->head = 0;
- io_worker_submission_queue->tail = 0;
- }
+ pgaio_worker_queue_shmem_size(&queue_size);
+ io_worker_submission_queue->size = queue_size;
+ io_worker_submission_queue->head = 0;
+ io_worker_submission_queue->tail = 0;
+
+ pgaio_worker_set_secondary_ptr();
- io_worker_control =
- ShmemInitStruct("AioWorkerControl",
- pgaio_worker_control_shmem_size(),
- &found);
- if (!found)
+ io_worker_control->idle_worker_mask = 0;
+ for (int i = 0; i < MAX_IO_WORKERS; ++i)
{
- io_worker_control->idle_worker_mask = 0;
- for (int i = 0; i < MAX_IO_WORKERS; ++i)
- {
- io_worker_control->workers[i].latch = NULL;
- io_worker_control->workers[i].in_use = false;
- }
+ io_worker_control->workers[i].latch = NULL;
+ io_worker_control->workers[i].in_use = false;
}
}
+static void
+pgaio_worker_shmem_attach(void *arg)
+{
+ pgaio_worker_set_secondary_ptr();
+}
+
+static void
+pgaio_worker_shmem_request(void *arg)
+{
+ static ShmemStructDesc AioWorkerShmemDesc = {
+ .name = "AioWorkerSubmissionQueue",
+ .ptr = (void **) &io_worker_submission_queue,
+ };
+ int queue_size;
+
+ AioWorkerShmemDesc.size =
+ MAXALIGN(pgaio_worker_queue_shmem_size(&queue_size)) +
+ pgaio_worker_control_shmem_size();
+ ShmemRequestStruct(&AioWorkerShmemDesc);
+}
+
static int
pgaio_worker_choose_idle(void)
{
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 1925d3deff9..2916fcd930e 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -118,7 +118,6 @@ CalculateShmemSize(void)
size = add_size(size, SyncScanShmemSize());
size = add_size(size, StatsShmemSize());
size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, AioShmemSize());
size = add_size(size, WaitLSNShmemSize());
size = add_size(size, LogicalDecodingCtlShmemSize());
@@ -305,7 +304,6 @@ CreateOrAttachShmemStructs(void)
BTreeShmemInit();
SyncScanShmemInit();
StatsShmemInit();
- AioShmemInit();
WaitLSNShmemInit();
LogicalDecodingCtlShmemInit();
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 718e6b07f5c..a0143e0e9fe 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1149,17 +1149,17 @@ PredicateLockShmemRequest(void *arg)
* per-predicate-lock-target information.
*/
ShmemRequestHash(&PredicateLockTargetHashDesc,
- .name = "PREDICATELOCKTARGET hash",
+ .name = "PREDICATELOCKTARGET hash",
- .init_size = max_predicate_lock_targets,
- .max_size = max_predicate_lock_targets,
+ .init_size = max_predicate_lock_targets,
+ .max_size = max_predicate_lock_targets,
- .ptr = &PredicateLockTargetHash,
- .hash_info.keysize = sizeof(PREDICATELOCKTARGETTAG),
- .hash_info.entrysize = sizeof(PREDICATELOCKTARGET),
- .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
- .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
- );
+ .ptr = &PredicateLockTargetHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTARGETTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCKTARGET),
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Allocate hash table for PREDICATELOCK structs. This stores per
@@ -1170,17 +1170,17 @@ PredicateLockShmemRequest(void *arg)
max_predicate_locks = max_predicate_lock_targets * 2;
ShmemRequestHash(&PredicateLockHashDesc,
- .name = "PREDICATELOCK hash",
+ .name = "PREDICATELOCK hash",
- .init_size = max_predicate_locks,
- .max_size = max_predicate_locks,
+ .init_size = max_predicate_locks,
+ .max_size = max_predicate_locks,
- .ptr = &PredicateLockHash,
- .hash_info.keysize = sizeof(PREDICATELOCKTAG),
- .hash_info.entrysize = sizeof(PREDICATELOCK),
- .hash_info.hash = predicatelock_hash,
- .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
- .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION | HASH_FIXED_SIZE,
+ .ptr = &PredicateLockHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCK),
+ .hash_info.hash = predicatelock_hash,
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION | HASH_FIXED_SIZE,
);
/*
@@ -1198,11 +1198,11 @@ PredicateLockShmemRequest(void *arg)
* predicate locking.
*/
ShmemRequestStruct(&PredXactListShmemDesc,
- .name = "PredXactList",
- .size = add_size(PredXactListDataSize,
- (mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT)))),
- .ptr = (void **) &PredXact,
+ .name = "PredXactList",
+ .size = add_size(PredXactListDataSize,
+ (mul_size((Size) max_serializable_xacts,
+ sizeof(SERIALIZABLEXACT)))),
+ .ptr = (void **) &PredXact,
);
/*
@@ -1210,15 +1210,15 @@ PredicateLockShmemRequest(void *arg)
* information for serializable transactions which have accessed data.
*/
ShmemRequestHash(&SerializableXidHashDesc,
- .name = "SERIALIZABLEXID hash",
+ .name = "SERIALIZABLEXID hash",
- .init_size = max_serializable_xacts,
- .max_size = max_serializable_xacts,
+ .init_size = max_serializable_xacts,
+ .max_size = max_serializable_xacts,
- .ptr = &SerializableXidHash,
- .hash_info.keysize = sizeof(SERIALIZABLEXIDTAG),
- .hash_info.entrysize = sizeof(SERIALIZABLEXID),
- .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_FIXED_SIZE,
+ .ptr = &SerializableXidHash,
+ .hash_info.keysize = sizeof(SERIALIZABLEXIDTAG),
+ .hash_info.entrysize = sizeof(SERIALIZABLEXID),
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_FIXED_SIZE,
);
/*
@@ -1235,45 +1235,45 @@ PredicateLockShmemRequest(void *arg)
max_rw_conflicts = max_serializable_xacts * 5;
ShmemRequestStruct(&RWConflictPoolShmemDesc,
- .name = "RWConflictPool",
- .size = RWConflictPoolHeaderDataSize + mul_size((Size) max_rw_conflicts,
- RWConflictDataSize),
- .ptr = (void **) &RWConflictPool,
- );
+ .name = "RWConflictPool",
+ .size = RWConflictPoolHeaderDataSize + mul_size((Size) max_rw_conflicts,
+ RWConflictDataSize),
+ .ptr = (void **) &RWConflictPool,
+ );
ShmemRequestStruct(&FinishedSerializableShmemDesc,
- .name = "FinishedSerializableTransactions",
- .size = sizeof(dlist_head),
- .ptr = (void **) &FinishedSerializableTransactions,
- );
+ .name = "FinishedSerializableTransactions",
+ .size = sizeof(dlist_head),
+ .ptr = (void **) &FinishedSerializableTransactions,
+ );
/*
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
SimpleLruRequest(&SerialSlruDesc,
- .name = "serializable",
- .Dir = "pg_serial",
- .long_segment_names = false,
+ .name = "serializable",
+ .Dir = "pg_serial",
+ .long_segment_names = false,
- .nslots = serializable_buffers,
+ .nslots = serializable_buffers,
- .sync_handler = SYNC_HANDLER_NONE,
- .PagePrecedes = SerialPagePrecedesLogically,
- .errdetail_for_io_error = serial_errdetail_for_io_error,
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SerialPagePrecedesLogically,
+ .errdetail_for_io_error = serial_errdetail_for_io_error,
- .buffer_tranche_id = LWTRANCHE_SERIAL_BUFFER,
- .bank_tranche_id = LWTRANCHE_SERIAL_SLRU,
+ .buffer_tranche_id = LWTRANCHE_SERIAL_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SERIAL_SLRU,
);
#ifdef USE_ASSERT_CHECKING
SerialPagePrecedesLogicallyUnitTests();
#endif
ShmemRequestStruct(&SerialControlShmemDesc,
- .name = "SerialControlData",
- .size = sizeof(SerialControlData),
- .ptr = (void **) &serialControl,
- );
+ .name = "SerialControlData",
+ .size = sizeof(SerialControlData),
+ .ptr = (void **) &serialControl,
+ );
}
static void
diff --git a/src/include/access/slru.h b/src/include/access/slru.h
index 820c7986854..1dbb0b62525 100644
--- a/src/include/access/slru.h
+++ b/src/include/access/slru.h
@@ -169,7 +169,7 @@ typedef struct SlruOpts
*/
int buffer_tranche_id;
int bank_tranche_id;
-} SlruOpts;
+} SlruOpts;
/*
* SlruDesc is an unshared structure that points to the active information
@@ -179,7 +179,7 @@ typedef struct SlruDesc
{
ShmemStructDesc base;
- SlruOpts options;
+ SlruOpts options;
SlruShared shared;
@@ -203,7 +203,7 @@ SimpleLruGetBankLock(SlruDesc *ctl, int64 pageno)
return &(ctl->shared->bank_locks[bankno].lock);
}
-extern void SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts *options);
+extern void SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts * options);
#define SimpleLruRequest(desc, ...) \
SimpleLruRequestWithOpts(desc, &(SlruOpts){__VA_ARGS__})
diff --git a/src/include/storage/aio_internal.h b/src/include/storage/aio_internal.h
index 33e1e2dc048..9ca4087aa7f 100644
--- a/src/include/storage/aio_internal.h
+++ b/src/include/storage/aio_internal.h
@@ -20,6 +20,8 @@
#include "port/pg_iovec.h"
#include "storage/aio.h"
#include "storage/condition_variable.h"
+#include "storage/ipc.h"
+#include "storage/shmem.h"
/*
@@ -267,20 +269,8 @@ typedef struct IoMethodOps
*/
bool wait_on_fd_before_close;
-
/* global initialization */
-
- /*
- * Amount of additional shared memory to reserve for the io_method. Called
- * just like a normal ipci.c style *Size() function. Optional.
- */
- size_t (*shmem_size) (void);
-
- /*
- * Initialize shared memory. First time is true if AIO's shared memory was
- * just initialized, false otherwise. Optional.
- */
- void (*shmem_init) (bool first_time);
+ ShmemCallbacks shmem_callbacks;
/*
* Per-backend initialization. Optional.
diff --git a/src/include/storage/aio_subsys.h b/src/include/storage/aio_subsys.h
index 276cb3e31c4..dd54869351f 100644
--- a/src/include/storage/aio_subsys.h
+++ b/src/include/storage/aio_subsys.h
@@ -20,12 +20,8 @@
/* aio_init.c */
-extern Size AioShmemSize(void);
-extern void AioShmemInit(void);
-
extern void pgaio_init_backend(void);
-
/* aio.c */
extern void pgaio_error_cleanup(void);
extern void AtEOXact_Aio(bool is_commit);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index 63d1d60ae36..e8e06be30c2 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -48,3 +48,6 @@ PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
+
+/* AIO subsystem. This delegates to the method-specific callbacks */
+PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
diff --git a/src/test/modules/test_slru/test_slru.c b/src/test/modules/test_slru/test_slru.c
index 3c2a143b4d5..6bd1bec72c5 100644
--- a/src/test/modules/test_slru/test_slru.c
+++ b/src/test/modules/test_slru/test_slru.c
@@ -234,24 +234,24 @@ static void
test_slru_shmem_request(void *arg)
{
SimpleLruRequest(&TestSlruDesc,
- .name = "TestSLRU",
- .Dir = TestSlruDir,
-
- /*
- * Short segments names are well tested elsewhere so in this test we are
- * focusing on long names.
- */
- .long_segment_names = true,
-
- .nslots = NUM_TEST_BUFFERS,
- .nlsns = 0,
-
- .sync_handler = SYNC_HANDLER_NONE,
- .PagePrecedes = test_slru_page_precedes_logically,
- .errdetail_for_io_error = test_slru_errdetail_for_io_error,
-
- /* let slru.c assign these */
- .buffer_tranche_id = 0,
- .bank_tranche_id = 0,
- );
+ .name = "TestSLRU",
+ .Dir = TestSlruDir,
+
+ /*
+ * Short segments names are well tested elsewhere so in this test we are
+ * focusing on long names.
+ */
+ .long_segment_names = true,
+
+ .nslots = NUM_TEST_BUFFERS,
+ .nlsns = 0,
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = test_slru_page_precedes_logically,
+ .errdetail_for_io_error = test_slru_errdetail_for_io_error,
+
+ /* let slru.c assign these */
+ .buffer_tranche_id = 0,
+ .bank_tranche_id = 0,
+ );
}
--
2.47.3
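
For readers following the method_worker.c conversion above: the patch replaces two separate ShmemInitStruct() calls with one combined request, and derives the second pointer from the first in both the init and attach callbacks. A minimal standalone sketch of that layout idea follows. It is not the patch's code: every name prefixed with sketch_/Sketch is hypothetical, the 8-byte SKETCH_MAXALIGN is an assumption standing in for MAXALIGN, and in the real patch the base pointer is filled in through the descriptor's .ptr field rather than passed as an argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 8-byte maximum alignment, standing in for MAXALIGN(). */
#define SKETCH_MAXALIGN(len) (((size_t) (len) + 7) & ~((size_t) 7))

/* Hypothetical first structure: a queue with a variable-length tail. */
typedef struct SketchQueue
{
	uint32_t	size;
	uint32_t	head;
	uint32_t	tail;
	uint32_t	ios[];
} SketchQueue;

/* Hypothetical second structure, placed right after the queue. */
typedef struct SketchControl
{
	uint64_t	idle_mask;
} SketchControl;

/* Process-local pointers into the one combined allocation. */
static SketchQueue *sketch_queue;
static SketchControl *sketch_control;

static size_t
sketch_queue_size(uint32_t nentries)
{
	return offsetof(SketchQueue, ios) + nentries * sizeof(uint32_t);
}

/* Size to request: padded queue followed by the control struct. */
static size_t
sketch_total_size(uint32_t nentries)
{
	return SKETCH_MAXALIGN(sketch_queue_size(nentries)) + sizeof(SketchControl);
}

/* Derive the secondary pointer from the primary one. */
static void
sketch_set_secondary_ptr(uint32_t nentries)
{
	sketch_control = (SketchControl *)
		((char *) sketch_queue + SKETCH_MAXALIGN(sketch_queue_size(nentries)));
}

/* Runs once, in the process that creates the memory. */
static void
sketch_init(void *base, uint32_t nentries)
{
	sketch_queue = (SketchQueue *) base;
	sketch_queue->size = nentries;
	sketch_queue->head = 0;
	sketch_queue->tail = 0;

	sketch_set_secondary_ptr(nentries);
	sketch_control->idle_mask = 0;
}

/* Runs in a process that attaches later: contents are left untouched. */
static void
sketch_attach(void *base, uint32_t nentries)
{
	sketch_queue = (SketchQueue *) base;
	sketch_set_secondary_ptr(nentries);
}
```

The point of the split is that only the init path writes to the shared contents, while both paths have to recompute the process-local secondary pointer, which is why the patch calls pgaio_worker_set_secondary_ptr() from both pgaio_worker_shmem_init() and pgaio_worker_shmem_attach().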
[text/x-patch] v9-0015-Add-option-for-aligning-shmem-allocations.patch (3.9K, 16-v9-0015-Add-option-for-aligning-shmem-allocations.patch)
From 26e809af49a2002fff4a3bc561dfd112fd08b2c6 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 23:44:15 +0200
Subject: [PATCH v9 15/16] Add option for aligning shmem allocations
The buffer blocks (in the next commit) are IO-aligned. This might come
in handy in other places too, so make it an explicit feature of
ShmemRequestStruct.
---
src/backend/storage/ipc/shmem.c | 22 +++++++++++++---------
src/include/storage/shmem.h | 6 ++++++
2 files changed, 19 insertions(+), 9 deletions(-)
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 73297e82265..1cbac7946e1 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -238,7 +238,7 @@ typedef struct ShmemAllocatorData
#define ShmemIndexLock (&ShmemAllocator->index_lock)
-static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void *ShmemAllocRaw(Size size, Size alignment, Size *allocated_size);
/* shared memory global variables */
@@ -399,6 +399,7 @@ ShmemGetRequestedSize(void)
{
size = add_size(size, request->options->size);
size = add_size(size, request->options->extra_size);
+ size = add_size(size, request->options->alignment);
}
return size;
@@ -588,7 +589,7 @@ AttachOrInitShmemIndexEntry(ShmemRequest *request,
size_t allocated_size;
void *structPtr;
- structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ structPtr = ShmemAllocRaw(request->options->size, request->options->alignment, &allocated_size);
if (structPtr == NULL)
{
/* out of memory; remove the failed ShmemIndex entry */
@@ -754,7 +755,7 @@ ShmemAlloc(Size size)
void *newSpace;
Size allocated_size;
- newSpace = ShmemAllocRaw(size, &allocated_size);
+ newSpace = ShmemAllocRaw(size, 0, &allocated_size);
if (!newSpace)
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
@@ -773,7 +774,7 @@ ShmemAllocNoError(Size size)
{
Size allocated_size;
- return ShmemAllocRaw(size, &allocated_size);
+ return ShmemAllocRaw(size, 0, &allocated_size);
}
/*
@@ -783,8 +784,9 @@ ShmemAllocNoError(Size size)
* be equal to the number requested plus any padding we choose to add.
*/
static void *
-ShmemAllocRaw(Size size, Size *allocated_size)
+ShmemAllocRaw(Size size, Size alignment, Size *allocated_size)
{
+ Size rawStart;
Size newStart;
Size newFree;
void *newSpace;
@@ -800,14 +802,15 @@ ShmemAllocRaw(Size size, Size *allocated_size)
* structures out to a power-of-two size - but without this, even that
* won't be sufficient.
*/
- size = CACHELINEALIGN(size);
- *allocated_size = size;
+ if (alignment < PG_CACHE_LINE_SIZE)
+ alignment = PG_CACHE_LINE_SIZE;
Assert(ShmemSegHdr != NULL);
SpinLockAcquire(&ShmemAllocator->shmem_lock);
- newStart = ShmemAllocator->free_offset;
+ rawStart = ShmemAllocator->free_offset;
+ newStart = TYPEALIGN(alignment, rawStart);
newFree = newStart + size;
if (newFree <= ShmemSegHdr->totalsize)
@@ -821,8 +824,9 @@ ShmemAllocRaw(Size size, Size *allocated_size)
SpinLockRelease(&ShmemAllocator->shmem_lock);
/* note this assert is okay with newSpace == NULL */
- Assert(newSpace == (void *) CACHELINEALIGN(newSpace));
+ Assert(newSpace == (void *) TYPEALIGN(alignment, newSpace));
+ *allocated_size = newFree - rawStart;
return newSpace;
}
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 64c1fb044c6..4939130aab1 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -63,6 +63,12 @@ typedef struct ShmemStructOpts
ssize_t size;
+ /*
+ * Alignment of the starting address. If not set, defaults to cacheline
+ * boundary. Must be a power of two.
+ */
+ size_t alignment;
+
/*
* Extra space to reserve in the shared memory segment, but it's not part
* of the struct itself. This is used for shared memory hash tables that
--
2.47.3
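
To illustrate the arithmetic in the ShmemAllocRaw() change above, here is a small standalone sketch of a bump allocator that rounds the start offset up to a requested power-of-two alignment and reports the bytes consumed including the padding. It is only an approximation under stated assumptions: SketchArena, sketch_alloc, SKETCH_ALIGN_UP and the 128-byte SKETCH_CACHE_LINE_SIZE are hypothetical stand-ins, there is no locking, and it works on offsets only, so it relies on the segment base itself being suitably aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the PostgreSQL definitions. */
#define SKETCH_CACHE_LINE_SIZE 128
#define SKETCH_ALIGN_UP(alignment, value) \
	(((uintptr_t) (value) + ((alignment) - 1)) & ~((uintptr_t) ((alignment) - 1)))

typedef struct SketchArena
{
	size_t		free_offset;	/* first unused byte */
	size_t		total_size;		/* size of the whole segment */
} SketchArena;

/*
 * Reserve "size" bytes at an offset that is a multiple of "alignment".
 * An alignment below the cache line size (including 0) is raised to it.
 *
 * Returns the start offset, or (size_t) -1 if the request does not fit.
 * *allocated_size is set to the bytes consumed, padding included.
 */
static size_t
sketch_alloc(SketchArena *arena, size_t size, size_t alignment,
			 size_t *allocated_size)
{
	size_t		raw_start;
	size_t		new_start;
	size_t		new_free;

	if (alignment < SKETCH_CACHE_LINE_SIZE)
		alignment = SKETCH_CACHE_LINE_SIZE;

	/* The mask trick below only works for powers of two. */
	assert((alignment & (alignment - 1)) == 0);

	raw_start = arena->free_offset;
	new_start = SKETCH_ALIGN_UP(alignment, raw_start);
	new_free = new_start + size;

	*allocated_size = new_free - raw_start;

	if (new_free > arena->total_size)
		return (size_t) -1;

	arena->free_offset = new_free;
	return new_start;
}
```

Because the padding depends on where the previous allocation ended, the worst case is close to one full alignment unit per request, which appears to be why ShmemGetRequestedSize() in the patch adds the alignment to each request's size estimate.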
[text/x-patch] v9-0016-Convert-all-remaining-subsystems-to-use-the-new-A.patch (115.3K, 17-v9-0016-Convert-all-remaining-subsystems-to-use-the-new-A.patch)
From ec51d4d1cc5302352ec3d365bda98c7ead2e8250 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 27 Mar 2026 02:31:06 +0200
Subject: [PATCH v9 16/16] Convert all remaining subsystems to use the new API
---
src/backend/access/common/syncscan.c | 79 ++++----
src/backend/access/nbtree/nbtutils.c | 56 +++---
src/backend/access/transam/twophase.c | 77 +++----
src/backend/access/transam/xlog.c | 86 ++++----
src/backend/access/transam/xlogprefetcher.c | 54 ++---
src/backend/access/transam/xlogrecovery.c | 36 ++--
src/backend/access/transam/xlogwait.c | 52 ++---
src/backend/postmaster/autovacuum.c | 81 ++++----
src/backend/postmaster/bgworker.c | 107 +++++-----
src/backend/postmaster/checkpointer.c | 58 +++---
src/backend/postmaster/pgarch.c | 46 +++--
src/backend/postmaster/walsummarizer.c | 63 +++---
src/backend/replication/logical/launcher.c | 58 +++---
src/backend/replication/logical/logicalctl.c | 30 +--
src/backend/replication/logical/origin.c | 61 +++---
src/backend/replication/logical/slotsync.c | 44 ++--
src/backend/replication/slot.c | 66 +++---
src/backend/replication/walreceiverfuncs.c | 52 ++---
src/backend/replication/walsender.c | 61 +++---
src/backend/storage/aio/aio_init.c | 12 +-
src/backend/storage/aio/method_io_uring.c | 12 +-
src/backend/storage/aio/method_worker.c | 16 +-
src/backend/storage/buffer/buf_init.c | 158 +++++++--------
src/backend/storage/buffer/buf_table.c | 40 ++--
src/backend/storage/buffer/freelist.c | 94 ++++-----
src/backend/storage/ipc/ipci.c | 119 +----------
src/backend/storage/lmgr/lock.c | 126 ++++++------
src/backend/utils/activity/backend_status.c | 190 ++++++++----------
src/backend/utils/activity/pgstat_shmem.c | 161 ++++++++-------
src/include/access/nbtree.h | 2 -
src/include/access/syncscan.h | 2 -
src/include/access/twophase.h | 3 -
src/include/access/xlog.h | 2 -
src/include/access/xlogprefetcher.h | 3 -
src/include/access/xlogrecovery.h | 3 -
src/include/access/xlogwait.h | 2 -
src/include/pgstat.h | 4 -
src/include/postmaster/autovacuum.h | 4 -
src/include/postmaster/bgworker_internals.h | 2 -
src/include/postmaster/bgwriter.h | 3 -
src/include/postmaster/pgarch.h | 2 -
src/include/postmaster/walsummarizer.h | 2 -
src/include/replication/logicalctl.h | 2 -
src/include/replication/logicallauncher.h | 3 -
src/include/replication/origin.h | 4 -
src/include/replication/slot.h | 4 -
src/include/replication/slotsync.h | 2 -
src/include/replication/walreceiver.h | 2 -
src/include/replication/walsender.h | 2 -
src/include/storage/buf_internals.h | 6 +-
src/include/storage/bufmgr.h | 4 -
src/include/storage/lock.h | 2 -
src/include/storage/subsystemlist.h | 26 +++
src/include/utils/backend_status.h | 8 -
.../injection_points/injection_points.c | 60 ++----
55 files changed, 1027 insertions(+), 1227 deletions(-)
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index 6fcfcb0e560..25522284faa 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -50,6 +50,7 @@
#include "miscadmin.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/rel.h"
@@ -111,6 +112,14 @@ typedef struct ss_scan_locations_t
#define SizeOfScanLocations(N) \
(offsetof(ss_scan_locations_t, items) + (N) * sizeof(ss_lru_item_t))
+static void SyncScanShmemRequest(void *arg);
+static void SyncScanShmemInit(void *arg);
+
+const ShmemCallbacks SyncScanShmemCallbacks = {
+ .request_fn = SyncScanShmemRequest,
+ .init_fn = SyncScanShmemInit,
+};
+
/* Pointer to struct in shared memory */
static ss_scan_locations_t *scan_locations;
@@ -120,58 +129,50 @@ static BlockNumber ss_search(RelFileLocator relfilelocator,
/*
- * SyncScanShmemSize --- report amount of shared memory space needed
+ * SyncScanShmemRequest --- register this module's shared memory
*/
-Size
-SyncScanShmemSize(void)
+static void
+SyncScanShmemRequest(void *arg)
{
- return SizeOfScanLocations(SYNC_SCAN_NELEM);
+ static ShmemStructDesc SyncScanShmemDesc;
+
+ ShmemRequestStruct(&SyncScanShmemDesc,
+ .name = "Sync Scan Locations List",
+ .size = SizeOfScanLocations(SYNC_SCAN_NELEM),
+ .ptr = (void **) &scan_locations,
+ );
}
/*
* SyncScanShmemInit --- initialize this module's shared memory
*/
-void
-SyncScanShmemInit(void)
+static void
+SyncScanShmemInit(void *arg)
{
int i;
- bool found;
- scan_locations = (ss_scan_locations_t *)
- ShmemInitStruct("Sync Scan Locations List",
- SizeOfScanLocations(SYNC_SCAN_NELEM),
- &found);
+ scan_locations->head = &scan_locations->items[0];
+ scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
- if (!IsUnderPostmaster)
+ for (i = 0; i < SYNC_SCAN_NELEM; i++)
{
- /* Initialize shared memory area */
- Assert(!found);
-
- scan_locations->head = &scan_locations->items[0];
- scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
-
- for (i = 0; i < SYNC_SCAN_NELEM; i++)
- {
- ss_lru_item_t *item = &scan_locations->items[i];
-
- /*
- * Initialize all slots with invalid values. As scans are started,
- * these invalid entries will fall off the LRU list and get
- * replaced with real entries.
- */
- item->location.relfilelocator.spcOid = InvalidOid;
- item->location.relfilelocator.dbOid = InvalidOid;
- item->location.relfilelocator.relNumber = InvalidRelFileNumber;
- item->location.location = InvalidBlockNumber;
-
- item->prev = (i > 0) ?
- (&scan_locations->items[i - 1]) : NULL;
- item->next = (i < SYNC_SCAN_NELEM - 1) ?
- (&scan_locations->items[i + 1]) : NULL;
- }
+ ss_lru_item_t *item = &scan_locations->items[i];
+
+ /*
+ * Initialize all slots with invalid values. As scans are started,
+ * these invalid entries will fall off the LRU list and get replaced
+ * with real entries.
+ */
+ item->location.relfilelocator.spcOid = InvalidOid;
+ item->location.relfilelocator.dbOid = InvalidOid;
+ item->location.relfilelocator.relNumber = InvalidRelFileNumber;
+ item->location.location = InvalidBlockNumber;
+
+ item->prev = (i > 0) ?
+ (&scan_locations->items[i - 1]) : NULL;
+ item->next = (i < SYNC_SCAN_NELEM - 1) ?
+ (&scan_locations->items[i + 1]) : NULL;
}
- else
- Assert(found);
}
/*
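For readers following the conversion pattern in the hunks above, here is a minimal standalone sketch of the two-phase idea: a request callback only records name, size and destination pointer, and a later pass allocates the memory, publishes the address and runs the init callback. Everything prefixed Toy/toy_ is hypothetical and exists only for illustration. The real ShmemStructDesc, ShmemCallbacks and ShmemRequestStruct are defined elsewhere in this patch series and may well work differently; in particular, using calloc as the allocator and building the named-argument macro on a C99 compound literal are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real definitions. */
typedef struct ToyStructDesc
{
	const char *name;			/* label for the area */
	size_t		size;			/* bytes requested */
	void	  **ptr;			/* where to publish the address */
} ToyStructDesc;

typedef struct ToyCallbacks
{
	void		(*request_fn) (void *arg);
	void		(*init_fn) (void *arg);
} ToyCallbacks;

#define TOY_MAX_REQUESTS 8
static ToyStructDesc *toy_requests[TOY_MAX_REQUESTS];
static int	toy_nrequests = 0;

/* Copy the caller's options into the descriptor and remember it. */
static void
toy_request_struct_impl(ToyStructDesc *desc, const ToyStructDesc *opts)
{
	assert(toy_nrequests < TOY_MAX_REQUESTS);
	*desc = *opts;
	toy_requests[toy_nrequests++] = desc;
}

/* Named-argument call style; a trailing comma before ')' is accepted. */
#define ToyRequestStruct(desc, ...) \
	toy_request_struct_impl((desc), &(ToyStructDesc){__VA_ARGS__})

/*
 * Phase 1: run every request_fn.  Phase 2: allocate each requested area and
 * publish its address.  Phase 3: run every init_fn.  Returns total bytes.
 */
static size_t
toy_setup(const ToyCallbacks **subsystems, int n)
{
	size_t		total = 0;

	for (int i = 0; i < n; i++)
		subsystems[i]->request_fn(NULL);

	for (int i = 0; i < toy_nrequests; i++)
	{
		/* calloc stands in for carving zeroed space out of one segment */
		*toy_requests[i]->ptr = calloc(1, toy_requests[i]->size);
		assert(*toy_requests[i]->ptr != NULL);
		total += toy_requests[i]->size;
	}

	for (int i = 0; i < n; i++)
		if (subsystems[i]->init_fn)
			subsystems[i]->init_fn(NULL);

	return total;
}

/* An example subsystem written in the converted style. */
typedef struct ToyCounter
{
	int			value;
	int			limit;
} ToyCounter;

static ToyCounter *toy_counter;

static void
toy_counter_request(void *arg)
{
	/* The registry keeps a pointer to this, so it must outlive the call. */
	static ToyStructDesc desc;

	(void) arg;
	ToyRequestStruct(&desc,
					 .name = "Toy Counter",
					 .size = sizeof(ToyCounter),
					 .ptr = (void **) &toy_counter,
		);
}

static void
toy_counter_init(void *arg)
{
	(void) arg;
	/* No "found" check: the pointer is already valid when this runs. */
	toy_counter->value = 0;
	toy_counter->limit = 10;
}

static const ToyCallbacks ToyCounterCallbacks = {
	.request_fn = toy_counter_request,
	.init_fn = toy_counter_init,
};
```

A caller would build an array such as `const ToyCallbacks *subsystems[] = {&ToyCounterCallbacks};` and pass it to `toy_setup(subsystems, 1)`. The point of the split is that the init callback no longer needs the `found` / `IsUnderPostmaster` branching that the removed code carried.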
diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c
index 732bc750c9e..ecd4fb4df6f 100644
--- a/src/backend/access/nbtree/nbtutils.c
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -25,6 +25,7 @@
#include "lib/qunique.h"
#include "miscadmin.h"
#include "storage/lwlock.h"
+#include "storage/subsystems.h"
#include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -417,6 +418,13 @@ typedef struct BTVacInfo
static BTVacInfo *btvacinfo;
+static void BTreeShmemRequest(void *arg);
+static void BTreeShmemInit(void *arg);
+
+const ShmemCallbacks BTreeShmemCallbacks = {
+ .request_fn = BTreeShmemRequest,
+ .init_fn = BTreeShmemInit,
+};
/*
* _bt_vacuum_cycleid --- get the active vacuum cycle ID for an index,
@@ -553,47 +561,39 @@ _bt_end_vacuum_callback(int code, Datum arg)
}
/*
- * BTreeShmemSize --- report amount of shared memory space needed
+ * BTreeShmemRequest --- register this module's shared memory
*/
-Size
-BTreeShmemSize(void)
+static void
+BTreeShmemRequest(void *arg)
{
+ static ShmemStructDesc BTreeShmemDesc;
Size size;
size = offsetof(BTVacInfo, vacuums);
size = add_size(size, mul_size(MaxBackends, sizeof(BTOneVacInfo)));
- return size;
+
+ ShmemRequestStruct(&BTreeShmemDesc,
+ .name = "BTree Vacuum State",
+ .size = size,
+ .ptr = (void **) &btvacinfo,
+ );
}
/*
* BTreeShmemInit --- initialize this module's shared memory
*/
-void
-BTreeShmemInit(void)
+static void
+BTreeShmemInit(void *arg)
{
- bool found;
-
- btvacinfo = (BTVacInfo *) ShmemInitStruct("BTree Vacuum State",
- BTreeShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- /* Initialize shared memory area */
- Assert(!found);
-
- /*
- * It doesn't really matter what the cycle counter starts at, but
- * having it always start the same doesn't seem good. Seed with
- * low-order bits of time() instead.
- */
- btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
+ /*
+ * It doesn't really matter what the cycle counter starts at, but having
+ * it always start the same doesn't seem good. Seed with low-order bits
+ * of time() instead.
+ */
+ btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
- btvacinfo->num_vacuums = 0;
- btvacinfo->max_vacuums = MaxBackends;
- }
- else
- Assert(found);
+ btvacinfo->num_vacuums = 0;
+ btvacinfo->max_vacuums = MaxBackends;
}
bytea *
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index ab1cbd67bac..f07cdae0325 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -102,6 +102,7 @@
#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
#include "utils/memutils.h"
@@ -187,8 +188,16 @@ typedef struct TwoPhaseStateData
GlobalTransaction prepXacts[FLEXIBLE_ARRAY_MEMBER];
} TwoPhaseStateData;
+static void TwoPhaseShmemRequest(void *arg);
+static void TwoPhaseShmemInit(void *arg);
+
static TwoPhaseStateData *TwoPhaseState;
+const ShmemCallbacks TwoPhaseShmemCallbacks = {
+ .request_fn = TwoPhaseShmemRequest,
+ .init_fn = TwoPhaseShmemInit,
+};
+
/*
* Global transaction entry currently locked by us, if any. Note that any
* access to the entry pointed to by this variable must be protected by
@@ -234,11 +243,12 @@ static void RemoveTwoPhaseFile(FullTransactionId fxid, bool giveWarning);
static void RecreateTwoPhaseFile(FullTransactionId fxid, void *content, int len);
/*
- * Initialization of shared memory
+ * Register shared memory for two-phase state.
*/
-Size
-TwoPhaseShmemSize(void)
+static void
+TwoPhaseShmemRequest(void *arg)
{
+ static ShmemStructDesc TwoPhaseShmemDesc;
Size size;
/* Need the fixed struct, the array of pointers, and the GTD structs */
@@ -248,46 +258,41 @@ TwoPhaseShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_prepared_xacts,
sizeof(GlobalTransactionData)));
-
- return size;
+ ShmemRequestStruct(&TwoPhaseShmemDesc,
+ .name = "Prepared Transaction Table",
+ .size = size,
+ .ptr = (void **) &TwoPhaseState,
+ );
}
-void
-TwoPhaseShmemInit(void)
+/*
+ * Initialize shared memory for two-phase state.
+ */
+static void
+TwoPhaseShmemInit(void *arg)
{
- bool found;
-
- TwoPhaseState = ShmemInitStruct("Prepared Transaction Table",
- TwoPhaseShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- GlobalTransaction gxacts;
- int i;
+ GlobalTransaction gxacts;
+ int i;
- Assert(!found);
- TwoPhaseState->freeGXacts = NULL;
- TwoPhaseState->numPrepXacts = 0;
+ TwoPhaseState->freeGXacts = NULL;
+ TwoPhaseState->numPrepXacts = 0;
- /*
- * Initialize the linked list of free GlobalTransactionData structs
- */
- gxacts = (GlobalTransaction)
- ((char *) TwoPhaseState +
- MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
- sizeof(GlobalTransaction) * max_prepared_xacts));
- for (i = 0; i < max_prepared_xacts; i++)
- {
- /* insert into linked list */
- gxacts[i].next = TwoPhaseState->freeGXacts;
- TwoPhaseState->freeGXacts = &gxacts[i];
+ /*
+ * Initialize the linked list of free GlobalTransactionData structs
+ */
+ gxacts = (GlobalTransaction)
+ ((char *) TwoPhaseState +
+ MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
+ sizeof(GlobalTransaction) * max_prepared_xacts));
+ for (i = 0; i < max_prepared_xacts; i++)
+ {
+ /* insert into linked list */
+ gxacts[i].next = TwoPhaseState->freeGXacts;
+ TwoPhaseState->freeGXacts = &gxacts[i];
- /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
- gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
- }
+ /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
+ gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
}
- else
- Assert(found);
}
/*
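The size computation and the free-list setup in the twophase.c hunk have to agree on one layout: fixed header, array of pointers, alignment padding, then the array of structs. A rough standalone sketch of that arrangement follows. All toy_/Toy names are hypothetical, the 8-byte alignment is an assumption for the example, and the helpers return a failure flag on overflow where PostgreSQL's add_size() and mul_size() raise an error instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed maximum alignment for this sketch only. */
#define TOY_ALIGNOF 8
#define TOY_MAXALIGN(len) \
	(((size_t) (len) + (TOY_ALIGNOF - 1)) & ~((size_t) (TOY_ALIGNOF - 1)))

/* Overflow-checked addition: returns 0 on overflow, 1 on success. */
static int
toy_add_size(size_t a, size_t b, size_t *out)
{
	if (b > SIZE_MAX - a)
		return 0;
	*out = a + b;
	return 1;
}

/* Overflow-checked multiplication: returns 0 on overflow, 1 on success. */
static int
toy_mul_size(size_t a, size_t b, size_t *out)
{
	if (a != 0 && b > SIZE_MAX / a)
		return 0;
	*out = a * b;
	return 1;
}

typedef struct ToyEntry
{
	struct ToyEntry *next;		/* free-list link */
	int			slot;			/* index within the entry array */
} ToyEntry;

typedef struct ToyState
{
	ToyEntry   *free_list;
	int			nused;
	ToyEntry   *used[];			/* pointer array, then the entries follow */
} ToyState;

/* Header + pointer array, rounded up, + entry array. */
static int
toy_state_size(size_t nentries, size_t *out)
{
	size_t		size = offsetof(ToyState, used);
	size_t		part;

	if (!toy_mul_size(nentries, sizeof(ToyEntry *), &part) ||
		!toy_add_size(size, part, &size))
		return 0;
	if (size > SIZE_MAX - (TOY_ALIGNOF - 1))
		return 0;				/* rounding up would wrap around */
	size = TOY_MAXALIGN(size);
	if (!toy_mul_size(nentries, sizeof(ToyEntry), &part) ||
		!toy_add_size(size, part, &size))
		return 0;
	*out = size;
	return 1;
}

/* Locate the entry array with the same arithmetic and chain it up. */
static void
toy_state_init(ToyState *state, size_t nentries)
{
	ToyEntry   *entries = (ToyEntry *)
		((char *) state +
		 TOY_MAXALIGN(offsetof(ToyState, used) +
					  sizeof(ToyEntry *) * nentries));

	state->free_list = NULL;
	state->nused = 0;
	for (size_t i = 0; i < nentries; i++)
	{
		/* push onto the head of the free list */
		entries[i].slot = (int) i;
		entries[i].next = state->free_list;
		state->free_list = &entries[i];
	}
}
```

If the offset used in the init step ever drifts from the one used in the size step, the entry array overlaps the pointer array or runs past the allocation, which is why both sides repeat the same MAXALIGN expression.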
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 2c1c6f88b74..190786c0d30 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -94,6 +94,7 @@
#include "storage/procarray.h"
#include "storage/reinit.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/guc_tables.h"
@@ -566,6 +567,16 @@ typedef enum
WALINSERT_SPECIAL_CHECKPOINT
} WalInsertClass;
+static void XLOGShmemRequest(void *arg);
+static void XLOGShmemInit(void *arg);
+static void XLOGShmemAttach(void *arg);
+
+const ShmemCallbacks XLOGShmemCallbacks = {
+ .request_fn = XLOGShmemRequest,
+ .init_fn = XLOGShmemInit,
+ .attach_fn = XLOGShmemAttach,
+};
+
static XLogCtlData *XLogCtl = NULL;
/* a private copy of XLogCtl->Insert.WALInsertLocks, for convenience */
@@ -574,6 +585,7 @@ static WALInsertLockPadded *WALInsertLocks = NULL;
/*
* We maintain an image of pg_control in shared memory.
*/
+static ControlFileData *LocalControlFile = NULL;
static ControlFileData *ControlFile = NULL;
/*
@@ -4923,7 +4935,8 @@ void
LocalProcessControlFile(bool reset)
{
Assert(reset || ControlFile == NULL);
- ControlFile = palloc_object(ControlFileData);
+ LocalControlFile = palloc_object(ControlFileData);
+ ControlFile = LocalControlFile;
ReadControlFile();
}
@@ -4939,11 +4952,13 @@ GetActiveWalLevelOnStandby(void)
}
/*
- * Initialization of shared memory for XLOG
+ * Register shared memory for XLOG.
*/
-Size
-XLOGShmemSize(void)
+static void
+XLOGShmemRequest(void *arg)
{
+ static ShmemStructDesc XLogCtlShmemDesc;
+ static ShmemStructDesc ControlFileShmemDesc;
Size size;
/*
@@ -4982,23 +4997,26 @@ XLOGShmemSize(void)
/* and the buffers themselves */
size = add_size(size, mul_size(XLOG_BLCKSZ, XLOGbuffers));
- /*
- * Note: we don't count ControlFileData, it comes out of the "slop factor"
- * added by CreateSharedMemoryAndSemaphores. This lets us use this
- * routine again below to compute the actual allocation size.
- */
-
- return size;
+ ShmemRequestStruct(&XLogCtlShmemDesc,
+ .name = "XLOG Ctl",
+ .size = size,
+ .ptr = (void **) &XLogCtl,
+ );
+ ShmemRequestStruct(&ControlFileShmemDesc,
+ .name = "Control File",
+ .size = sizeof(ControlFileData),
+ .ptr = (void **) &ControlFile,
+ );
}
-void
-XLOGShmemInit(void)
+/*
+ * XLOGShmemInit - initialize the XLogCtl shared memory area.
+ */
+static void
+XLOGShmemInit(void *arg)
{
- bool foundCFile,
- foundXLog;
char *allocptr;
int i;
- ControlFileData *localControlFile;
#ifdef WAL_DEBUG
@@ -5016,36 +5034,17 @@ XLOGShmemInit(void)
}
#endif
-
- XLogCtl = (XLogCtlData *)
- ShmemInitStruct("XLOG Ctl", XLOGShmemSize(), &foundXLog);
-
- localControlFile = ControlFile;
- ControlFile = (ControlFileData *)
- ShmemInitStruct("Control File", sizeof(ControlFileData), &foundCFile);
-
- if (foundCFile || foundXLog)
- {
- /* both should be present or neither */
- Assert(foundCFile && foundXLog);
-
- /* Initialize local copy of WALInsertLocks */
- WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
-
- if (localControlFile)
- pfree(localControlFile);
- return;
- }
memset(XLogCtl, 0, sizeof(XLogCtlData));
/*
* Already have read control file locally, unless in bootstrap mode. Move
* contents into shared memory.
*/
- if (localControlFile)
+ if (LocalControlFile)
{
- memcpy(ControlFile, localControlFile, sizeof(ControlFileData));
- pfree(localControlFile);
+ memcpy(ControlFile, LocalControlFile, sizeof(ControlFileData));
+ pfree(LocalControlFile);
+ LocalControlFile = NULL;
}
/*
@@ -5102,6 +5101,15 @@ XLOGShmemInit(void)
pg_atomic_init_u64(&XLogCtl->unloggedLSN, InvalidXLogRecPtr);
}
+/*
+ * XLOGShmemAttach - set up WALInsertLocks pointer after attaching.
+ */
+static void
+XLOGShmemAttach(void *arg)
+{
+ WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
+}
+
/*
* This func must be called ONCE on system install. It creates pg_control
* and the initial XLOG segment.
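Two details of the xlog.c hunk may be easier to see in isolation. First, the staging copy of the control file has to be tracked in its own file-level pointer, presumably because the shared pointer is now overwritten through .ptr before the init callback runs, so the old trick of saving it in a local variable is no longer available. Second, a per-process convenience pointer derived from shared state is set in a separate attach callback. The sketch below models both with hypothetical toy_ names. Which processes actually run attach_fn, and whether the creating process runs it as well, is not visible in this hunk and is treated as an assumption here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct ToyControl
{
	int			version;
	long		checkpoint;
} ToyControl;

typedef struct ToyCtl
{
	int			nlocks;
	int			locks[4];
} ToyCtl;

/* Process-local staging copy, filled before shared memory exists. */
static ToyControl *toy_local_control = NULL;

/* Points at the staging copy first, at the shared copy later. */
static ToyControl *toy_control = NULL;

/* Shared state and a per-process shortcut into it. */
static ToyCtl *toy_ctl = NULL;
static int *toy_locks = NULL;

/* Early startup: read the control data into private memory. */
static void
toy_read_control_locally(void)
{
	toy_local_control = malloc(sizeof(ToyControl));
	assert(toy_local_control != NULL);
	toy_local_control->version = 1;
	toy_local_control->checkpoint = 42;
	toy_control = toy_local_control;
}

/*
 * Init callback: runs once, after toy_ctl and toy_control have been
 * repointed at shared memory by whoever allocated it.
 */
static void
toy_init(void)
{
	memset(toy_ctl, 0, sizeof(ToyCtl));
	toy_ctl->nlocks = 4;

	if (toy_local_control)
	{
		/* move the staged contents into shared memory, then drop the copy */
		memcpy(toy_control, toy_local_control, sizeof(ToyControl));
		free(toy_local_control);
		toy_local_control = NULL;
	}
}

/* Attach callback: recompute pointers that are private to this process. */
static void
toy_attach(void)
{
	toy_locks = toy_ctl->locks;
}
```

In a driver, the allocator stand-in would assign `toy_control` and `toy_ctl` to the shared addresses between `toy_read_control_locally()` and `toy_init()`, then call `toy_attach()` in each process that needs `toy_locks`.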
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index c235eca7c51..5c2d3cdcdc9 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -39,6 +39,7 @@
#include "storage/fd.h"
#include "storage/shmem.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/hsearch.h"
@@ -200,6 +201,14 @@ static LsnReadQueueNextStatus XLogPrefetcherNextBlock(uintptr_t pgsr_private,
static XLogPrefetchStats *SharedStats;
+static void XLogPrefetchShmemRequest(void *arg);
+static void XLogPrefetchShmemInit(void *arg);
+
+const ShmemCallbacks XLogPrefetchShmemCallbacks = {
+ .request_fn = XLogPrefetchShmemRequest,
+ .init_fn = XLogPrefetchShmemInit,
+};
+
static inline LsnReadQueue *
lrq_alloc(uint32 max_distance,
uint32 max_inflight,
@@ -292,10 +301,28 @@ lrq_complete_lsn(LsnReadQueue *lrq, XLogRecPtr lsn)
lrq_prefetch(lrq);
}
-size_t
-XLogPrefetchShmemSize(void)
+static void
+XLogPrefetchShmemRequest(void *arg)
+{
+ static ShmemStructDesc XLogPrefetchShmemDesc;
+
+ ShmemRequestStruct(&XLogPrefetchShmemDesc,
+ .name = "XLogPrefetchStats",
+ .size = sizeof(XLogPrefetchStats),
+ .ptr = (void **) &SharedStats,
+ );
+}
+
+static void
+XLogPrefetchShmemInit(void *arg)
{
- return sizeof(XLogPrefetchStats);
+ pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
+ pg_atomic_init_u64(&SharedStats->prefetch, 0);
+ pg_atomic_init_u64(&SharedStats->hit, 0);
+ pg_atomic_init_u64(&SharedStats->skip_init, 0);
+ pg_atomic_init_u64(&SharedStats->skip_new, 0);
+ pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
+ pg_atomic_init_u64(&SharedStats->skip_rep, 0);
}
/*
@@ -313,27 +340,6 @@ XLogPrefetchResetStats(void)
pg_atomic_write_u64(&SharedStats->skip_rep, 0);
}
-void
-XLogPrefetchShmemInit(void)
-{
- bool found;
-
- SharedStats = (XLogPrefetchStats *)
- ShmemInitStruct("XLogPrefetchStats",
- sizeof(XLogPrefetchStats),
- &found);
-
- if (!found)
- {
- pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
- pg_atomic_init_u64(&SharedStats->prefetch, 0);
- pg_atomic_init_u64(&SharedStats->hit, 0);
- pg_atomic_init_u64(&SharedStats->skip_init, 0);
- pg_atomic_init_u64(&SharedStats->skip_new, 0);
- pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
- pg_atomic_init_u64(&SharedStats->skip_rep, 0);
- }
-}
/*
* Called when any GUC is changed that affects prefetching.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index fd1c36d061d..e3d04e1f0df 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -58,6 +58,7 @@
#include "storage/pmsignal.h"
#include "storage/procarray.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/datetime.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
@@ -307,6 +308,14 @@ static char *primary_image_masked = NULL;
XLogRecoveryCtlData *XLogRecoveryCtl = NULL;
+static void XLogRecoveryShmemRequest(void *arg);
+static void XLogRecoveryShmemInit(void *arg);
+
+const ShmemCallbacks XLogRecoveryShmemCallbacks = {
+ .request_fn = XLogRecoveryShmemRequest,
+ .init_fn = XLogRecoveryShmemInit,
+};
+
/*
* abortedRecPtr is the start pointer of a broken record at end of WAL when
* recovery completes; missingContrecPtr is the location of the first
@@ -385,28 +394,23 @@ static void SetCurrentChunkStartTime(TimestampTz xtime);
static void SetLatestXTime(TimestampTz xtime);
/*
- * Initialization of shared memory for WAL recovery
+ * Register shared memory for WAL recovery
*/
-Size
-XLogRecoveryShmemSize(void)
+static void
+XLogRecoveryShmemRequest(void *arg)
{
- Size size;
-
- /* XLogRecoveryCtl */
- size = sizeof(XLogRecoveryCtlData);
+ static ShmemStructDesc XLogRecoveryShmemDesc;
- return size;
+ ShmemRequestStruct(&XLogRecoveryShmemDesc,
+ .name = "XLOG Recovery Ctl",
+ .size = sizeof(XLogRecoveryCtlData),
+ .ptr = (void **) &XLogRecoveryCtl,
+ );
}
-void
-XLogRecoveryShmemInit(void)
+static void
+XLogRecoveryShmemInit(void *arg)
{
- bool found;
-
- XLogRecoveryCtl = (XLogRecoveryCtlData *)
- ShmemInitStruct("XLOG Recovery Ctl", XLogRecoveryShmemSize(), &found);
- if (found)
- return;
memset(XLogRecoveryCtl, 0, sizeof(XLogRecoveryCtlData));
SpinLockInit(&XLogRecoveryCtl->info_lck);
diff --git a/src/backend/access/transam/xlogwait.c b/src/backend/access/transam/xlogwait.c
index bf4630677b4..3af9f10133c 100644
--- a/src/backend/access/transam/xlogwait.c
+++ b/src/backend/access/transam/xlogwait.c
@@ -57,6 +57,7 @@
#include "storage/latch.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/pg_lsn.h"
#include "utils/snapmgr.h"
@@ -68,6 +69,14 @@ static int waitlsn_cmp(const pairingheap_node *a, const pairingheap_node *b,
struct WaitLSNState *waitLSNState = NULL;
+static void WaitLSNShmemRequest(void *arg);
+static void WaitLSNShmemInit(void *arg);
+
+const ShmemCallbacks WaitLSNShmemCallbacks = {
+ .request_fn = WaitLSNShmemRequest,
+ .init_fn = WaitLSNShmemInit,
+};
+
/*
* Wait event for each WaitLSNType, used with WaitLatch() to report
* the wait in pg_stat_activity.
@@ -109,41 +118,36 @@ GetCurrentLSNForWaitType(WaitLSNType lsnType)
pg_unreachable();
}
-/* Report the amount of shared memory space needed for WaitLSNState. */
-Size
-WaitLSNShmemSize(void)
+/* Register the shared memory space needed for WaitLSNState. */
+static void
+WaitLSNShmemRequest(void *arg)
{
+ static ShmemStructDesc WaitLSNShmemDesc;
Size size;
size = offsetof(WaitLSNState, procInfos);
size = add_size(size, mul_size(MaxBackends + NUM_AUXILIARY_PROCS, sizeof(WaitLSNProcInfo)));
- return size;
+ ShmemRequestStruct(&WaitLSNShmemDesc,
+ .name = "WaitLSNState",
+ .size = size,
+ .ptr = (void **) &waitLSNState,
+ );
}
/* Initialize the WaitLSNState in the shared memory. */
-void
-WaitLSNShmemInit(void)
+static void
+WaitLSNShmemInit(void *arg)
{
- bool found;
-
- waitLSNState = (WaitLSNState *) ShmemInitStruct("WaitLSNState",
- WaitLSNShmemSize(),
- &found);
- if (!found)
+ /* Initialize heaps and tracking */
+ for (int i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
{
- int i;
-
- /* Initialize heaps and tracking */
- for (i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
- {
- pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
- pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
- }
-
- /* Initialize process info array */
- memset(&waitLSNState->procInfos, 0,
- (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
+ pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
+ pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
}
+
+ /* Initialize process info array */
+ memset(&waitLSNState->procInfos, 0,
+ (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 6694f485216..f88f5aaf767 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -98,6 +98,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/fmgroids.h"
#include "utils/fmgrprotos.h"
@@ -309,6 +310,14 @@ typedef struct
static AutoVacuumShmemStruct *AutoVacuumShmem;
+static void AutoVacuumShmemRequest(void *arg);
+static void AutoVacuumShmemInit(void *arg);
+
+const ShmemCallbacks AutoVacuumShmemCallbacks = {
+ .request_fn = AutoVacuumShmemRequest,
+ .init_fn = AutoVacuumShmemInit,
+};
+
/*
* the database list (of avl_dbase elements) in the launcher, and the context
* that contains it
@@ -3543,12 +3552,13 @@ autovac_init(void)
}
/*
- * AutoVacuumShmemSize
- * Compute space needed for autovacuum-related shared memory
+ * AutoVacuumShmemRequest
+ * Register shared memory space needed for autovacuum
*/
-Size
-AutoVacuumShmemSize(void)
+static void
+AutoVacuumShmemRequest(void *arg)
{
+ static ShmemStructDesc AutoVacuumShmemDesc;
Size size;
/*
@@ -3558,53 +3568,42 @@ AutoVacuumShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
- return size;
+
+ ShmemRequestStruct(&AutoVacuumShmemDesc,
+ .name = "AutoVacuum Data",
+ .size = size,
+ .ptr = (void **) &AutoVacuumShmem,
+ );
}
/*
* AutoVacuumShmemInit
- * Allocate and initialize autovacuum-related shared memory
+ * Initialize autovacuum-related shared memory
*/
-void
-AutoVacuumShmemInit(void)
+static void
+AutoVacuumShmemInit(void *arg)
{
- bool found;
-
- AutoVacuumShmem = (AutoVacuumShmemStruct *)
- ShmemInitStruct("AutoVacuum Data",
- AutoVacuumShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- WorkerInfo worker;
- int i;
+ WorkerInfo worker;
- Assert(!found);
-
- AutoVacuumShmem->av_launcherpid = 0;
- dclist_init(&AutoVacuumShmem->av_freeWorkers);
- dlist_init(&AutoVacuumShmem->av_runningWorkers);
- AutoVacuumShmem->av_startingWorker = NULL;
- memset(AutoVacuumShmem->av_workItems, 0,
- sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
-
- worker = (WorkerInfo) ((char *) AutoVacuumShmem +
- MAXALIGN(sizeof(AutoVacuumShmemStruct)));
-
- /* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_worker_slots; i++)
- {
- dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
- pg_atomic_init_flag(&worker[i].wi_dobalance);
- }
+ AutoVacuumShmem->av_launcherpid = 0;
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
+ dlist_init(&AutoVacuumShmem->av_runningWorkers);
+ AutoVacuumShmem->av_startingWorker = NULL;
+ memset(AutoVacuumShmem->av_workItems, 0,
+ sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
- pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+ worker = (WorkerInfo) ((char *) AutoVacuumShmem +
+ MAXALIGN(sizeof(AutoVacuumShmemStruct)));
+ /* initialize the WorkerInfo free list */
+ for (int i = 0; i < autovacuum_worker_slots; i++)
+ {
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
+ pg_atomic_init_flag(&worker[i].wi_dobalance);
}
- else
- Assert(found);
+
+ pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
}
/*
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index f2a62489d9c..ae577eae13b 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -29,6 +29,7 @@
#include "storage/procarray.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/ascii.h"
#include "utils/memutils.h"
@@ -109,6 +110,14 @@ struct BackgroundWorkerHandle
static BackgroundWorkerArray *BackgroundWorkerData;
+static void BackgroundWorkerShmemRequest(void *arg);
+static void BackgroundWorkerShmemInit(void *arg);
+
+const ShmemCallbacks BackgroundWorkerShmemCallbacks = {
+ .request_fn = BackgroundWorkerShmemRequest,
+ .init_fn = BackgroundWorkerShmemInit,
+};
+
/*
* List of internal background worker entry points. We need this for
* reasons explained in LookupBackgroundWorkerFunction(), below.
@@ -152,77 +161,71 @@ static bgworker_main_type LookupBackgroundWorkerFunction(const char *libraryname
/*
- * Calculate shared memory needed.
+ * Register shared memory needed for background workers.
*/
-Size
-BackgroundWorkerShmemSize(void)
+static void
+BackgroundWorkerShmemRequest(void *arg)
{
+ static ShmemStructDesc BackgroundWorkerShmemDesc;
Size size;
/* Array of workers is variably sized. */
size = offsetof(BackgroundWorkerArray, slot);
size = add_size(size, mul_size(max_worker_processes,
sizeof(BackgroundWorkerSlot)));
-
- return size;
+ ShmemRequestStruct(&BackgroundWorkerShmemDesc,
+ .name = "Background Worker Data",
+ .size = size,
+ .ptr = (void **) &BackgroundWorkerData,
+ );
}
/*
- * Initialize shared memory.
+ * Initialize shared memory for background workers.
*/
-void
-BackgroundWorkerShmemInit(void)
+static void
+BackgroundWorkerShmemInit(void *arg)
{
- bool found;
-
- BackgroundWorkerData = ShmemInitStruct("Background Worker Data",
- BackgroundWorkerShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- dlist_iter iter;
- int slotno = 0;
+ dlist_iter iter;
+ int slotno = 0;
- BackgroundWorkerData->total_slots = max_worker_processes;
- BackgroundWorkerData->parallel_register_count = 0;
- BackgroundWorkerData->parallel_terminate_count = 0;
+ BackgroundWorkerData->total_slots = max_worker_processes;
+ BackgroundWorkerData->parallel_register_count = 0;
+ BackgroundWorkerData->parallel_terminate_count = 0;
- /*
- * Copy contents of worker list into shared memory. Record the shared
- * memory slot assigned to each worker. This ensures a 1-to-1
- * correspondence between the postmaster's private list and the array
- * in shared memory.
- */
- dlist_foreach(iter, &BackgroundWorkerList)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- RegisteredBgWorker *rw;
+ /*
+ * Copy contents of worker list into shared memory. Record the shared
+ * memory slot assigned to each worker. This ensures a 1-to-1
+ * correspondence between the postmaster's private list and the array in
+ * shared memory.
+ */
+ dlist_foreach(iter, &BackgroundWorkerList)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ RegisteredBgWorker *rw;
- rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
- Assert(slotno < max_worker_processes);
- slot->in_use = true;
- slot->terminate = false;
- slot->pid = InvalidPid;
- slot->generation = 0;
- rw->rw_shmem_slot = slotno;
- rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
- memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
- ++slotno;
- }
+ rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
+ Assert(slotno < max_worker_processes);
+ slot->in_use = true;
+ slot->terminate = false;
+ slot->pid = InvalidPid;
+ slot->generation = 0;
+ rw->rw_shmem_slot = slotno;
+ rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
+ memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
+ ++slotno;
+ }
- /*
- * Mark any remaining slots as not in use.
- */
- while (slotno < max_worker_processes)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ /*
+ * Mark any remaining slots as not in use.
+ */
+ while (slotno < max_worker_processes)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- slot->in_use = false;
- ++slotno;
- }
+ slot->in_use = false;
+ ++slotno;
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 3c982c6ffac..da62fcef421 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -63,6 +63,7 @@
#include "storage/shmem.h"
#include "storage/smgr.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/guc.h"
#include "utils/memutils.h"
@@ -143,6 +144,14 @@ typedef struct
static CheckpointerShmemStruct *CheckpointerShmem;
+static void CheckpointerShmemRequest(void *arg);
+static void CheckpointerShmemInit(void *arg);
+
+const ShmemCallbacks CheckpointerShmemCallbacks = {
+ .request_fn = CheckpointerShmemRequest,
+ .init_fn = CheckpointerShmemInit,
+};
+
/* interval for calling AbsorbSyncRequests in CheckpointWriteDelay */
#define WRITES_PER_ABSORB 1000
@@ -950,12 +959,13 @@ ReqShutdownXLOG(SIGNAL_ARGS)
*/
/*
- * CheckpointerShmemSize
- * Compute space needed for checkpointer-related shared memory
+ * CheckpointerShmemRequest
+ * Register shared memory space needed for checkpointer
*/
-Size
-CheckpointerShmemSize(void)
+static void
+CheckpointerShmemRequest(void *arg)
{
+ static ShmemStructDesc CheckpointerShmemDesc;
Size size;
/*
@@ -967,39 +977,25 @@ CheckpointerShmemSize(void)
size = add_size(size, mul_size(Min(NBuffers,
MAX_CHECKPOINT_REQUESTS),
sizeof(CheckpointerRequest)));
-
- return size;
+ ShmemRequestStruct(&CheckpointerShmemDesc,
+ .name = "Checkpointer Data",
+ .size = size,
+ .ptr = (void **) &CheckpointerShmem,
+ );
}
/*
* CheckpointerShmemInit
- * Allocate and initialize checkpointer-related shared memory
+ * Initialize checkpointer-related shared memory
*/
-void
-CheckpointerShmemInit(void)
+static void
+CheckpointerShmemInit(void *arg)
{
- Size size = CheckpointerShmemSize();
- bool found;
-
- CheckpointerShmem = (CheckpointerShmemStruct *)
- ShmemInitStruct("Checkpointer Data",
- size,
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. Note that we zero the whole
- * requests array; this is so that CompactCheckpointerRequestQueue can
- * assume that any pad bytes in the request structs are zeroes.
- */
- MemSet(CheckpointerShmem, 0, size);
- SpinLockInit(&CheckpointerShmem->ckpt_lck);
- CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
- CheckpointerShmem->head = CheckpointerShmem->tail = 0;
- ConditionVariableInit(&CheckpointerShmem->start_cv);
- ConditionVariableInit(&CheckpointerShmem->done_cv);
- }
+ SpinLockInit(&CheckpointerShmem->ckpt_lck);
+ CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
+ CheckpointerShmem->head = CheckpointerShmem->tail = 0;
+ ConditionVariableInit(&CheckpointerShmem->start_cv);
+ ConditionVariableInit(&CheckpointerShmem->done_cv);
}
/*
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index fa4bdfe9ab9..888a53e96d0 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -48,6 +48,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
@@ -154,33 +155,34 @@ static int ready_file_comparator(Datum a, Datum b, void *arg);
static void LoadArchiveLibrary(void);
static void pgarch_call_module_shutdown_cb(int code, Datum arg);
-/* Report shared memory space needed by PgArchShmemInit */
-Size
-PgArchShmemSize(void)
-{
- Size size = 0;
-
- size = add_size(size, sizeof(PgArchData));
+static void PgArchShmemRequest(void *arg);
+static void PgArchShmemInit(void *arg);
- return size;
-}
+const ShmemCallbacks PgArchShmemCallbacks = {
+ .request_fn = PgArchShmemRequest,
+ .init_fn = PgArchShmemInit,
+};
-/* Allocate and initialize archiver-related shared memory */
-void
-PgArchShmemInit(void)
+/* Register shared memory space needed by the archiver */
+static void
+PgArchShmemRequest(void *arg)
{
- bool found;
+ static ShmemStructDesc PgArchShmemDesc;
- PgArch = (PgArchData *)
- ShmemInitStruct("Archiver Data", PgArchShmemSize(), &found);
+ ShmemRequestStruct(&PgArchShmemDesc,
+ .name = "Archiver Data",
+ .size = sizeof(PgArchData),
+ .ptr = (void **) &PgArch,
+ );
+}
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(PgArch, 0, PgArchShmemSize());
- PgArch->pgprocno = INVALID_PROC_NUMBER;
- pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
- }
+/* Initialize archiver-related shared memory */
+static void
+PgArchShmemInit(void *arg)
+{
+ MemSet(PgArch, 0, sizeof(PgArchData));
+ PgArch->pgprocno = INVALID_PROC_NUMBER;
+ pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
}
/*
diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c
index a37b3018abf..9f87bd0f243 100644
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -47,6 +47,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -109,6 +110,14 @@ typedef struct
/* Pointer to shared memory state. */
static WalSummarizerData *WalSummarizerCtl;
+static void WalSummarizerShmemRequest(void *arg);
+static void WalSummarizerShmemInit(void *arg);
+
+const ShmemCallbacks WalSummarizerShmemCallbacks = {
+ .request_fn = WalSummarizerShmemRequest,
+ .init_fn = WalSummarizerShmemInit,
+};
+
/*
* When we reach end of WAL and need to read more, we sleep for a number of
* milliseconds that is an integer multiple of MS_PER_SLEEP_QUANTUM. This is
@@ -168,43 +177,37 @@ static void summarizer_wait_for_wal(void);
static void MaybeRemoveOldWalSummaries(void);
/*
- * Amount of shared memory required for this module.
+ * Register shared memory space needed by this module.
*/
-Size
-WalSummarizerShmemSize(void)
+static void
+WalSummarizerShmemRequest(void *arg)
{
- return sizeof(WalSummarizerData);
+ static ShmemStructDesc WalSummarizerShmemDesc;
+
+ ShmemRequestStruct(&WalSummarizerShmemDesc,
+ .name = "Wal Summarizer Ctl",
+ .size = sizeof(WalSummarizerData),
+ .ptr = (void **) &WalSummarizerCtl,
+ );
}
/*
- * Create or attach to shared memory segment for this module.
+ * Initialize shared memory for this module.
*/
-void
-WalSummarizerShmemInit(void)
+static void
+WalSummarizerShmemInit(void *arg)
{
- bool found;
-
- WalSummarizerCtl = (WalSummarizerData *)
- ShmemInitStruct("Wal Summarizer Ctl", WalSummarizerShmemSize(),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize.
- *
- * We're just filling in dummy values here -- the real initialization
- * will happen when GetOldestUnsummarizedLSN() is called for the first
- * time.
- */
- WalSummarizerCtl->initialized = false;
- WalSummarizerCtl->summarized_tli = 0;
- WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
- WalSummarizerCtl->lsn_is_exact = false;
- WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
- WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
- ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
- }
+ /*
+ * We're just filling in dummy values here -- the real initialization will
+ * happen when GetOldestUnsummarizedLSN() is called for the first time.
+ */
+ WalSummarizerCtl->initialized = false;
+ WalSummarizerCtl->summarized_tli = 0;
+ WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
+ WalSummarizerCtl->lsn_is_exact = false;
+ WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
+ WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
+ ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
}
/*
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 09964198550..f1773279991 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -38,6 +38,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -71,6 +72,14 @@ typedef struct LogicalRepCtxStruct
static LogicalRepCtxStruct *LogicalRepCtx;
+static void ApplyLauncherShmemRequest(void *arg);
+static void ApplyLauncherShmemInit(void *arg);
+
+const ShmemCallbacks ApplyLauncherShmemCallbacks = {
+ .request_fn = ApplyLauncherShmemRequest,
+ .init_fn = ApplyLauncherShmemInit,
+};
+
/* an entry in the last-start-times shared hash table */
typedef struct LauncherLastStartTimesEntry
{
@@ -972,12 +981,13 @@ logicalrep_pa_worker_count(Oid subid)
}
/*
- * ApplyLauncherShmemSize
- * Compute space needed for replication launcher shared memory
+ * ApplyLauncherShmemRequest
+ * Register shared memory space needed for replication launcher
*/
-Size
-ApplyLauncherShmemSize(void)
+static void
+ApplyLauncherShmemRequest(void *arg)
{
+ static ShmemStructDesc ApplyLauncherShmemDesc;
Size size;
/*
@@ -987,7 +997,11 @@ ApplyLauncherShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_logical_replication_workers,
sizeof(LogicalRepWorker)));
- return size;
+ ShmemRequestStruct(&ApplyLauncherShmemDesc,
+ .name = "Logical Replication Launcher Data",
+ .size = size,
+ .ptr = (void **) &LogicalRepCtx,
+ );
}
/*
@@ -1028,35 +1042,23 @@ ApplyLauncherRegister(void)
/*
* ApplyLauncherShmemInit
- * Allocate and initialize replication launcher shared memory
+ * Initialize replication launcher shared memory
*/
-void
-ApplyLauncherShmemInit(void)
+static void
+ApplyLauncherShmemInit(void *arg)
{
- bool found;
+ int slot;
- LogicalRepCtx = (LogicalRepCtxStruct *)
- ShmemInitStruct("Logical Replication Launcher Data",
- ApplyLauncherShmemSize(),
- &found);
+ LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
+ LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
- if (!found)
+ /* Initialize memory and spin locks for each worker slot. */
+ for (slot = 0; slot < max_logical_replication_workers; slot++)
{
- int slot;
-
- memset(LogicalRepCtx, 0, ApplyLauncherShmemSize());
-
- LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
- LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
+ LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
- /* Initialize memory and spin locks for each worker slot. */
- for (slot = 0; slot < max_logical_replication_workers; slot++)
- {
- LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
-
- memset(worker, 0, sizeof(LogicalRepWorker));
- SpinLockInit(&worker->relmutex);
- }
+ memset(worker, 0, sizeof(LogicalRepWorker));
+ SpinLockInit(&worker->relmutex);
}
}
diff --git a/src/backend/replication/logical/logicalctl.c b/src/backend/replication/logical/logicalctl.c
index 4e292951201..98617b561df 100644
--- a/src/backend/replication/logical/logicalctl.c
+++ b/src/backend/replication/logical/logicalctl.c
@@ -72,6 +72,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
/*
@@ -98,6 +99,12 @@ typedef struct LogicalDecodingCtlData
static LogicalDecodingCtlData *LogicalDecodingCtl = NULL;
+static void LogicalDecodingCtlShmemRequest(void *arg);
+
+const ShmemCallbacks LogicalDecodingCtlShmemCallbacks = {
+ .request_fn = LogicalDecodingCtlShmemRequest,
+};
+
/*
* A process-local cache of LogicalDecodingCtl->xlog_logical_info. This is
* initialized at process startup, and updated when processing the process
@@ -120,23 +127,16 @@ static void update_xlog_logical_info(void);
static void abort_logical_decoding_activation(int code, Datum arg);
static void write_logical_decoding_status_update_record(bool status);
-Size
-LogicalDecodingCtlShmemSize(void)
-{
- return sizeof(LogicalDecodingCtlData);
-}
-
-void
-LogicalDecodingCtlShmemInit(void)
+static void
+LogicalDecodingCtlShmemRequest(void *arg)
{
- bool found;
-
- LogicalDecodingCtl = ShmemInitStruct("Logical decoding control",
- LogicalDecodingCtlShmemSize(),
- &found);
+ static ShmemStructDesc LogicalDecodingCtlShmemDesc;
- if (!found)
- MemSet(LogicalDecodingCtl, 0, LogicalDecodingCtlShmemSize());
+ ShmemRequestStruct(&LogicalDecodingCtlShmemDesc,
+ .name = "Logical decoding control",
+ .size = sizeof(LogicalDecodingCtlData),
+ .ptr = (void **) &LogicalDecodingCtl,
+ );
}
/*
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 661d68ad653..daa984330f1 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -88,6 +88,7 @@
#include "storage/fd.h"
#include "storage/ipc.h"
#include "storage/lmgr.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/guc.h"
@@ -176,6 +177,16 @@ ReplOriginXactState replorigin_xact_state = {
*/
static ReplicationState *replication_states;
+static void ReplicationOriginShmemRequest(void *arg);
+static void ReplicationOriginShmemInit(void *arg);
+static void ReplicationOriginShmemAttach(void *arg);
+
+const ShmemCallbacks ReplicationOriginShmemCallbacks = {
+ .request_fn = ReplicationOriginShmemRequest,
+ .init_fn = ReplicationOriginShmemInit,
+ .attach_fn = ReplicationOriginShmemAttach,
+};
+
/*
* Actual shared memory block (replication_states[] is now part of this).
*/
@@ -539,50 +550,50 @@ replorigin_by_oid(ReplOriginId roident, bool missing_ok, char **roname)
* ---------------------------------------------------------------------------
*/
-Size
-ReplicationOriginShmemSize(void)
+static void
+ReplicationOriginShmemRequest(void *arg)
{
+ static ShmemStructDesc ReplicationOriginShmemDesc;
Size size = 0;
if (max_active_replication_origins == 0)
- return size;
+ return;
size = add_size(size, offsetof(ReplicationStateCtl, states));
-
size = add_size(size,
mul_size(max_active_replication_origins, sizeof(ReplicationState)));
- return size;
+ ShmemRequestStruct(&ReplicationOriginShmemDesc,
+ .name = "ReplicationOriginState",
+ .size = size,
+ .ptr = (void **) &replication_states_ctl,
+ );
}
-void
-ReplicationOriginShmemInit(void)
+static void
+ReplicationOriginShmemInit(void *arg)
{
- bool found;
-
if (max_active_replication_origins == 0)
return;
- replication_states_ctl = (ReplicationStateCtl *)
- ShmemInitStruct("ReplicationOriginState",
- ReplicationOriginShmemSize(),
- &found);
replication_states = replication_states_ctl->states;
- if (!found)
- {
- int i;
+ replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
- MemSet(replication_states_ctl, 0, ReplicationOriginShmemSize());
+ for (int i = 0; i < max_active_replication_origins; i++)
+ {
+ LWLockInitialize(&replication_states[i].lock,
+ replication_states_ctl->tranche_id);
+ ConditionVariableInit(&replication_states[i].origin_cv);
+ }
+}
- replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
+static void
+ReplicationOriginShmemAttach(void *arg)
+{
+ if (max_active_replication_origins == 0)
+ return;
- for (i = 0; i < max_active_replication_origins; i++)
- {
- LWLockInitialize(&replication_states[i].lock,
- replication_states_ctl->tranche_id);
- ConditionVariableInit(&replication_states[i].origin_cv);
- }
- }
+ replication_states = replication_states_ctl->states;
}
/* ---------------------------------------------------------------------------
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index e75db69e3f6..ec8ab1b3d67 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -73,6 +73,7 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -118,6 +119,14 @@ typedef struct SlotSyncCtxStruct
static SlotSyncCtxStruct *SlotSyncCtx = NULL;
+static void SlotSyncShmemRequest(void *arg);
+static void SlotSyncShmemInit(void *arg);
+
+const ShmemCallbacks SlotSyncShmemCallbacks = {
+ .request_fn = SlotSyncShmemRequest,
+ .init_fn = SlotSyncShmemInit,
+};
+
/* GUC variable */
bool sync_replication_slots = false;
@@ -1828,32 +1837,29 @@ IsSyncingReplicationSlots(void)
}
/*
- * Amount of shared memory required for slot synchronization.
+ * Register shared memory space needed for slot synchronization.
*/
-Size
-SlotSyncShmemSize(void)
+static void
+SlotSyncShmemRequest(void *arg)
{
- return sizeof(SlotSyncCtxStruct);
+ static ShmemStructDesc SlotSyncShmemDesc;
+
+ ShmemRequestStruct(&SlotSyncShmemDesc,
+ .name = "Slot Sync Data",
+ .size = sizeof(SlotSyncCtxStruct),
+ .ptr = (void **) &SlotSyncCtx,
+ );
}
/*
- * Allocate and initialize the shared memory of slot synchronization.
+ * Initialize shared memory for slot synchronization.
*/
-void
-SlotSyncShmemInit(void)
+static void
+SlotSyncShmemInit(void *arg)
{
- Size size = SlotSyncShmemSize();
- bool found;
-
- SlotSyncCtx = (SlotSyncCtxStruct *)
- ShmemInitStruct("Slot Sync Data", size, &found);
-
- if (!found)
- {
- memset(SlotSyncCtx, 0, size);
- SlotSyncCtx->pid = InvalidPid;
- SpinLockInit(&SlotSyncCtx->mutex);
- }
+ memset(SlotSyncCtx, 0, sizeof(SlotSyncCtxStruct));
+ SlotSyncCtx->pid = InvalidPid;
+ SpinLockInit(&SlotSyncCtx->mutex);
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index a9092fc2382..ec44c99e04e 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
@@ -145,6 +146,14 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
+static void ReplicationSlotsShmemRequest(void *arg);
+static void ReplicationSlotsShmemInit(void *arg);
+
+const ShmemCallbacks ReplicationSlotsShmemCallbacks = {
+ .request_fn = ReplicationSlotsShmemRequest,
+ .init_fn = ReplicationSlotsShmemInit,
+};
+
/* My backend's replication slot in the shared memory array */
ReplicationSlot *MyReplicationSlot = NULL;
@@ -183,56 +192,43 @@ static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
/*
- * Report shared-memory space needed by ReplicationSlotsShmemInit.
+ * Register shared memory space needed for replication slots.
*/
-Size
-ReplicationSlotsShmemSize(void)
+static void
+ReplicationSlotsShmemRequest(void *arg)
{
- Size size = 0;
+ static ShmemStructDesc ReplicationSlotsShmemDesc;
+ Size size;
if (max_replication_slots == 0)
- return size;
+ return;
size = offsetof(ReplicationSlotCtlData, replication_slots);
size = add_size(size,
mul_size(max_replication_slots, sizeof(ReplicationSlot)));
-
- return size;
+ ShmemRequestStruct(&ReplicationSlotsShmemDesc,
+ .name = "ReplicationSlot Ctl",
+ .size = size,
+ .ptr = (void **) &ReplicationSlotCtl,
+ );
}
/*
- * Allocate and initialize shared memory for replication slots.
+ * Initialize shared memory for replication slots.
*/
-void
-ReplicationSlotsShmemInit(void)
+static void
+ReplicationSlotsShmemInit(void *arg)
{
- bool found;
-
- if (max_replication_slots == 0)
- return;
-
- ReplicationSlotCtl = (ReplicationSlotCtlData *)
- ShmemInitStruct("ReplicationSlot Ctl", ReplicationSlotsShmemSize(),
- &found);
-
- if (!found)
+ for (int i = 0; i < max_replication_slots; i++)
{
- int i;
+ ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
- /* First time through, so initialize */
- MemSet(ReplicationSlotCtl, 0, ReplicationSlotsShmemSize());
-
- for (i = 0; i < max_replication_slots; i++)
- {
- ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
-
- /* everything else is zeroed by the memset above */
- slot->active_proc = INVALID_PROC_NUMBER;
- SpinLockInit(&slot->mutex);
- LWLockInitialize(&slot->io_in_progress_lock,
- LWTRANCHE_REPLICATION_SLOT_IO);
- ConditionVariableInit(&slot->active_cv);
- }
+ /* everything else is expected to be zero-initialized already */
+ slot->active_proc = INVALID_PROC_NUMBER;
+ SpinLockInit(&slot->mutex);
+ LWLockInitialize(&slot->io_in_progress_lock,
+ LWTRANCHE_REPLICATION_SLOT_IO);
+ ConditionVariableInit(&slot->active_cv);
}
}
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 45b9d4f09f2..6d506fc3f43 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -29,47 +29,49 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
WalRcvData *WalRcv = NULL;
+static void WalRcvShmemRequest(void *arg);
+static void WalRcvShmemInit(void *arg);
+
+const ShmemCallbacks WalRcvShmemCallbacks = {
+ .request_fn = WalRcvShmemRequest,
+ .init_fn = WalRcvShmemInit,
+};
+
/*
* How long to wait for walreceiver to start up after requesting
* postmaster to launch it. In seconds.
*/
#define WALRCV_STARTUP_TIMEOUT 10
-/* Report shared memory space needed by WalRcvShmemInit */
-Size
-WalRcvShmemSize(void)
+/* Register shared memory space needed by walreceiver */
+static void
+WalRcvShmemRequest(void *arg)
{
- Size size = 0;
+ static ShmemStructDesc WalRcvShmemDesc;
- size = add_size(size, sizeof(WalRcvData));
-
- return size;
+ ShmemRequestStruct(&WalRcvShmemDesc,
+ .name = "Wal Receiver Ctl",
+ .size = sizeof(WalRcvData),
+ .ptr = (void **) &WalRcv,
+ );
}
-/* Allocate and initialize walreceiver-related shared memory */
-void
-WalRcvShmemInit(void)
+/* Initialize walreceiver-related shared memory */
+static void
+WalRcvShmemInit(void *arg)
{
- bool found;
-
- WalRcv = (WalRcvData *)
- ShmemInitStruct("Wal Receiver Ctl", WalRcvShmemSize(), &found);
-
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(WalRcv, 0, WalRcvShmemSize());
- WalRcv->walRcvState = WALRCV_STOPPED;
- ConditionVariableInit(&WalRcv->walRcvStoppedCV);
- SpinLockInit(&WalRcv->mutex);
- pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
- WalRcv->procno = INVALID_PROC_NUMBER;
- }
+ MemSet(WalRcv, 0, sizeof(WalRcvData));
+ WalRcv->walRcvState = WALRCV_STOPPED;
+ ConditionVariableInit(&WalRcv->walRcvStoppedCV);
+ SpinLockInit(&WalRcv->mutex);
+ pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
+ WalRcv->procno = INVALID_PROC_NUMBER;
}
/* Is walreceiver running (or starting up)? */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 66507e9c2dd..ffcd2dc2999 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -86,6 +86,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/dest.h"
#include "tcop/tcopprot.h"
#include "utils/acl.h"
@@ -117,6 +118,14 @@
/* Array of WalSnds in shared memory */
WalSndCtlData *WalSndCtl = NULL;
+static void WalSndShmemRequest(void *arg);
+static void WalSndShmemInit(void *arg);
+
+const ShmemCallbacks WalSndShmemCallbacks = {
+ .request_fn = WalSndShmemRequest,
+ .init_fn = WalSndShmemInit,
+};
+
/* My slot in the shared memory array */
WalSnd *MyWalSnd = NULL;
@@ -3763,47 +3772,39 @@ WalSndSignals(void)
pqsignal(SIGCHLD, SIG_DFL);
}
-/* Report shared-memory space needed by WalSndShmemInit */
-Size
-WalSndShmemSize(void)
+/* Register shared-memory space needed by walsender */
+static void
+WalSndShmemRequest(void *arg)
{
- Size size = 0;
+ static ShmemStructDesc WalSndShmemDesc;
+ Size size;
size = offsetof(WalSndCtlData, walsnds);
size = add_size(size, mul_size(max_wal_senders, sizeof(WalSnd)));
-
- return size;
+ ShmemRequestStruct(&WalSndShmemDesc,
+ .name = "Wal Sender Ctl",
+ .size = size,
+ .ptr = (void **) &WalSndCtl,
+ );
}
-/* Allocate and initialize walsender-related shared memory */
-void
-WalSndShmemInit(void)
+/* Initialize walsender-related shared memory */
+static void
+WalSndShmemInit(void *arg)
{
- bool found;
- int i;
+ for (int i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
+ dlist_init(&(WalSndCtl->SyncRepQueue[i]));
- WalSndCtl = (WalSndCtlData *)
- ShmemInitStruct("Wal Sender Ctl", WalSndShmemSize(), &found);
-
- if (!found)
+ for (int i = 0; i < max_wal_senders; i++)
{
- /* First time through, so initialize */
- MemSet(WalSndCtl, 0, WalSndShmemSize());
-
- for (i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
- dlist_init(&(WalSndCtl->SyncRepQueue[i]));
-
- for (i = 0; i < max_wal_senders; i++)
- {
- WalSnd *walsnd = &WalSndCtl->walsnds[i];
-
- SpinLockInit(&walsnd->mutex);
- }
+ WalSnd *walsnd = &WalSndCtl->walsnds[i];
- ConditionVariableInit(&WalSndCtl->wal_flush_cv);
- ConditionVariableInit(&WalSndCtl->wal_replay_cv);
- ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
+ SpinLockInit(&walsnd->mutex);
}
+
+ ConditionVariableInit(&WalSndCtl->wal_flush_cv);
+ ConditionVariableInit(&WalSndCtl->wal_replay_cv);
+ ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
}
/*
diff --git a/src/backend/storage/aio/aio_init.c b/src/backend/storage/aio/aio_init.c
index 26cb824035d..9efe53912ec 100644
--- a/src/backend/storage/aio/aio_init.c
+++ b/src/backend/storage/aio/aio_init.c
@@ -128,10 +128,10 @@ AioShmemRequest(void *arg)
static ShmemStructDesc AioHandleShmemDesc;
static ShmemStructDesc AioHandleIOVShmemDesc;
static ShmemStructDesc AioHandleDataShmemDesc;
-
- /* Resolve io_max_concurrency if not already done. */
/*
+ * Resolve io_max_concurrency if not already done.
+ *
* We prefer to report this value's source as PGC_S_DYNAMIC_DEFAULT.
* However, if the DBA explicitly set io_max_concurrency = -1 in the
* config file, then PGC_S_DYNAMIC_DEFAULT will fail to override that and
@@ -153,25 +153,25 @@ AioShmemRequest(void *arg)
.name = "AioCtl",
.size = sizeof(PgAioCtl),
.ptr = (void **) &pgaio_ctl,
- );
+ );
ShmemRequestStruct(&AioBackendShmemDesc,
.name = "AioBackend",
.size = AioBackendShmemSize(),
.ptr = (void **) &AioBackendShmemPtr,
- );
+ );
ShmemRequestStruct(&AioHandleShmemDesc,
.name = "AioHandle",
.size = AioHandleShmemSize(),
.ptr = (void **) &AioHandleShmemPtr,
- );
+ );
ShmemRequestStruct(&AioHandleIOVShmemDesc,
.name = "AioHandleIOV",
.size = AioHandleIOVShmemSize(),
.ptr = (void **) &AioHandleIOVShmemPtr,
- );
+ );
ShmemRequestStruct(&AioHandleDataShmemDesc,
.name = "AioHandleData",
diff --git a/src/backend/storage/aio/method_io_uring.c b/src/backend/storage/aio/method_io_uring.c
index fb75a208b65..82fbbefc0b6 100644
--- a/src/backend/storage/aio/method_io_uring.c
+++ b/src/backend/storage/aio/method_io_uring.c
@@ -275,10 +275,7 @@ pgaio_uring_shmem_size(void)
static void
pgaio_uring_shmem_request(void *arg)
{
- static ShmemStructDesc AioUringShmemDesc = {
- .name = "AioUringContext",
- .ptr = (void **) &pgaio_uring_contexts,
- };
+ static ShmemStructDesc AioUringShmemDesc;
/*
* Kernel and liburing support for various features influences how much
@@ -286,8 +283,11 @@ pgaio_uring_shmem_request(void *arg)
*/
pgaio_uring_check_capabilities();
- AioUringShmemDesc.size = pgaio_uring_shmem_size();
- ShmemRequestStruct(&AioUringShmemDesc);
+ ShmemRequestStruct(&AioUringShmemDesc,
+ .name = "AioUringContext",
+ .size = pgaio_uring_shmem_size(),
+ .ptr = (void **) &pgaio_uring_contexts,
+ );
}
static void
diff --git a/src/backend/storage/aio/method_worker.c b/src/backend/storage/aio/method_worker.c
index 82c8b098a9e..d9406958e6c 100644
--- a/src/backend/storage/aio/method_worker.c
+++ b/src/backend/storage/aio/method_worker.c
@@ -162,16 +162,18 @@ pgaio_worker_shmem_attach(void *arg)
static void
pgaio_worker_shmem_request(void *arg)
{
- static ShmemStructDesc AioWorkerShmemDesc = {
- .name = "AioWorkerSubmissionQueue",
- .ptr = (void **) &io_worker_submission_queue,
- };
+ static ShmemStructDesc AioWorkerShmemDesc;
int queue_size;
+ size_t size;
- AioWorkerShmemDesc.size =
- MAXALIGN(pgaio_worker_queue_shmem_size(&queue_size)) +
+ size = MAXALIGN(pgaio_worker_queue_shmem_size(&queue_size)) +
pgaio_worker_control_shmem_size();
- ShmemRequestStruct(&AioWorkerShmemDesc);
+
+ ShmemRequestStruct(&AioWorkerShmemDesc,
+ .name = "AioWorkerSubmissionQueue",
+ .size = size,
+ .ptr = (void **) &io_worker_submission_queue,
+ );
}
static int
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index c0c223b2e32..71e7c29116c 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -18,6 +18,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proclist.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
BufferDescPadded *BufferDescriptors;
char *BufferBlocks;
@@ -25,6 +27,15 @@ ConditionVariableMinimallyPadded *BufferIOCVArray;
WritebackContext BackendWritebackContext;
CkptSortItem *CkptBufferIds;
+static void BufferManagerShmemRequest(void *arg);
+static void BufferManagerShmemInit(void *arg);
+static void BufferManagerShmemAttach(void *arg);
+
+const ShmemCallbacks BufferManagerShmemCallbacks = {
+ .request_fn = BufferManagerShmemRequest,
+ .init_fn = BufferManagerShmemInit,
+ .attach_fn = BufferManagerShmemAttach,
+};
/*
* Data Structures:
@@ -60,37 +71,39 @@ CkptSortItem *CkptBufferIds;
/*
- * Initialize shared buffer pool
- *
- * This is called once during shared-memory initialization (either in the
- * postmaster, or in a standalone backend).
+ * Register shared memory area for the buffer pool.
*/
-void
-BufferManagerShmemInit(void)
+static void
+BufferManagerShmemRequest(void *arg)
{
- bool foundBufs,
- foundDescs,
- foundIOCV,
- foundBufCkpt;
-
+ static ShmemStructDesc BufferDescriptorsShmemDesc;
+ static ShmemStructDesc BufferBlocksShmemDesc;
+ static ShmemStructDesc BufferIOCVArrayShmemDesc;
+ static ShmemStructDesc CkptBufferIdsShmemDesc;
+
+ ShmemRequestStruct(&BufferDescriptorsShmemDesc,
+ .name = "Buffer Descriptors",
+ .size = NBuffers * sizeof(BufferDescPadded),
/* Align descriptors to a cacheline boundary. */
- BufferDescriptors = (BufferDescPadded *)
- ShmemInitStruct("Buffer Descriptors",
- NBuffers * sizeof(BufferDescPadded),
- &foundDescs);
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferDescriptors,
+ );
+ ShmemRequestStruct(&BufferBlocksShmemDesc,
+ .name = "Buffer Blocks",
+ .size = NBuffers * (Size) BLCKSZ,
/* Align buffer pool on IO page size boundary. */
- BufferBlocks = (char *)
- TYPEALIGN(PG_IO_ALIGN_SIZE,
- ShmemInitStruct("Buffer Blocks",
- NBuffers * (Size) BLCKSZ + PG_IO_ALIGN_SIZE,
- &foundBufs));
-
- /* Align condition variables to cacheline boundary. */
- BufferIOCVArray = (ConditionVariableMinimallyPadded *)
- ShmemInitStruct("Buffer IO Condition Variables",
- NBuffers * sizeof(ConditionVariableMinimallyPadded),
- &foundIOCV);
+ .alignment = PG_IO_ALIGN_SIZE,
+ .ptr = (void **) &BufferBlocks,
+ );
+
+ ShmemRequestStruct(&BufferIOCVArrayShmemDesc,
+ .name = "Buffer IO Condition Variables",
+ .size = NBuffers * sizeof(ConditionVariableMinimallyPadded),
+ /* Align condition variables to a cacheline boundary. */
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferIOCVArray,
+ );
/*
* The array used to sort to-be-checkpointed buffer ids is located in
@@ -99,80 +112,51 @@ BufferManagerShmemInit(void)
* the checkpointer is restarted, memory allocation failures would be
* painful.
*/
- CkptBufferIds = (CkptSortItem *)
- ShmemInitStruct("Checkpoint BufferIds",
- NBuffers * sizeof(CkptSortItem), &foundBufCkpt);
+ ShmemRequestStruct(&CkptBufferIdsShmemDesc,
+ .name = "Checkpoint BufferIds",
+ .size = NBuffers * sizeof(CkptSortItem),
+ .ptr = (void **) &CkptBufferIds,
+ );
+}
- if (foundDescs || foundBufs || foundIOCV || foundBufCkpt)
- {
- /* should find all of these, or none of them */
- Assert(foundDescs && foundBufs && foundIOCV && foundBufCkpt);
- /* note: this path is only taken in EXEC_BACKEND case */
- }
- else
+/*
+ * Initialize shared buffer pool
+ *
+ * This is called once during shared-memory initialization (either in the
+ * postmaster, or in a standalone backend).
+ */
+static void
+BufferManagerShmemInit(void *arg)
+{
+ /*
+ * Initialize all the buffer headers.
+ */
+ for (int i = 0; i < NBuffers; i++)
{
- int i;
+ BufferDesc *buf = GetBufferDescriptor(i);
- /*
- * Initialize all the buffer headers.
- */
- for (i = 0; i < NBuffers; i++)
- {
- BufferDesc *buf = GetBufferDescriptor(i);
+ ClearBufferTag(&buf->tag);
- ClearBufferTag(&buf->tag);
+ pg_atomic_init_u64(&buf->state, 0);
+ buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
- pg_atomic_init_u64(&buf->state, 0);
- buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
+ buf->buf_id = i;
- buf->buf_id = i;
+ pgaio_wref_clear(&buf->io_wref);
- pgaio_wref_clear(&buf->io_wref);
-
- proclist_init(&buf->lock_waiters);
- ConditionVariableInit(BufferDescriptorGetIOCV(buf));
- }
+ proclist_init(&buf->lock_waiters);
+ ConditionVariableInit(BufferDescriptorGetIOCV(buf));
}
- /* Init other shared buffer-management stuff */
- StrategyInitialize(!foundDescs);
-
/* Initialize per-backend file flush context */
WritebackContextInit(&BackendWritebackContext,
&backend_flush_after);
}
-/*
- * BufferManagerShmemSize
- *
- * compute the size of shared memory for the buffer pool including
- * data pages, buffer descriptors, hash tables, etc.
- */
-Size
-BufferManagerShmemSize(void)
+static void
+BufferManagerShmemAttach(void *arg)
{
- Size size = 0;
-
- /* size of buffer descriptors */
- size = add_size(size, mul_size(NBuffers, sizeof(BufferDescPadded)));
- /* to allow aligning buffer descriptors */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of data pages, plus alignment padding */
- size = add_size(size, PG_IO_ALIGN_SIZE);
- size = add_size(size, mul_size(NBuffers, BLCKSZ));
-
- /* size of stuff controlled by freelist.c */
- size = add_size(size, StrategyShmemSize());
-
- /* size of I/O condition variables */
- size = add_size(size, mul_size(NBuffers,
- sizeof(ConditionVariableMinimallyPadded)));
- /* to allow aligning the above */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of checkpoint sort array in bufmgr.c */
- size = add_size(size, mul_size(NBuffers, sizeof(CkptSortItem)));
-
- return size;
+ /* Initialize per-backend file flush context */
+ WritebackContextInit(&BackendWritebackContext,
+ &backend_flush_after);
}
diff --git a/src/backend/storage/buffer/buf_table.c b/src/backend/storage/buffer/buf_table.c
index 23d85fd32e2..db4f3f5ff35 100644
--- a/src/backend/storage/buffer/buf_table.c
+++ b/src/backend/storage/buffer/buf_table.c
@@ -32,37 +32,25 @@ typedef struct
static HTAB *SharedBufHash;
-
-/*
- * Estimate space needed for mapping hashtable
- * size is the desired hash table size (possibly more than NBuffers)
- */
-Size
-BufTableShmemSize(int size)
-{
- return hash_estimate_size(size, sizeof(BufferLookupEnt));
-}
-
/*
- * Initialize shmem hash table for mapping buffers
+ * Register shmem hash table for mapping buffers.
* size is the desired hash table size (possibly more than NBuffers)
*/
void
-InitBufTable(int size)
+BufTableShmemRequest(int size)
{
- HASHCTL info;
-
- /* assume no locking is needed yet */
-
- /* BufferTag maps to Buffer */
- info.keysize = sizeof(BufferTag);
- info.entrysize = sizeof(BufferLookupEnt);
- info.num_partitions = NUM_BUFFER_PARTITIONS;
-
- SharedBufHash = ShmemInitHash("Shared Buffer Lookup Table",
- size, size,
- &info,
- HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE);
+ static ShmemHashDesc SharedBufHashDesc;
+
+ ShmemRequestHash(&SharedBufHashDesc,
+ .name = "Shared Buffer Lookup Table",
+ .max_size = size,
+ .init_size = size,
+ .ptr = &SharedBufHash,
+ .hash_info.keysize = sizeof(BufferTag),
+ .hash_info.entrysize = sizeof(BufferLookupEnt),
+ .hash_info.num_partitions = NUM_BUFFER_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
}
/*
diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index b7687836188..cfc29d73803 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -20,6 +20,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#define INT_ACCESS_ONCE(var) ((int)(*((volatile int *)&(var))))
@@ -56,6 +58,14 @@ typedef struct
/* Pointers to shared state */
static BufferStrategyControl *StrategyControl = NULL;
+static void StrategyCtlShmemRequest(void *arg);
+static void StrategyCtlShmemInit(void *arg);
+
+const ShmemCallbacks StrategyCtlShmemCallbacks = {
+ .request_fn = StrategyCtlShmemRequest,
+ .init_fn = StrategyCtlShmemInit,
+};
+
/*
* Private (non-shared) state for managing a ring of shared buffers to re-use.
* This is currently the only kind of BufferAccessStrategy object, but someday
@@ -369,41 +379,22 @@ StrategyNotifyBgWriter(int bgwprocno)
/*
- * StrategyShmemSize
- *
- * estimate the size of shared memory used by the freelist-related structures.
- *
- * Note: for somewhat historical reasons, the buffer lookup hashtable size
- * is also determined here.
+ * StrategyCtlShmemRequest -- register shared memory for the buffer
+ * cache replacement strategy.
*/
-Size
-StrategyShmemSize(void)
+static void
+StrategyCtlShmemRequest(void *arg)
{
- Size size = 0;
-
- /* size of lookup hash table ... see comment in StrategyInitialize */
- size = add_size(size, BufTableShmemSize(NBuffers + NUM_BUFFER_PARTITIONS));
-
- /* size of the shared replacement strategy control block */
- size = add_size(size, MAXALIGN(sizeof(BufferStrategyControl)));
+ static ShmemStructDesc StrategyCtlShmemDesc;
- return size;
-}
-
-/*
- * StrategyInitialize -- initialize the buffer cache replacement
- * strategy.
- *
- * Assumes: All of the buffers are already built into a linked list.
- * Only called by postmaster and only during initialization.
- */
-void
-StrategyInitialize(bool init)
-{
- bool found;
+ ShmemRequestStruct(&StrategyCtlShmemDesc,
+ .name = "Buffer Strategy Status",
+ .size = sizeof(BufferStrategyControl),
+ .ptr = (void **) &StrategyControl,
+ );
/*
- * Initialize the shared buffer lookup hashtable.
+ * Request the shared buffer lookup hashtable.
*
* Since we can't tolerate running out of lookup table entries, we must be
* sure to specify an adequate table size here. The maximum steady-state
@@ -412,37 +403,26 @@ StrategyInitialize(bool init)
* happening in each partition concurrently, so we could need as many as
* NBuffers + NUM_BUFFER_PARTITIONS entries.
*/
- InitBufTable(NBuffers + NUM_BUFFER_PARTITIONS);
-
- /*
- * Get or create the shared strategy control block
- */
- StrategyControl = (BufferStrategyControl *)
- ShmemInitStruct("Buffer Strategy Status",
- sizeof(BufferStrategyControl),
- &found);
-
- if (!found)
- {
- /*
- * Only done once, usually in postmaster
- */
- Assert(init);
+ BufTableShmemRequest(NBuffers + NUM_BUFFER_PARTITIONS);
+}
- SpinLockInit(&StrategyControl->buffer_strategy_lock);
+/*
+ * StrategyCtlShmemInit -- initialize the buffer cache replacement strategy.
+ */
+static void
+StrategyCtlShmemInit(void *arg)
+{
+ SpinLockInit(&StrategyControl->buffer_strategy_lock);
- /* Initialize the clock-sweep pointer */
- pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
+ /* Initialize the clock-sweep pointer */
+ pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
- /* Clear statistics */
- StrategyControl->completePasses = 0;
- pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
+ /* Clear statistics */
+ StrategyControl->completePasses = 0;
+ pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
- /* No pending notification */
- StrategyControl->bgwprocno = -1;
- }
- else
- Assert(!init);
+ /* No pending notification */
+ StrategyControl->bgwprocno = -1;
}
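Reviewer note on the freelist.c hunk above: the conversion splits the old "size function plus init function" pair into a request step that only records name, size and pointer, and an init step that runs once the memory exists. The following is a toy model of that split, not the API from this patch. Every name in it (ToyStructRequest, toy_request_struct, toy_allocate_all) is hypothetical, calloc stands in for the shared segment, and the zero-fill is an assumption suggested by the MemSet calls the patch removes elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical descriptor: what a subsystem asks for in the request phase. */
typedef struct ToyStructRequest
{
	const char *name;			/* label, for debugging only */
	size_t		size;			/* bytes wanted */
	void	  **ptr;			/* where to publish the address later */
} ToyStructRequest;

#define TOY_MAX_REQUESTS 16
#define TOY_ALIGN(n) (((n) + 7u) & ~(size_t) 7u)

static ToyStructRequest toy_requests[TOY_MAX_REQUESTS];
static int	toy_nrequests = 0;

/* Phase 1: record a request; nothing is allocated yet. */
static void
toy_request_struct(const char *name, size_t size, void **ptr)
{
	assert(toy_nrequests < TOY_MAX_REQUESTS);
	toy_requests[toy_nrequests++] = (ToyStructRequest) {name, size, ptr};
}

/* The total is derived from the recorded requests, not kept by hand. */
static size_t
toy_requested_size(void)
{
	size_t		total = 0;

	for (int i = 0; i < toy_nrequests; i++)
		total += TOY_ALIGN(toy_requests[i].size);
	return total;
}

/* Phase 2: allocate one zero-filled region and publish each pointer. */
static char *
toy_allocate_all(void)
{
	size_t		total = toy_requested_size();
	char	   *base = calloc(1, total ? total : 1);
	size_t		off = 0;

	assert(base != NULL);
	for (int i = 0; i < toy_nrequests; i++)
	{
		*toy_requests[i].ptr = base + off;
		off += TOY_ALIGN(toy_requests[i].size);
	}
	return base;
}
```

In this model an init callback would run after toy_allocate_all() and could dereference its published pointer unconditionally, which is the shape StrategyCtlShmemInit takes above once the "found" branch is gone.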
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2916fcd930e..5e182974008 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -14,36 +14,13 @@
*/
#include "postgres.h"
-#include "access/clog.h"
-#include "access/commit_ts.h"
-#include "access/multixact.h"
-#include "access/nbtree.h"
-#include "access/subtrans.h"
-#include "access/syncscan.h"
-#include "access/twophase.h"
-#include "access/xlogprefetcher.h"
-#include "access/xlogrecovery.h"
-#include "access/xlogwait.h"
-#include "commands/async.h"
#include "miscadmin.h"
#include "pgstat.h"
-#include "postmaster/autovacuum.h"
-#include "postmaster/bgworker_internals.h"
-#include "postmaster/bgwriter.h"
-#include "postmaster/walsummarizer.h"
-#include "replication/logicallauncher.h"
-#include "replication/origin.h"
-#include "replication/slot.h"
-#include "replication/slotsync.h"
-#include "replication/walreceiver.h"
-#include "replication/walsender.h"
-#include "storage/aio_subsys.h"
-#include "storage/bufmgr.h"
#include "storage/dsm.h"
#include "storage/ipc.h"
+#include "storage/lock.h"
#include "storage/pg_shmem.h"
#include "storage/pmsignal.h"
-#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
@@ -55,8 +32,6 @@ shmem_startup_hook_type shmem_startup_hook = NULL;
static Size total_addin_request = 0;
-static void CreateOrAttachShmemStructs(void);
-
/*
* RequestAddinShmemSpace
* Request that extra shmem space be allocated for use by
@@ -95,32 +70,6 @@ CalculateShmemSize(void)
size = 100000;
size = add_size(size, ShmemGetRequestedSize());
- /* legacy subsystems */
- size = add_size(size, BufferManagerShmemSize());
- size = add_size(size, LockManagerShmemSize());
- size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, XLOGShmemSize());
- size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, TwoPhaseShmemSize());
- size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, LWLockShmemSize());
- size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, CheckpointerShmemSize());
- size = add_size(size, AutoVacuumShmemSize());
- size = add_size(size, ReplicationSlotsShmemSize());
- size = add_size(size, ReplicationOriginShmemSize());
- size = add_size(size, WalSndShmemSize());
- size = add_size(size, WalRcvShmemSize());
- size = add_size(size, WalSummarizerShmemSize());
- size = add_size(size, PgArchShmemSize());
- size = add_size(size, ApplyLauncherShmemSize());
- size = add_size(size, BTreeShmemSize());
- size = add_size(size, SyncScanShmemSize());
- size = add_size(size, StatsShmemSize());
- size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, WaitLSNShmemSize());
- size = add_size(size, LogicalDecodingCtlShmemSize());
-
/* include additional requested shmem from preload libraries */
size = add_size(size, total_addin_request);
@@ -159,7 +108,6 @@ AttachSharedMemoryStructs(void)
/* Establish pointers to all shared memory areas in this backend */
ShmemAttachRequested();
- CreateOrAttachShmemStructs();
/*
* Now give loadable modules a chance to set up their shmem allocations
@@ -213,9 +161,6 @@ CreateSharedMemoryAndSemaphores(void)
/* Initialize all shmem areas */
ShmemInitRequested();
- /* Initialize legacy subsystems */
- CreateOrAttachShmemStructs();
-
/* Initialize dynamic shared memory facilities. */
dsm_postmaster_startup(shim);
@@ -246,68 +191,6 @@ RegisterBuiltinShmemCallbacks(void)
#undef PG_SHMEM_SUBSYSTEM
}
-/*
- * Initialize various subsystems, setting up their data structures in
- * shared memory.
- *
- * This is called by the postmaster or by a standalone backend.
- * It is also called by a backend forked from the postmaster in the
- * EXEC_BACKEND case. In the latter case, the shared memory segment
- * already exists and has been physically attached to, but we have to
- * initialize pointers in local memory that reference the shared structures,
- * because we didn't inherit the correct pointer values from the postmaster
- * as we do in the fork() scenario. The easiest way to do that is to run
- * through the same code as before. (Note that the called routines mostly
- * check IsUnderPostmaster, rather than EXEC_BACKEND, to detect this case.
- * This is a bit code-wasteful and could be cleaned up.)
- */
-static void
-CreateOrAttachShmemStructs(void)
-{
- /*
- * Set up xlog, clog, and buffers
- */
- XLOGShmemInit();
- XLogPrefetchShmemInit();
- XLogRecoveryShmemInit();
- BufferManagerShmemInit();
-
- /*
- * Set up lock manager
- */
- LockManagerShmemInit();
-
- /*
- * Set up process table
- */
- BackendStatusShmemInit();
- TwoPhaseShmemInit();
- BackgroundWorkerShmemInit();
-
- /*
- * Set up interprocess signaling mechanisms
- */
- CheckpointerShmemInit();
- AutoVacuumShmemInit();
- ReplicationSlotsShmemInit();
- ReplicationOriginShmemInit();
- WalSndShmemInit();
- WalRcvShmemInit();
- WalSummarizerShmemInit();
- PgArchShmemInit();
- ApplyLauncherShmemInit();
- SlotSyncShmemInit();
-
- /*
- * Set up other modules that need some shared memory space
- */
- BTreeShmemInit();
- SyncScanShmemInit();
- StatsShmemInit();
- WaitLSNShmemInit();
- LogicalDecodingCtlShmemInit();
-}
-
/*
* InitializeShmemGUCs
*
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 234643e4dd7..57dd80d9691 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -43,8 +43,10 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/resowner.h"
@@ -312,6 +314,14 @@ typedef struct
static volatile FastPathStrongRelationLockData *FastPathStrongRelationLocks;
+static void LockManagerShmemRequest(void *arg);
+static void LockManagerShmemInit(void *arg);
+
+const ShmemCallbacks LockManagerShmemCallbacks = {
+ .request_fn = LockManagerShmemRequest,
+ .init_fn = LockManagerShmemInit,
+};
+
/*
* Pointers to hash tables containing lock state
@@ -409,6 +419,7 @@ PROCLOCK_PRINT(const char *where, const PROCLOCK *proclockP)
static uint32 proclock_hash(const void *key, Size keysize);
+
static void RemoveLocalLock(LOCALLOCK *locallock);
static PROCLOCK *SetupLockInTable(LockMethod lockMethodTable, PGPROC *proc,
const LOCKTAG *locktag, uint32 hashcode, LOCKMODE lockmode);
@@ -432,22 +443,19 @@ static void GetSingleProcBlockerStatusData(PGPROC *blocked_proc,
/*
- * Initialize the lock manager's shmem data structures.
+ * Register the lock manager's shmem data structures.
*
- * This is called from CreateSharedMemoryAndSemaphores(), which see for more
- * comments. In the normal postmaster case, the shared hash tables are
- * created here, and backends inherit pointers to them via fork(). In the
- * EXEC_BACKEND case, each backend re-executes this code to obtain pointers to
- * the already existing shared hash tables. In either case, each backend must
- * also call InitLockManagerAccess() to create the locallock hash table.
+ * In addition to this, each backend must also call InitLockManagerAccess() to
+ * create the locallock hash table.
*/
-void
-LockManagerShmemInit(void)
+static void
+LockManagerShmemRequest(void *arg)
{
- HASHCTL info;
- int64 init_table_size,
+ static ShmemHashDesc LockHashDesc;
+ static ShmemHashDesc ProcLockHashDesc;
+ static ShmemStructDesc FastPathShmemDesc;
+ long init_table_size,
max_table_size;
- bool found;
/*
* Compute init/max size to request for lock hashtables. Note these
@@ -456,47 +464,51 @@ LockManagerShmemInit(void)
max_table_size = NLOCKENTS();
init_table_size = max_table_size / 2;
- /*
- * Allocate hash table for LOCK structs. This stores per-locked-object
- * information.
- */
- info.keysize = sizeof(LOCKTAG);
- info.entrysize = sizeof(LOCK);
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodLockHash = ShmemInitHash("LOCK hash",
- init_table_size,
- max_table_size,
- &info,
- HASH_ELEM | HASH_BLOBS | HASH_PARTITION);
+ ShmemRequestHash(&LockHashDesc,
+ .name = "LOCK hash",
+ .init_size = init_table_size,
+ .max_size = max_table_size,
+ .ptr = &LockMethodLockHash,
+ .hash_info.keysize = sizeof(LOCKTAG),
+ .hash_info.entrysize = sizeof(LOCK),
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION,
+ );
/* Assume an average of 2 holders per lock */
max_table_size *= 2;
init_table_size *= 2;
- /*
- * Allocate hash table for PROCLOCK structs. This stores
- * per-lock-per-holder information.
- */
- info.keysize = sizeof(PROCLOCKTAG);
- info.entrysize = sizeof(PROCLOCK);
- info.hash = proclock_hash;
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodProcLockHash = ShmemInitHash("PROCLOCK hash",
- init_table_size,
- max_table_size,
- &info,
- HASH_ELEM | HASH_FUNCTION | HASH_PARTITION);
+ ShmemRequestHash(&ProcLockHashDesc,
+ .name = "PROCLOCK hash",
+ .init_size = init_table_size,
+ .max_size = max_table_size,
+ .ptr = &LockMethodProcLockHash,
+ .hash_info.keysize = sizeof(PROCLOCKTAG),
+ .hash_info.entrysize = sizeof(PROCLOCK),
+ .hash_info.hash = proclock_hash,
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION,
+ );
+
+ ShmemRequestStruct(&FastPathShmemDesc,
+ .name = "Fast Path Strong Relation Lock Data",
+ .size = sizeof(FastPathStrongRelationLockData),
+ .ptr = (void **) (void *) &FastPathStrongRelationLocks,
+ );
/*
- * Allocate fast-path structures.
+ * FIXME: we used to do this in the size calculation:
+ *
+ * // Since NLOCKENTS is only an estimate, add 10% safety margin.
+ * size = add_size(size, size / 10);
*/
- FastPathStrongRelationLocks =
- ShmemInitStruct("Fast Path Strong Relation Lock Data",
- sizeof(FastPathStrongRelationLockData), &found);
- if (!found)
- SpinLockInit(&FastPathStrongRelationLocks->mutex);
+}
+
+static void
+LockManagerShmemInit(void *arg)
+{
+ SpinLockInit(&FastPathStrongRelationLocks->mutex);
}
/*
@@ -3761,30 +3773,6 @@ PostPrepare_Locks(FullTransactionId fxid)
}
-/*
- * Estimate shared-memory space used for lock tables
- */
-Size
-LockManagerShmemSize(void)
-{
- Size size = 0;
- long max_table_size;
-
- /* lock hash table */
- max_table_size = NLOCKENTS();
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(LOCK)));
-
- /* proclock hash table */
- max_table_size *= 2;
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(PROCLOCK)));
-
- /*
- * Since NLOCKENTS is only an estimate, add 10% safety margin.
- */
- size = add_size(size, size / 10);
-
- return size;
-}
/*
* GetLockStatusData - Return a summary of the lock manager's internal
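Reviewer note on the lock.c hunks above: the removed LockManagerShmemSize() padded its estimate by 10%, and the FIXME in LockManagerShmemRequest() records that this margin currently has no replacement. As a reminder of what that arithmetic did, here is a minimal sketch. toy_add_size and toy_with_safety_margin are hypothetical stand-ins; the real add_size() reports an error on overflow rather than asserting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an overflow-checked size addition. */
static size_t
toy_add_size(size_t a, size_t b)
{
	assert(a <= SIZE_MAX - b);	/* the real helper raises an error instead */
	return a + b;
}

/* Pad an estimate by 10%, using integer division as the old code did. */
static size_t
toy_with_safety_margin(size_t estimate)
{
	return toy_add_size(estimate, estimate / 10);
}
```

Because of the integer division, estimates below 10 bytes gain no padding. That was harmless for hash tables of realistic size, but it is worth keeping in mind if the margin is reintroduced per structure rather than on the combined total.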
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index cd087129469..9637c622d6e 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -18,7 +18,9 @@
#include "pgstat.h"
#include "storage/ipc.h"
#include "storage/proc.h" /* for MyProc */
#include "storage/procarray.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/ascii.h"
#include "utils/guc.h" /* for application_name */
#include "utils/memutils.h"
@@ -73,133 +75,114 @@ static void pgstat_beshutdown_hook(int code, Datum arg);
static void pgstat_read_current_status(void);
static void pgstat_setup_backend_status_context(void);
+static void BackendStatusShmemRequest(void *arg);
+static void BackendStatusShmemInit(void *arg);
+static void BackendStatusShmemAttach(void *arg);
+
+const ShmemCallbacks BackendStatusShmemCallbacks = {
+ .request_fn = BackendStatusShmemRequest,
+ .init_fn = BackendStatusShmemInit,
+ .attach_fn = BackendStatusShmemAttach,
+};
/*
- * Report shared-memory space needed by BackendStatusShmemInit.
+ * Register shared memory needs for backend status reporting.
*/
-Size
-BackendStatusShmemSize(void)
+static void
+BackendStatusShmemRequest(void *arg)
{
- Size size;
-
- /* BackendStatusArray: */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- /* BackendAppnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendClientHostnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendActivityBuffer: */
- size = add_size(size,
- mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
+ static ShmemStructDesc BackendStatusArrayShmemDesc;
+ static ShmemStructDesc BackendAppnameBufferShmemDesc;
+ static ShmemStructDesc BackendClientHostnameBufferShmemDesc;
+ static ShmemStructDesc BackendActivityBufferSizeShmemDesc;
#ifdef USE_SSL
- /* BackendSslStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots));
+ static ShmemStructDesc BackendSslStatusBufferShmemDesc;
#endif
#ifdef ENABLE_GSS
- /* BackendGssStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots));
+ static ShmemStructDesc BackendGssStatusBufferShmemDesc;
+#endif
+
+ ShmemRequestStruct(&BackendStatusArrayShmemDesc,
+ .name = "Backend Status Array",
+ .size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendStatusArray,
+ );
+
+ ShmemRequestStruct(&BackendAppnameBufferShmemDesc,
+ .name = "Backend Application Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendAppnameBuffer,
+ );
+
+ ShmemRequestStruct(&BackendClientHostnameBufferShmemDesc,
+ .name = "Backend Client Host Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendClientHostnameBuffer,
+ );
+
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+ ShmemRequestStruct(&BackendActivityBufferSizeShmemDesc,
+ .name = "Backend Activity Buffer",
+ .size = BackendActivityBufferSize,
+ .ptr = (void **) &BackendActivityBuffer
+ );
+
+#ifdef USE_SSL
+ ShmemRequestStruct(&BackendSslStatusBufferShmemDesc,
+ .name = "Backend SSL Status Buffer",
+ .size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendSslStatusBuffer,
+ );
+#endif
+
+#ifdef ENABLE_GSS
+ ShmemRequestStruct(&BackendGssStatusBufferShmemDesc,
+ .name = "Backend GSS Status Buffer",
+ .size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendGssStatusBuffer,
+ );
#endif
- return size;
}
/*
* Initialize the shared status array and several string buffers
* during postmaster startup.
*/
-void
-BackendStatusShmemInit(void)
+static void
+BackendStatusShmemInit(void *arg)
{
- Size size;
- bool found;
int i;
char *buffer;
- /* Create or attach to the shared array */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- BackendStatusArray = (PgBackendStatus *)
- ShmemInitStruct("Backend Status Array", size, &found);
-
- if (!found)
+ /* Initialize st_appname pointers. */
+ buffer = BackendAppnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- /*
- * We're the first - initialize.
- */
- MemSet(BackendStatusArray, 0, size);
+ BackendStatusArray[i].st_appname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared appname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendAppnameBuffer = (char *)
- ShmemInitStruct("Backend Application Name Buffer", size, &found);
-
- if (!found)
+ /* Initialize st_clienthostname pointers. */
+ buffer = BackendClientHostnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendAppnameBuffer, 0, size);
-
- /* Initialize st_appname pointers. */
- buffer = BackendAppnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_appname = buffer;
- buffer += NAMEDATALEN;
- }
+ BackendStatusArray[i].st_clienthostname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared client hostname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendClientHostnameBuffer = (char *)
- ShmemInitStruct("Backend Client Host Name Buffer", size, &found);
-
- if (!found)
+ /* Initialize st_activity pointers. */
+ buffer = BackendActivityBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendClientHostnameBuffer, 0, size);
-
- /* Initialize st_clienthostname pointers. */
- buffer = BackendClientHostnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_clienthostname = buffer;
- buffer += NAMEDATALEN;
- }
- }
-
- /* Create or attach to the shared activity buffer */
- BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
- NumBackendStatSlots);
- BackendActivityBuffer = (char *)
- ShmemInitStruct("Backend Activity Buffer",
- BackendActivityBufferSize,
- &found);
-
- if (!found)
- {
- MemSet(BackendActivityBuffer, 0, BackendActivityBufferSize);
-
- /* Initialize st_activity pointers. */
- buffer = BackendActivityBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_activity_raw = buffer;
- buffer += pgstat_track_activity_query_size;
- }
+ BackendStatusArray[i].st_activity_raw = buffer;
+ buffer += pgstat_track_activity_query_size;
}
#ifdef USE_SSL
- /* Create or attach to the shared SSL status buffer */
- size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots);
- BackendSslStatusBuffer = (PgBackendSSLStatus *)
- ShmemInitStruct("Backend SSL Status Buffer", size, &found);
-
- if (!found)
{
PgBackendSSLStatus *ptr;
- MemSet(BackendSslStatusBuffer, 0, size);
-
/* Initialize st_sslstatus pointers. */
ptr = BackendSslStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -211,17 +194,9 @@ BackendStatusShmemInit(void)
#endif
#ifdef ENABLE_GSS
- /* Create or attach to the shared GSSAPI status buffer */
- size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots);
- BackendGssStatusBuffer = (PgBackendGSSStatus *)
- ShmemInitStruct("Backend GSS Status Buffer", size, &found);
-
- if (!found)
{
PgBackendGSSStatus *ptr;
- MemSet(BackendGssStatusBuffer, 0, size);
-
/* Initialize st_gssstatus pointers. */
ptr = BackendGssStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -233,6 +208,13 @@ BackendStatusShmemInit(void)
#endif
}
+static void
+BackendStatusShmemAttach(void *arg)
+{
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+}
+
/*
* Initialize pgstats backend activity state, and set up our on-proc-exit
* hook. Called from InitPostgres and AuxiliaryProcessMain. MyProcNumber must
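Reviewer note on the backend_status.c hunks above: after the conversion, BackendStatusShmemInit() only has to point each status slot at its slice of a contiguous buffer, since the buffers themselves are requested separately. A minimal sketch of that carving loop follows. ToySlot, TOY_NAMELEN and toy_point_slots_at_buffer are hypothetical names, with TOY_NAMELEN standing in for NAMEDATALEN.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NAMELEN 64			/* stand-in for NAMEDATALEN */

/* Hypothetical slot holding a pointer into a shared string buffer. */
typedef struct ToySlot
{
	char	   *appname;
} ToySlot;

/* Give each slot its own fixed-width slice of one contiguous buffer. */
static void
toy_point_slots_at_buffer(ToySlot *slots, int nslots,
						  char *buffer, size_t stride)
{
	assert(slots != NULL && buffer != NULL);
	for (int i = 0; i < nslots; i++)
	{
		slots[i].appname = buffer;
		buffer += stride;
	}
}
```

The same loop shape covers st_clienthostname and st_activity_raw. Only the stride differs, which is why the activity buffer advances by pgstat_track_activity_query_size instead of NAMEDATALEN.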
diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c
index 33fbdca9609..602ec4978c6 100644
--- a/src/backend/utils/activity/pgstat_shmem.c
+++ b/src/backend/utils/activity/pgstat_shmem.c
@@ -14,6 +14,7 @@
#include "pgstat.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
@@ -57,6 +58,13 @@ static void pgstat_release_matching_entry_refs(bool discard_pending, ReleaseMatc
static void pgstat_setup_memcxt(void);
+static void StatsShmemRequest(void *arg);
+static void StatsShmemInit(void *arg);
+
+const ShmemCallbacks StatsShmemCallbacks = {
+ .request_fn = StatsShmemRequest,
+ .init_fn = StatsShmemInit,
+};
/* parameter for the shared hash */
static const dshash_parameters dsh_params = {
@@ -123,7 +131,7 @@ pgstat_dsa_init_size(void)
/*
* Compute shared memory space needed for cumulative statistics
*/
-Size
+static Size
StatsShmemSize(void)
{
Size sz;
@@ -150,101 +158,100 @@ StatsShmemSize(void)
}
/*
- * Initialize cumulative statistics system during startup
+ * Register shared memory area for cumulative statistics
*/
-void
-StatsShmemInit(void)
+static void
+StatsShmemRequest(void *arg)
{
- bool found;
- Size sz;
-
- sz = StatsShmemSize();
- pgStatLocal.shmem = (PgStat_ShmemControl *)
- ShmemInitStruct("Shared Memory Stats", sz, &found);
+ static ShmemStructDesc StatsShmemDesc;
- if (!IsUnderPostmaster)
- {
- dsa_area *dsa;
- dshash_table *dsh;
- PgStat_ShmemControl *ctl = pgStatLocal.shmem;
- char *p = (char *) ctl;
+ ShmemRequestStruct(&StatsShmemDesc,
+ .name = "Shared Memory Stats",
+ .size = StatsShmemSize(),
+ .ptr = (void **) &pgStatLocal.shmem,
+ );
+}
- Assert(!found);
+/*
+ * Initialize cumulative statistics system during startup
+ */
+static void
+StatsShmemInit(void *arg)
+{
+ dsa_area *dsa;
+ dshash_table *dsh;
+ PgStat_ShmemControl *ctl = pgStatLocal.shmem;
+ char *p = (char *) ctl;
- /* the allocation of pgStatLocal.shmem itself */
- p += MAXALIGN(sizeof(PgStat_ShmemControl));
+ /* the allocation of pgStatLocal.shmem itself */
+ p += MAXALIGN(sizeof(PgStat_ShmemControl));
- /*
- * Create a small dsa allocation in plain shared memory. This is
- * required because postmaster cannot use dsm segments. It also
- * provides a small efficiency win.
- */
- ctl->raw_dsa_area = p;
- dsa = dsa_create_in_place(ctl->raw_dsa_area,
- pgstat_dsa_init_size(),
- LWTRANCHE_PGSTATS_DSA, NULL);
- dsa_pin(dsa);
+ /*
+ * Create a small dsa allocation in plain shared memory. This is required
+ * because postmaster cannot use dsm segments. It also provides a small
+ * efficiency win.
+ */
+ ctl->raw_dsa_area = p;
+ dsa = dsa_create_in_place(ctl->raw_dsa_area,
+ pgstat_dsa_init_size(),
+ LWTRANCHE_PGSTATS_DSA, NULL);
+ dsa_pin(dsa);
- /*
- * To ensure dshash is created in "plain" shared memory, temporarily
- * limit size of dsa to the initial size of the dsa.
- */
- dsa_set_size_limit(dsa, pgstat_dsa_init_size());
+ /*
+ * To ensure dshash is created in "plain" shared memory, temporarily limit
+ * size of dsa to the initial size of the dsa.
+ */
+ dsa_set_size_limit(dsa, pgstat_dsa_init_size());
- /*
- * With the limit in place, create the dshash table. XXX: It'd be nice
- * if there were dshash_create_in_place().
- */
- dsh = dshash_create(dsa, &dsh_params, NULL);
- ctl->hash_handle = dshash_get_hash_table_handle(dsh);
+ /*
+ * With the limit in place, create the dshash table. XXX: It'd be nice if
+ * there were dshash_create_in_place().
+ */
+ dsh = dshash_create(dsa, &dsh_params, NULL);
+ ctl->hash_handle = dshash_get_hash_table_handle(dsh);
- /* lift limit set above */
- dsa_set_size_limit(dsa, -1);
+ /* lift limit set above */
+ dsa_set_size_limit(dsa, -1);
- /*
- * Postmaster will never access these again, thus free the local
- * dsa/dshash references.
- */
- dshash_detach(dsh);
- dsa_detach(dsa);
+ /*
+ * Postmaster will never access these again, thus free the local
+ * dsa/dshash references.
+ */
+ dshash_detach(dsh);
+ dsa_detach(dsa);
- pg_atomic_init_u64(&ctl->gc_request_count, 1);
+ pg_atomic_init_u64(&ctl->gc_request_count, 1);
- /* Do the per-kind initialization */
- for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
- {
- const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
- char *ptr;
+ /* Do the per-kind initialization */
+ for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ {
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+ char *ptr;
- if (!kind_info)
- continue;
+ if (!kind_info)
+ continue;
- /* initialize entry count tracking */
- if (kind_info->track_entry_count)
- pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
+ /* initialize entry count tracking */
+ if (kind_info->track_entry_count)
+ pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
- /* initialize fixed-numbered stats */
- if (kind_info->fixed_amount)
+ /* initialize fixed-numbered stats */
+ if (kind_info->fixed_amount)
+ {
+ if (pgstat_is_kind_builtin(kind))
+ ptr = ((char *) ctl) + kind_info->shared_ctl_off;
+ else
{
- if (pgstat_is_kind_builtin(kind))
- ptr = ((char *) ctl) + kind_info->shared_ctl_off;
- else
- {
- int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
-
- Assert(kind_info->shared_size != 0);
- ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
- ptr = ctl->custom_data[idx];
- }
-
- kind_info->init_shmem_cb(ptr);
+ int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
+
+ Assert(kind_info->shared_size != 0);
+ ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
+ ptr = ctl->custom_data[idx];
}
+
+ kind_info->init_shmem_cb(ptr);
}
}
- else
- {
- Assert(found);
- }
}
void
diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index da7503c57b6..3097e9bb1af 100644
--- a/src/include/access/nbtree.h
+++ b/src/include/access/nbtree.h
@@ -1300,8 +1300,6 @@ extern BTCycleId _bt_vacuum_cycleid(Relation rel);
extern BTCycleId _bt_start_vacuum(Relation rel);
extern void _bt_end_vacuum(Relation rel);
extern void _bt_end_vacuum_callback(int code, Datum arg);
-extern Size BTreeShmemSize(void);
-extern void BTreeShmemInit(void);
extern bytea *btoptions(Datum reloptions, bool validate);
extern bool btproperty(Oid index_oid, int attno,
IndexAMProperty prop, const char *propname,
diff --git a/src/include/access/syncscan.h b/src/include/access/syncscan.h
index 24cf33294e5..32f8332aaee 100644
--- a/src/include/access/syncscan.h
+++ b/src/include/access/syncscan.h
@@ -24,7 +24,5 @@ extern PGDLLIMPORT bool trace_syncscan;
extern void ss_report_location(Relation rel, BlockNumber location);
extern BlockNumber ss_get_location(Relation rel, BlockNumber relnblocks);
-extern void SyncScanShmemInit(void);
-extern Size SyncScanShmemSize(void);
#endif
diff --git a/src/include/access/twophase.h b/src/include/access/twophase.h
index 761d56a5f3d..1d2ff42c9b7 100644
--- a/src/include/access/twophase.h
+++ b/src/include/access/twophase.h
@@ -33,9 +33,6 @@ typedef struct GlobalTransactionData *GlobalTransaction;
/* GUC variable */
extern PGDLLIMPORT int max_prepared_xacts;
-extern Size TwoPhaseShmemSize(void);
-extern void TwoPhaseShmemInit(void);
-
extern void AtAbort_Twophase(void);
extern void PostPrepare_Twophase(void);
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index dcc12eb8cbe..1a098a91444 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -246,8 +246,6 @@ extern char *GetMockAuthenticationNonce(void);
extern bool DataChecksumsEnabled(void);
extern bool GetDefaultCharSignedness(void);
extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
-extern Size XLOGShmemSize(void);
-extern void XLOGShmemInit(void);
extern void BootStrapXLOG(uint32 data_checksum_version);
extern void InitializeWalConsistencyChecking(void);
extern void LocalProcessControlFile(bool reset);
diff --git a/src/include/access/xlogprefetcher.h b/src/include/access/xlogprefetcher.h
index 7ec40c4b78b..56a81676d92 100644
--- a/src/include/access/xlogprefetcher.h
+++ b/src/include/access/xlogprefetcher.h
@@ -34,9 +34,6 @@ typedef struct XLogPrefetcher XLogPrefetcher;
extern void XLogPrefetchReconfigure(void);
-extern size_t XLogPrefetchShmemSize(void);
-extern void XLogPrefetchShmemInit(void);
-
extern void XLogPrefetchResetStats(void);
extern XLogPrefetcher *XLogPrefetcherAllocate(XLogReaderState *reader);
diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h
index 2842106b285..ba7750dca0b 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -153,9 +153,6 @@ extern PGDLLIMPORT bool reachedConsistency;
/* Are we currently in standby mode? */
extern PGDLLIMPORT bool StandbyMode;
-extern Size XLogRecoveryShmemSize(void);
-extern void XLogRecoveryShmemInit(void);
-
extern void InitWalRecovery(ControlFileData *ControlFile,
bool *wasShutdown_ptr, bool *haveBackupLabel_ptr,
bool *haveTblspcMap_ptr);
diff --git a/src/include/access/xlogwait.h b/src/include/access/xlogwait.h
index d12531d32b8..07157f220ea 100644
--- a/src/include/access/xlogwait.h
+++ b/src/include/access/xlogwait.h
@@ -100,8 +100,6 @@ typedef struct WaitLSNState
extern PGDLLIMPORT WaitLSNState *waitLSNState;
-extern Size WaitLSNShmemSize(void);
-extern void WaitLSNShmemInit(void);
extern XLogRecPtr GetCurrentLSNForWaitType(WaitLSNType lsnType);
extern void WaitLSNWakeup(WaitLSNType lsnType, XLogRecPtr currentLSN);
extern void WaitLSNCleanup(void);
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 8e3549c3752..2786a7c5ffb 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -541,10 +541,6 @@ typedef struct PgStat_BackendPending
* Functions in pgstat.c
*/
-/* functions called from postmaster */
-extern Size StatsShmemSize(void);
-extern void StatsShmemInit(void);
-
/* Functions called during server startup / shutdown */
extern void pgstat_restore_stats(void);
extern void pgstat_discard_stats(void);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index b21d111d4d5..8954f6b28ee 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,8 +66,4 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
-/* shared memory stuff */
-extern Size AutoVacuumShmemSize(void);
-extern void AutoVacuumShmemInit(void);
-
#endif /* AUTOVACUUM_H */
diff --git a/src/include/postmaster/bgworker_internals.h b/src/include/postmaster/bgworker_internals.h
index b789caf4034..b6261bc01df 100644
--- a/src/include/postmaster/bgworker_internals.h
+++ b/src/include/postmaster/bgworker_internals.h
@@ -41,8 +41,6 @@ typedef struct RegisteredBgWorker
extern PGDLLIMPORT dlist_head BackgroundWorkerList;
-extern Size BackgroundWorkerShmemSize(void);
-extern void BackgroundWorkerShmemInit(void);
extern void BackgroundWorkerStateChange(bool allow_new_workers);
extern void ForgetBackgroundWorker(RegisteredBgWorker *rw);
extern void ReportBackgroundWorkerPID(RegisteredBgWorker *rw);
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 47470cba893..36eea0b1ab0 100644
--- a/src/include/postmaster/bgwriter.h
+++ b/src/include/postmaster/bgwriter.h
@@ -39,9 +39,6 @@ extern bool ForwardSyncRequest(const FileTag *ftag, SyncRequestType type);
extern void AbsorbSyncRequests(void);
-extern Size CheckpointerShmemSize(void);
-extern void CheckpointerShmemInit(void);
-
extern bool FirstCallSinceLastCheckpoint(void);
#endif /* _BGWRITER_H */
diff --git a/src/include/postmaster/pgarch.h b/src/include/postmaster/pgarch.h
index faa7609cd81..9772bb573a1 100644
--- a/src/include/postmaster/pgarch.h
+++ b/src/include/postmaster/pgarch.h
@@ -26,8 +26,6 @@
#define MAX_XFN_CHARS 40
#define VALID_XFN_CHARS "0123456789ABCDEF.history.backup.partial"
-extern Size PgArchShmemSize(void);
-extern void PgArchShmemInit(void);
extern bool PgArchCanRestart(void);
pg_noreturn extern void PgArchiverMain(const void *startup_data, size_t startup_data_len);
extern void PgArchWakeup(void);
diff --git a/src/include/postmaster/walsummarizer.h b/src/include/postmaster/walsummarizer.h
index a4c055066b4..b9a755fadbc 100644
--- a/src/include/postmaster/walsummarizer.h
+++ b/src/include/postmaster/walsummarizer.h
@@ -19,8 +19,6 @@
extern PGDLLIMPORT bool summarize_wal;
extern PGDLLIMPORT int wal_summary_keep_time;
-extern Size WalSummarizerShmemSize(void);
-extern void WalSummarizerShmemInit(void);
pg_noreturn extern void WalSummarizerMain(const void *startup_data, size_t startup_data_len);
extern void GetWalSummarizerState(TimeLineID *summarized_tli,
diff --git a/src/include/replication/logicalctl.h b/src/include/replication/logicalctl.h
index 495554c532c..0bc1302f130 100644
--- a/src/include/replication/logicalctl.h
+++ b/src/include/replication/logicalctl.h
@@ -14,8 +14,6 @@
#ifndef LOGICALCTL_H
#define LOGICALCTL_H
-extern Size LogicalDecodingCtlShmemSize(void);
-extern void LogicalDecodingCtlShmemInit(void);
extern void StartupLogicalDecodingStatus(bool last_status);
extern void InitializeProcessXLogLogicalInfo(void);
extern bool ProcessBarrierUpdateXLogLogicalInfo(void);
diff --git a/src/include/replication/logicallauncher.h b/src/include/replication/logicallauncher.h
index 504b710536a..5f0c1b9c682 100644
--- a/src/include/replication/logicallauncher.h
+++ b/src/include/replication/logicallauncher.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT int max_parallel_apply_workers_per_subscription;
extern void ApplyLauncherRegister(void);
extern void ApplyLauncherMain(Datum main_arg);
-extern Size ApplyLauncherShmemSize(void);
-extern void ApplyLauncherShmemInit(void);
-
extern void ApplyLauncherForgetWorkerStartTime(Oid subid);
extern void ApplyLauncherWakeupAtCommit(void);
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index eb46b41b4b7..a69faf6eaaf 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -84,8 +84,4 @@ extern void replorigin_redo(XLogReaderState *record);
extern void replorigin_desc(StringInfo buf, XLogReaderState *record);
extern const char *replorigin_identify(uint8 info);
-/* shared memory allocation */
-extern Size ReplicationOriginShmemSize(void);
-extern void ReplicationOriginShmemInit(void);
-
#endif /* PG_ORIGIN_H */
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 4b4709f6e2c..1a3557de607 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -327,10 +327,6 @@ extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
extern PGDLLIMPORT int idle_replication_slot_timeout_secs;
-/* shmem initialization functions */
-extern Size ReplicationSlotsShmemSize(void);
-extern void ReplicationSlotsShmemInit(void);
-
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
diff --git a/src/include/replication/slotsync.h b/src/include/replication/slotsync.h
index e546d0d050d..d2121cd3ed7 100644
--- a/src/include/replication/slotsync.h
+++ b/src/include/replication/slotsync.h
@@ -31,8 +31,6 @@ pg_noreturn extern void ReplSlotSyncWorkerMain(const void *startup_data, size_t
extern void ShutDownSlotSync(void);
extern bool SlotSyncWorkerCanRestart(void);
extern bool IsSyncingReplicationSlots(void);
-extern Size SlotSyncShmemSize(void);
-extern void SlotSyncShmemInit(void);
extern void SyncReplicationSlots(WalReceiverConn *wrconn);
#endif /* SLOTSYNC_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 85d24c87298..47c07574d4d 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -491,8 +491,6 @@ pg_noreturn extern void WalReceiverMain(const void *startup_data, size_t startup
extern void WalRcvRequestApplyReply(void);
/* prototypes for functions in walreceiverfuncs.c */
-extern Size WalRcvShmemSize(void);
-extern void WalRcvShmemInit(void);
extern void ShutdownWalRcv(void);
extern bool WalRcvStreaming(void);
extern bool WalRcvRunning(void);
diff --git a/src/include/replication/walsender.h b/src/include/replication/walsender.h
index a4df3b8e0ae..8952c848d19 100644
--- a/src/include/replication/walsender.h
+++ b/src/include/replication/walsender.h
@@ -41,8 +41,6 @@ extern void WalSndErrorCleanup(void);
extern void PhysicalWakeupLogicalWalSnd(void);
extern XLogRecPtr GetStandbyFlushRecPtr(TimeLineID *tli);
extern void WalSndSignals(void);
-extern Size WalSndShmemSize(void);
-extern void WalSndShmemInit(void);
extern void WalSndWakeup(bool physical, bool logical);
extern void WalSndInitStopping(void);
extern void WalSndWaitStopping(void);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index ad1b7b2216a..d2f407febdd 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -587,12 +587,8 @@ extern bool StrategyRejectBuffer(BufferAccessStrategy strategy,
extern int StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc);
extern void StrategyNotifyBgWriter(int bgwprocno);
-extern Size StrategyShmemSize(void);
-extern void StrategyInitialize(bool init);
-
/* buf_table.c */
-extern Size BufTableShmemSize(int size);
-extern void InitBufTable(int size);
+extern void BufTableShmemRequest(int size);
extern uint32 BufTableHashCode(BufferTag *tagPtr);
extern int BufTableLookup(BufferTag *tagPtr, uint32 hashcode);
extern int BufTableInsert(BufferTag *tagPtr, uint32 hashcode, int buf_id);
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index aa61a39d9e6..6837b35fc6d 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -371,10 +371,6 @@ extern void MarkDirtyAllUnpinnedBuffers(int32 *buffers_dirtied,
int32 *buffers_already_dirty,
int32 *buffers_skipped);
-/* in buf_init.c */
-extern void BufferManagerShmemInit(void);
-extern Size BufferManagerShmemSize(void);
-
/* in localbuf.c */
extern void AtProcExit_LocalBuffers(void);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index fa68e6ecece..ee3cb1dc203 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -375,8 +375,6 @@ typedef enum
/*
* function prototypes
*/
-extern void LockManagerShmemInit(void);
-extern Size LockManagerShmemSize(void);
extern void InitLockManagerAccess(void);
extern LockMethod GetLocksMethodTable(const LOCK *lock);
extern LockMethod GetLockTagsMethodTable(const LOCKTAG *locktag);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index e8e06be30c2..206d6b586ad 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -25,10 +25,18 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogPrefetchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogRecoveryShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BufferManagerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StrategyCtlShmemCallbacks)
+
+/* lock manager */
+PG_SHMEM_SUBSYSTEM(LockManagerShmemCallbacks)
/* predicate lock manager */
PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
@@ -36,6 +44,9 @@ PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackendStatusShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(TwoPhaseShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackgroundWorkerShmemCallbacks)
/* shared-inval messaging */
PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
@@ -43,11 +54,26 @@ PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
/* interprocess signaling mechanisms */
PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CheckpointerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(AutoVacuumShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationSlotsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationOriginShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSndShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalRcvShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSummarizerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(PgArchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ApplyLauncherShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SlotSyncShmemCallbacks)
/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(BTreeShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SyncScanShmemCallbacks)
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StatsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WaitLSNShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(LogicalDecodingCtlShmemCallbacks)
/* AIO subsystem. This delegates to the method-specific callbacks */
PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index ddd06304e97..a334e096e4a 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -298,14 +298,6 @@ extern PGDLLIMPORT int pgstat_track_activity_query_size;
extern PGDLLIMPORT PgBackendStatus *MyBEEntry;
-/* ----------
- * Functions called from postmaster
- * ----------
- */
-extern Size BackendStatusShmemSize(void);
-extern void BackendStatusShmemInit(void);
-
-
/* ----------
* Functions called from backends
* ----------
diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index d59c5ad0582..592ecfad3da 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -107,9 +107,13 @@ extern PGDLLEXPORT void injection_wait(const char *name,
/* track if injection points attached in this process are linked to it */
static bool injection_point_local = false;
-/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void injection_shmem_request(void *arg);
+static void injection_shmem_init(void *arg);
+
+static const ShmemCallbacks injection_shmem_callbacks = {
+ .request_fn = injection_shmem_request,
+ .init_fn = injection_shmem_init,
+};
/*
* Routine for shared memory area initialization, used as a callback
@@ -126,44 +130,26 @@ injection_point_init_state(void *ptr, void *arg)
ConditionVariableInit(&state->wait_point);
}
-/* Shared memory initialization when loading module */
static void
-injection_shmem_request(void)
+injection_shmem_request(void *arg)
{
- Size size;
-
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
+ static ShmemStructDesc InjectionPointsShmemDesc;
- size = MAXALIGN(sizeof(InjectionPointSharedState));
- RequestAddinShmemSpace(size);
+ ShmemRequestStruct(&InjectionPointsShmemDesc,
+ .name = "injection_points",
+ .size = sizeof(InjectionPointSharedState),
+ .ptr = (void **) &inj_state,
+ );
}
static void
-injection_shmem_startup(void)
+injection_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_state = ShmemInitStruct("injection_points",
- sizeof(InjectionPointSharedState),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. This is shared with the dynamic
- * initialization using a DSM.
- */
- injection_point_init_state(inj_state, NULL);
- }
-
- LWLockRelease(AddinShmemInitLock);
+ /*
+ * First time through, so initialize. This is shared with the dynamic
+ * initialization using a DSM.
+ */
+ injection_point_init_state(inj_state, NULL);
}
/*
@@ -601,9 +587,5 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- /* Shared memory initialization */
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = injection_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = injection_shmem_startup;
+ RegisterShmemCallbacks(&injection_shmem_callbacks);
}
--
2.47.3
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-02 06:58 Ashutosh Bapat <[email protected]>
parent: Heikki Linnakangas <[email protected]>
1 sibling, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-02 06:58 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Wed, Apr 1, 2026 at 11:47 PM Heikki Linnakangas <[email protected]> wrote:
>
> Yet another version attached (also available at:
> https://github.com/hlinnaka/postgres/tree/shmem-init-refactor-9). The
> main change is the shape of the ShmemRequest*() calls:
>
> On 27/03/2026 02:51, Heikki Linnakangas wrote:
> > Another idea is to use a macro to hide that from pgindent, which would
> > make the calls little less verbose anyway:
> >
> > #define ShmemRequestStruct(desc, ...) ShmemRequestStructWithOpts(desc,
> > &(ShmemRequestStructOpts) { __VA_ARGS__ })
> >
> > Then the call would be simply:
> >
> > ShmemRequestStruct(&pgssSharedStateDesc,
> > .name = "pg_stat_statements",
> > .size = sizeof(pgssSharedState),
> > .ptr = (void **) &pgss,
> > );
>
> I went with that approach. We're already doing something similar with
> XL_ROUTINE in xlogreader.h:
>
> #define XL_ROUTINE(...) &(XLogReaderRoutine){__VA_ARGS__}
>
> The calls look like this:
>
> xlogreader =
> XLogReaderAllocate(wal_segment_size, NULL,
> XL_ROUTINE(.page_read = &XLogPageRead,
> .segment_open = NULL,
> .segment_close = wal_segment_close),
> private);
>
> If we followed that example, ShmemRequestStruct() calls would look like
> this:
>
> ShmemRequestStruct(&pgssSharedStateDesc,
> SHMEM_STRUCT_OPTS(.name = "pg_stat_statements",
> .size = sizeof(pgssSharedState),
> .ptr = (void **) &pgss));
>
> However, I don't like the deep indentation, it feels like the important
> stuff is buried to the right. And pgindent insists on that. So I went
> with the proposal I quoted above, turning ShmemRequestStruct(...) itself
> into a macro. If you need more complex options setup, you can set up the
> struct without the macro and call ShmemRequestStructWithOpts() directly,
> but so far all of the callers can use the macro.
>
I like this. I have tried it only for the resizable_shmem structure
which is not complex.
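For anyone skimming the thread, here is a minimal standalone sketch of the macro shape being discussed. It is not the patch's code: DemoStructOpts, DemoStructDesc and DemoRequestStructWithOpts are hypothetical stand-ins that carry only the fields named in this thread, and the "allocation" is just a static buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the patch's option and descriptor structs. */
typedef struct DemoStructOpts
{
    const char *name;
    size_t      size;
    void      **ptr;
} DemoStructOpts;

typedef struct DemoStructDesc
{
    const char *name;
    size_t      size;
} DemoStructDesc;

/* Records the request in the descriptor and points *ptr at dummy storage. */
static void
DemoRequestStructWithOpts(DemoStructDesc *desc, const DemoStructOpts *opts)
{
    static long storage[8];     /* stands in for the shared memory area */

    assert(opts->size <= sizeof(storage));
    desc->name = opts->name;
    desc->size = opts->size;
    if (opts->ptr != NULL)
        *opts->ptr = storage;
}

/*
 * The variadic arguments become the designated initializers of a compound
 * literal, so any field the caller does not name is zero-initialized.
 */
#define DemoRequestStruct(desc, ...) \
    DemoRequestStructWithOpts(desc, &(DemoStructOpts) { __VA_ARGS__ })

typedef struct DemoSharedState
{
    int         counter;
} DemoSharedState;

static DemoSharedState *demo_state;

/* A caller in the style quoted above, including the trailing comma. */
static void
demo_request(DemoStructDesc *desc)
{
    DemoRequestStruct(desc,
                      .name = "demo",
                      .size = sizeof(DemoSharedState),
                      .ptr = (void **) &demo_state,
        );
}
```

The trailing comma before the closing parenthesis ends up inside the braces of the compound literal, where C allows it, which is why the call can be written one option per line.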
>
> Ashutosh, I think I've addressed most of your comments so far. I'm
> replying to just a few of them here that might need more discussion:
>
Thanks.
> >
> > +} shmem_startup_state;
> >
> > This isn't just startup state since the backend can toggle between
> > DONE and LATE_ATTACH_OR_INIT states after the startup. Probably
> > "shmem_state" would be a better name.
>
> Renamed to "shmem_request_state". And renamed "LATE_ATTACH_OR_INIT" to
> "AFTER_STARTUP_ATTACH_OR_INIT" to match the terminology I used elsewhere.
>
> I'm still not entirely happy with this state machine. It seems useful to
> have it for sanity checking, but it still feels a little unclear what
> state you're in at different points in the code, and as an aesthetic
> thing, the whole enum feels too prominent given that it's just for
> sanity checks.
I am ok even if it is used just for sanity checks - but with the
shared structure requests coming at any time during the life of a
server, it would be easy to get lost without those sanity checks. I
also see it being used in RegisterShmemCallbacks(), so it's not for
just sanity checks, right?
>
> > + ShmemStructDesc *desc = area->desc;
> > +
> > + AttachOrInit(desc, false, true);
> > + }
> > + list_free(requested_shmem_areas);
> > + requested_shmem_areas = NIL;
> >
> > If we pop all the nodes from the list, then the list should be NIL
> > right? Why do we need to free it?
> >
> > + else if (!init_allowed)
> > + {
> >
> > For the sake of documentation and sanity, I would add
> > Assert(!index_entry) here, possibly with a comment. Otherwise it feels
> > like we might be leaving a half-initialized entry in the hash table.
> >
> > What if attach_allowed is false and the entry is not found? Should we
> > throw an error in that case too? It would be foolish to call
> > AttachOrInit with both init_allowed and attach_allowed set to false,
> > but the API allows it and we should check for that.
> >
> > It feels like we should do something about the arguments. The function
> > is hard to read. init_allowed is actually the action the caller wants
> > to take if the entry is not found, and attach_allowed is the action
> > the caller wants to take if the entry is found.
> >
> > Also explain in the comment what attach means here, especially in
> > the case of fixed-size structures.
>
> I renamed it to AttachOrInitShmemIndexEntry, and the args to 'may_init'
> and 'may_attach'. But more importantly I added comments to explain the
> different usages. Hope that helps..
The explanation in the prologue looks good. But the function is still
confusing. Instead of the if ... else if ... chain, I feel organizing it
as below would make it more readable (this was part of one of my
earlier edit patches).
if (found)
...
else
{
if (!may_init)
error
if (!index_entry)
error
... rest of the code to initialize and attach
}
But other than that I don't have any other brilliant ideas.
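To spell out the control flow I have in mind, here is a small compilable sketch. It is only an illustration under assumptions: DemoIndexEntry, DemoResult and demo_attach_or_init are made-up names, errors are returned as codes rather than raised with ereport(), and I assume the caller has already done the index lookup, with index_entry being NULL when no entry could be found or allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome codes; the real function would ereport() instead. */
typedef enum DemoResult
{
    DEMO_ATTACHED,
    DEMO_INITIALIZED,
    DEMO_ERR_ALREADY_EXISTS,
    DEMO_ERR_NOT_FOUND,
    DEMO_ERR_INDEX_FULL
} DemoResult;

/* Hypothetical stand-in for a shmem index entry. */
typedef struct DemoIndexEntry
{
    bool        in_use;
} DemoIndexEntry;

static DemoResult
demo_attach_or_init(DemoIndexEntry *index_entry, bool found,
                    bool may_init, bool may_attach)
{
    if (found)
    {
        /* The entry exists: the only valid action is to attach. */
        if (!may_attach)
            return DEMO_ERR_ALREADY_EXISTS;
        return DEMO_ATTACHED;
    }
    else
    {
        /* The entry does not exist: the only valid action is to init. */
        if (!may_init)
            return DEMO_ERR_NOT_FOUND;
        if (index_entry == NULL)
            return DEMO_ERR_INDEX_FULL;

        /* ... rest of the code to initialize and attach ... */
        index_entry->in_use = true;
        return DEMO_INITIALIZED;
    }
}
```

With this shape each flag is checked in exactly one branch, and calling it with both flags false fails on either path.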
>
> On 01/04/2026 14:59, Ashutosh Bapat wrote:
> > 0008
> > ------
> > - LWLockRelease(AddinShmemInitLock);
> > + /* The hash table must be initialized already */
> > + Assert(pgss_hash != NULL);
> >
> > Does it make sense to also Assert(pgss)? A broader question is do we
> > want to make it a pattern that every user of ShmemRequest*() also
> > Assert()s that the pointer is non-NULL in the init callback? It is a
> > test that the ShmemRequest*(), which is far from init_fn, is working
> > correctly.
>
> The function does a lot of accesses of 'pgss' so if that's NULL you'll
> get a crash pretty quickly. I'm not sure if the Assert(pgss_hash !=
> NULL) is really needed either, but I'm inclined to keep it, as pgss_hash
> might not otherwise be accessed in the function, and there are runtime
> checks for it in the other functions, so if it's not initialized for
> some reason, things might still appear to work to some extent. I don't
> think I want to have that as a broader pattern though.
In an assert-enabled build, a failed Assert() at least shows up in the
server log, which gives a good starting point for the investigation.
Without the Assert, you just get a segmentation fault with no hint of
where it came from. That's a mild benefit of the Assert.
>
> > + /*
> > + * Extra space to reserve in the shared memory segment, but it's not part
> > + * of the struct itself. This is used for shared memory hash tables that
> > + * can grow beyond the initial size when more buckets are allocated.
> > + */
> > + size_t extra_size;
> >
> > When we introduce resizable structures (where even the hash table
> > directly itself could be resizable), we will introduce a new field
> > max_size which is easy to get confused with extra_size. Maybe we can
> > rename extra_size to something like "auxiliary_size" to mean size of
> > the auxiliary parts of the structure which are not part of the main
> > struct itself.
> >
> > + /*
> > + * max_size is the estimated maximum number of hashtable entries. This is
> > + * not a hard limit, but the access efficiency will degrade if it is
> > + * exceeded substantially (since it's used to compute directory size and
> > + * the hash table buckets will get overfull).
> > + */
> > + size_t max_size;
> > +
> > + /*
> > + * init_size is the number of hashtable entries to preallocate. For a
> > + * table whose maximum size is certain, this should be equal to max_size;
> > + * that ensures that no run-time out-of-shared-memory failures can occur.
> > + */
> > + size_t init_size;
> >
> > Every time I look at these two fields, I question whether those are the
> > number of entries (i.e. size of the hash table) or number of bytes
> > (size of the memory). I know it's the former, but it indicates that
> > something needs to be changed here, like changing the names to have
> > _entries instead of _size, or changing the type to int64 or some such.
> > Renaming to _entries would conflict with dynahash APIs since they use
> > _size, so maybe the latter?
>
> I hear you, but I didn't change these yet. If we go with the patches
> from the "Shared hash table allocations" thread, max_size and init_size
> will be merged into one. I'll try to settle that thread before making
> changes here.
Will review those patches next.
>
> > -void
> > -InitProcGlobal(void)
> > +static void
> > +ProcGlobalShmemInit(void *arg)
> > {
>
> I'm not sure what you meant to say here, but I did notice that there
> were a bunch of references to InitProcGlobal() left over in comments.
> Fixed those.
Oh, I just wanted to say that the new version reads much better than
the old version, which had ShmemInitStruct() calls sprinkled at seemingly
random places. I missed writing that. Nothing serious there.
I also rebased my resizable shmem patch on v9. Attached here. I have
addressed the following open items from the list at [1]
1. The test is stable now. I found a way to make (roughly) sure that
we are not allocating more memory than required for a resizable
structure.
2. Disable the feature on platforms that do not have
MADV_POPULATE_WRITE and MADV_REMOVE. The feature is also disabled in
the EXEC_BACKEND case. I have tested the EXEC_BACKEND case, but I have
not tested on Windows or on platforms which do not define those
constants.
The first two items from [1] need some discussion still.
[1] https://www.postgresql.org/message-id/[email protected]...
--
Best Wishes,
Ashutosh Bapat
Attachments:
[application/octet-stream] v9-resizable_shmem_struct.patch.nocibot (45.6K, 2-v9-resizable_shmem_struct.patch.nocibot)
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-02 22:10 Matthias van de Meent <[email protected]>
parent: Heikki Linnakangas <[email protected]>
1 sibling, 2 replies; 75+ messages in thread
From: Matthias van de Meent @ 2026-04-02 22:10 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Wed, 1 Apr 2026 at 20:17, Heikki Linnakangas <[email protected]> wrote:
>
> Yet another version attached (also available at:
> https://github.com/hlinnaka/postgres/tree/shmem-init-refactor-9). The
> main change is the shape of the ShmemRequest*() calls:
I didn't read the whole thread, as it's quite long, but did look at
the patchset for a while to figure out where it's going.
0005:
A few assorted comments:
While I do think it's an improvement over the current APIs, the
improvement seems to be mostly concentrated in the RequestStruct/Hash
department, with only marginal improvements in RegisterShmemCallbacks.
I feel like it's missing the important part: I'd like
direct-from-_PG_init() ShmemRequestStruct/Hash calls. If
ShmemRequestStruct/Hash had a size callback as alternative to the size
field (which would then be called after preload_libraries finishes)
then that would be sufficient for most shmem allocations, and it'd
simplify shmem management for most subsystems.
We'd still need the shmem lifecycle hooks/RegisterShmemCallbacks to
allow conditionally allocated shmem areas (e.g. those used in aio),
but I think that, in general, we shouldn't need a separate callback
function just to get started registering shmem structures.
I also noticed that ShmemCallbacks.%_arg are generally undocumented,
and I couldn't find any users in core (at the end of the patchset)
that actually use the argument. Could it be I missed something?
I don't understand the use of ShmemStructDesc. They generally/always
are private to request_fn(), and their fields are used exclusively
inside the shmem mechanisms, with no reads of its fields that can't
already be deduced from context. Why do we need that struct
everywhere?
> +++ b/src/backend/storage/ipc/shmem.c
[...]
> + /* Check that it's not already registered in this process */
> + foreach_ptr(ShmemStructDesc, existing, pending_shmem_requests)
> + {
> + if (strcmp(existing->name, options->name) == 0)
> + ereport(ERROR,
> + (errmsg("shared memory struct \"%s\" is already registered",
> + options->name)));
> + }
> +
> + request = palloc(sizeof(ShmemRequest));
> + request->options = options;
> + request->desc = desc;
> + request->kind = kind;
> + pending_shmem_requests = lappend(pending_shmem_requests, request);
Apparently, pending_shmem_requests is a list of ShmemRequest, but the
iteration just above on the same list assumes ShmemStructDesc, which
seems wrong to me.
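For illustration, a standalone sketch of what I'd expect the duplicate check to look like if the list elements really are requests, each pointing at its options. The types here (DemoOpts, DemoRequest) are hypothetical, and a plain linked list stands in for the List:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: a request points at its options, which hold the name. */
typedef struct DemoOpts
{
    const char *name;
} DemoOpts;

typedef struct DemoRequest
{
    const DemoOpts *options;
    struct DemoRequest *next;
} DemoRequest;

/*
 * Iterate with the element type that was actually appended to the list,
 * and reach the name through the request's options.
 */
static bool
demo_is_registered(const DemoRequest *pending, const char *name)
{
    for (const DemoRequest *r = pending; r != NULL; r = r->next)
    {
        if (strcmp(r->options->name, name) == 0)
            return true;
    }
    return false;
}
```

The point is only that the loop variable's type has to match what lappend() put on the list; with a mismatched struct the name comparison reads whatever happens to be at that offset.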
0017:
I like this idea, but I think it misses an opportunity to reduce the
space wasted on alignment: we know which structs we're going to allocate
and at which alignments, so we could save space by packing them. I don't
expect it to save much, but it could be a few hundred kB with a few
BLCKSZ-aligned allocations.
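A rough sketch of the kind of packing I mean, as standalone C with made-up names (DemoAlloc, demo_layout_size, demo_packed_size). It assumes power-of-two alignments and simply sorts the requests by descending alignment before laying them out. That is a heuristic, not an optimal packer, and nothing like it is in the patchset as far as I can tell.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical description of one requested area. */
typedef struct DemoAlloc
{
    size_t      size;
    size_t      align;          /* assumed to be a power of two */
} DemoAlloc;

/* Round off up to the next multiple of align (power of two). */
static size_t
demo_align_up(size_t off, size_t align)
{
    return (off + align - 1) & ~(align - 1);
}

/* Total bytes needed when the areas are laid out in the given order. */
static size_t
demo_layout_size(const DemoAlloc *allocs, size_t n)
{
    size_t      off = 0;

    for (size_t i = 0; i < n; i++)
        off = demo_align_up(off, allocs[i].align) + allocs[i].size;
    return off;
}

/* qsort comparator: larger alignments first. */
static int
demo_cmp_align_desc(const void *a, const void *b)
{
    const DemoAlloc *x = a;
    const DemoAlloc *y = b;

    if (x->align > y->align)
        return -1;
    if (x->align < y->align)
        return 1;
    return 0;
}

/* Sorts the requests in place by descending alignment, then lays them out. */
static size_t
demo_packed_size(DemoAlloc *allocs, size_t n)
{
    qsort(allocs, n, sizeof(DemoAlloc), demo_cmp_align_desc);
    return demo_layout_size(allocs, n);
}
```

For the request sequence 1 byte, 8192 bytes (8192-aligned), 1 byte, 8192 bytes (8192-aligned), the in-order layout needs 32768 bytes and the sorted one 16386, because the padding in front of each large area disappears.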
Kind regards,
Matthias van de Meent
Databricks (https://www.databricks.com)
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-03 13:12 Ashutosh Bapat <[email protected]>
parent: Matthias van de Meent <[email protected]>
1 sibling, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-03 13:12 UTC (permalink / raw)
To: Matthias van de Meent <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Fri, Apr 3, 2026 at 3:40 AM Matthias van de Meent
<[email protected]> wrote:
>
> On Wed, 1 Apr 2026 at 20:17, Heikki Linnakangas <[email protected]> wrote:
> >
>
> While I do think it's an improvement over the current APIs, the
> improvement seems to be mostly concentrated in the RequestStruct/Hash
> department, with only marginal improvements in RegisterShmemCallbacks.
> I feel like it's missing the important part: I'd like
> direct-from-_PG_init() ShmemRequestStruct/Hash calls. If
> ShmemRequestStruct/Hash had a size callback as alternative to the size
> field (which would then be called after preload_libraries finishes)
> then that would be sufficient for most shmem allocations, and it'd
> simplify shmem management for most subsystems.
> We'd still need the shmem lifecycle hooks/RegisterShmemCallbacks to
> allow conditionally allocated shmem areas (e.g. those used in aio),
> but I think that, in general, we shouldn't need a separate callback
> function just to get started registering shmem structures.
>
> I also noticed that ShmemCallbacks.%_arg are generally undocumented,
> and I couldn't find any users in core (at the end of the patchset)
> that actually use the argument. Could it be I missed something?
>
> I don't understand the use of ShmemStructDesc. They generally/always
> are private to request_fn(), and their fields are used exclusively
> inside the shmem mechanisms, with no reads of its fields that can't
> already be deduced from context. Why do we need that struct
> everywhere?
My resizable shared memory structure patches use it as a handle to the
structure to be resized.
--
Best Wishes,
Ashutosh Bapat
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-04 00:45 Heikki Linnakangas <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-04 00:45 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; Matthias van de Meent <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 03/04/2026 16:12, Ashutosh Bapat wrote:
> On Fri, Apr 3, 2026 at 3:40 AM Matthias van de Meent
> <[email protected]> wrote:
>> While I do think it's an improvement over the current APIs, the
>> improvement seems to be mostly concentrated in the RequestStruct/Hash
>> department, with only marginal improvements in RegisterShmemCallbacks.
>> I feel like it's missing the important part: I'd like
>> direct-from-_PG_init() ShmemRequestStruct/Hash calls. If
>> ShmemRequestStruct/Hash had a size callback as alternative to the size
>> field (which would then be called after preload_libraries finishes)
>> then that would be sufficient for most shmem allocations, and it'd
>> simplify shmem management for most subsystems.
>> We'd still need the shmem lifecycle hooks/RegisterShmemCallbacks to
>> allow conditionally allocated shmem areas (e.g. those used in aio),
>> but I think that, in general, we shouldn't need a separate callback
>> function just to get started registering shmem structures.
>>
>> I also noticed that ShmemCallbacks.%_arg are generally undocumented,
>> and I couldn't find any users in core (at the end of the patchset)
>> that actually use the argument. Could it be I missed something?
None of the current code uses it, that's correct. I felt it
might become very handy in the future or in extensions, if you wanted to
reuse the same function for initializing different shmem areas, for
example. It's a pretty common pattern to have an opaque pointer like
that in any callbacks.
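As a sketch of the kind of reuse I have in mind (standalone C, all names hypothetical; in particular DemoCallbacks and its init_fn_arg field only mimic the shape of ShmemCallbacks and are not its definition):

```c
#include <assert.h>
#include <stddef.h>

typedef void (*DemoShmemCallback) (void *arg);

/* Hypothetical callbacks struct: a function plus an opaque argument for it. */
typedef struct DemoCallbacks
{
    DemoShmemCallback init_fn;
    void       *init_fn_arg;
} DemoCallbacks;

/* Per-area parameters handed to the shared init function through 'arg'. */
typedef struct DemoCounterArea
{
    int        *counter;
    int         start;
} DemoCounterArea;

/* One init function that serves any number of areas. */
static void
demo_counter_init(void *arg)
{
    DemoCounterArea *area = arg;

    *area->counter = area->start;
}

static int  demo_counter_a;
static int  demo_counter_b;
static DemoCounterArea demo_area_a = {&demo_counter_a, 10};
static DemoCounterArea demo_area_b = {&demo_counter_b, 20};

static const DemoCallbacks demo_callbacks_a = {
    .init_fn = demo_counter_init,
    .init_fn_arg = &demo_area_a,
};
static const DemoCallbacks demo_callbacks_b = {
    .init_fn = demo_counter_init,
    .init_fn_arg = &demo_area_b,
};

/* What the shmem machinery would do at init time. */
static void
demo_run_init(const DemoCallbacks *cb)
{
    if (cb->init_fn != NULL)
        cb->init_fn(cb->init_fn_arg);
}
```

Without the opaque argument, each area would need its own near-identical init function or a global to tell them apart.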
>> I don't understand the use of ShmemStructDesc. They generally/always
>> are private to request_fn(), and their fields are used exclusively
>> inside the shmem mechanisms, with no reads of its fields that can't
>> already be deduced from context. Why do we need that struct
>> everywhere?
>
> My resizable shared memory structure patches use it as a handle to the
> structure to be resized.
Right. And hash tables and SLRUs use a desc-like object already, so for
symmetry it feels natural to have it for plain structs too.
I wonder if we should make it optional though, for the common case that
you have no intention of doing anything more with the shmem region that
you'd need a desc for. I'm thinking you could just pass NULL for the
desc pointer:
ShmemRequestStruct(NULL,
.name = "pg_stat_statements",
.size = sizeof(pgssSharedState),
.ptr = (void **) &pgss,
);
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-04 00:49 Heikki Linnakangas <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 3 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-04 00:49 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 02/04/2026 09:58, Ashutosh Bapat wrote:
> On Wed, Apr 1, 2026 at 11:47 PM Heikki Linnakangas <[email protected]> wrote:
>>> + /*
>>> + * Extra space to reserve in the shared memory segment, but it's not part
>>> + * of the struct itself. This is used for shared memory hash tables that
>>> + * can grow beyond the initial size when more buckets are allocated.
>>> + */
>>> + size_t extra_size;
>>>
>>> When we introduce resizable structures (where even the hash table
>>> directly itself could be resizable), we will introduce a new field
>>> max_size which is easy to get confused with extra_size. Maybe we can
>>> rename extra_size to something like "auxiliary_size" to mean size of
>>> the auxiliary parts of the structure which are not part of the main
>>> struct itself.
>>>
>>> + /*
>>> + * max_size is the estimated maximum number of hashtable entries. This is
>>> + * not a hard limit, but the access efficiency will degrade if it is
>>> + * exceeded substantially (since it's used to compute directory size and
>>> + * the hash table buckets will get overfull).
>>> + */
>>> + size_t max_size;
>>> +
>>> + /*
>>> + * init_size is the number of hashtable entries to preallocate. For a
>>> + * table whose maximum size is certain, this should be equal to max_size;
>>> + * that ensures that no run-time out-of-shared-memory failures can occur.
>>> + */
>>> + size_t init_size;
>>>
>>> Every time I look at these two fields, I question whether those are the
>>> number of entries (i.e. size of the hash table) or number of bytes
>>> (size of the memory). I know it's the former, but it indicates that
>>> something needs to be changed here, like changing the names to have
>>> _entries instead of _size, or changing the type to int64 or some such.
>>> Renaming to _entries would conflict with dynahash APIs since they use
>>> _size, so maybe the latter?
>>
>> I hear you, but I didn't change these yet. If we go with the patches
>> from the "Shared hash table allocations" thread, max_size and init_size
>> will be merged into one. I'll try to settle that thread before making
>> changes here.
>
> Will review those patches next.
Those are now committed, and here's a new version rebased over those
changes. The hash option is now called 'nelems', and the 'extra_size'
in ShmemStructOpts is gone.
Plus a bunch of other fixes and cleanups. I also reordered and
re-grouped the patches a little, into more logical increments I hope.
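For anyone skimming 0001 without the tree at hand: the alloc callback that moves to shmem_hash.c is essentially a bump allocator over a region that was sized up front. Below is a standalone sketch of that idea, not the patch code. The names sketch_allocator, sketch_alloc and SKETCH_ALIGN are hypothetical, and the 8-byte alignment is an assumption standing in for MAXALIGN.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MAXALIGN; assumes 8-byte maximum alignment. */
#define SKETCH_ALIGN(sz) (((sz) + 7) & ~(size_t) 7)

/* Tracks the unused tail of a pre-allocated region. */
typedef struct sketch_allocator
{
	char	   *next;			/* first free byte in the region */
	char	   *end;			/* one past the last byte of the region */
} sketch_allocator;

/*
 * Carve 'size' bytes (rounded up to the alignment) out of the region.
 * Returns NULL if the remaining space is too small. Nothing is ever freed.
 */
static void *
sketch_alloc(sketch_allocator *a, size_t size)
{
	void	   *result;

	assert(a->next <= a->end);

	size = SKETCH_ALIGN(size);
	if ((size_t) (a->end - a->next) < size)
		return NULL;

	result = a->next;
	a->next += size;
	return result;
}
```

Because every allocation happens during initialization in a single process, a scheme like this needs no locking, which is the same argument the ShmemHashAlloc comment makes for HASH_FIXED_SIZE tables.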
- Heikki
Attachments:
[text/x-patch] 0001-refactor-Move-ShmemInitHash-to-separate-file.patch (11.0K, 2-0001-refactor-Move-ShmemInitHash-to-separate-file.patch)
From c11b535e7aa05453059d2cab7be79fa803753f6f Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 13:07:28 +0200
Subject: [PATCH 01/14] refactor: Move ShmemInitHash to separate file
In preparation for next commits
---
src/backend/storage/ipc/Makefile | 1 +
src/backend/storage/ipc/meson.build | 1 +
src/backend/storage/ipc/shmem.c | 108 ----------------------
src/backend/storage/ipc/shmem_hash.c | 130 +++++++++++++++++++++++++++
src/include/storage/shmem.h | 9 +-
5 files changed, 139 insertions(+), 110 deletions(-)
create mode 100644 src/backend/storage/ipc/shmem_hash.c
diff --git a/src/backend/storage/ipc/Makefile b/src/backend/storage/ipc/Makefile
index 9a07f6e1d92..f71653bbe48 100644
--- a/src/backend/storage/ipc/Makefile
+++ b/src/backend/storage/ipc/Makefile
@@ -22,6 +22,7 @@ OBJS = \
shm_mq.o \
shm_toc.o \
shmem.o \
+ shmem_hash.o \
signalfuncs.o \
sinval.o \
sinvaladt.o \
diff --git a/src/backend/storage/ipc/meson.build b/src/backend/storage/ipc/meson.build
index 9c1ca954d9d..b8c31e29967 100644
--- a/src/backend/storage/ipc/meson.build
+++ b/src/backend/storage/ipc/meson.build
@@ -14,6 +14,7 @@ backend_sources += files(
'shm_mq.c',
'shm_toc.c',
'shmem.c',
+ 'shmem_hash.c',
'signalfuncs.c',
'sinval.c',
'sinvaladt.c',
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 3cb51ad62f8..c994f7674ec 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -96,9 +96,6 @@ typedef struct ShmemAllocatorData
#define ShmemIndexLock (&ShmemAllocator->index_lock)
-static HTAB *shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
-static void *ShmemHashAlloc(Size size, void *alloc_arg);
static void *ShmemAllocRaw(Size size, Size *allocated_size);
/* shared memory global variables */
@@ -257,29 +254,6 @@ ShmemAllocNoError(Size size)
return ShmemAllocRaw(size, &allocated_size);
}
-/*
- * ShmemHashAlloc -- alloc callback for shared memory hash tables
- *
- * Carve out the allocation from a pre-allocated region. All shared memory
- * hash tables are initialized with HASH_FIXED_SIZE, so all the allocations
- * happen upfront during initialization and no locking is required.
- */
-static void *
-ShmemHashAlloc(Size size, void *alloc_arg)
-{
- shmem_hash_allocator *allocator = (shmem_hash_allocator *) alloc_arg;
- void *result;
-
- size = MAXALIGN(size);
-
- if (allocator->end - allocator->next < size)
- return NULL;
- result = allocator->next;
- allocator->next += size;
-
- return result;
-}
-
/*
* ShmemAllocRaw -- allocate align chunk and return allocated size
*
@@ -341,88 +315,6 @@ ShmemAddrIsValid(const void *addr)
return (addr >= ShmemBase) && (addr < ShmemEnd);
}
-/*
- * ShmemInitHash -- Create and initialize, or attach to, a
- * shared memory hash table.
- *
- * We assume caller is doing some kind of synchronization
- * so that two processes don't try to create/initialize the same
- * table at once. (In practice, all creations are done in the postmaster
- * process; child processes should always be attaching to existing tables.)
- *
- * nelems is the maximum number of hashtable entries.
- *
- * *infoP and hash_flags must specify at least the entry sizes and key
- * comparison semantics (see hash_create()). Flag bits and values specific
- * to shared-memory hash tables are added here, except that callers may
- * choose to specify HASH_PARTITION.
- *
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
- */
-HTAB *
-ShmemInitHash(const char *name, /* table string name for shmem index */
- int64 nelems, /* size of the table */
- HASHCTL *infoP, /* info about key and bucket size */
- int hash_flags) /* info about infoP */
-{
- bool found;
- size_t size;
- void *location;
-
- size = hash_estimate_size(nelems, infoP->entrysize);
-
- /* look it up in the shmem index or allocate */
- location = ShmemInitStruct(name, size, &found);
-
- return shmem_hash_create(location, size, found,
- name, nelems, infoP, hash_flags);
-}
-
-/*
- * Initialize or attach to a shared hash table in the given shmem region.
- *
- * This is extracted from ShmemInitHash() to allow InitShmemAllocator() to
- * share the logic for bootstrapping the ShmemIndex hash table.
- */
-static HTAB *
-shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
-{
- shmem_hash_allocator allocator;
-
- /*
- * Hash tables allocated in shared memory have a fixed directory and have
- * all elements allocated upfront. We don't support growing because we'd
- * need to grow the underlying shmem region with it.
- *
- * The shared memory allocator must be specified too.
- */
- infoP->alloc = ShmemHashAlloc;
- infoP->alloc_arg = NULL;
- hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_FIXED_SIZE;
-
- /*
- * if it already exists, attach to it rather than allocate and initialize
- * new space
- */
- if (!found)
- {
- allocator.next = (char *) location;
- allocator.end = (char *) location + size;
- infoP->alloc_arg = &allocator;
- }
- else
- {
- /* Pass location of hashtable header to hash_create */
- infoP->hctl = (HASHHDR *) location;
- hash_flags |= HASH_ATTACH;
- }
-
- return hash_create(name, nelems, infoP, hash_flags);
-}
-
/*
* ShmemInitStruct -- Create/attach to a structure in shared memory.
*
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
new file mode 100644
index 00000000000..0b05730129e
--- /dev/null
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -0,0 +1,130 @@
+/*-------------------------------------------------------------------------
+ *
+ * shmem_hash.c
+ * hash table implementation in shared memory
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * A shared memory hash table implementation on top of the named, fixed-size
+ * shared memory areas managed by shmem.c. Hash tables have a fixed maximum
+ * size, but their actual size can vary dynamically. When entries are added
+ * to the table, more space is allocated. Each shared data structure and hash
+ * has a string name to identify it.
+ *
+ * IDENTIFICATION
+ * src/backend/storage/ipc/shmem_hash.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "storage/shmem.h"
+
+static void *ShmemHashAlloc(Size size, void *alloc_arg);
+
+/*
+ * ShmemInitHash -- Create and initialize, or attach to, a
+ * shared memory hash table.
+ *
+ * We assume caller is doing some kind of synchronization
+ * so that two processes don't try to create/initialize the same
+ * table at once. (In practice, all creations are done in the postmaster
+ * process; child processes should always be attaching to existing tables.)
+ *
+ * nelems is the maximum number of hashtable entries.
+ *
+ * *infoP and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values specific
+ * to shared-memory hash tables are added here, except that callers may
+ * choose to specify HASH_PARTITION.
+ *
+ * Note: before Postgres 9.0, this function returned NULL for some failure
+ * cases. Now, it always throws error instead, so callers need not check
+ * for NULL.
+ */
+HTAB *
+ShmemInitHash(const char *name, /* table string name for shmem index */
+ int64 nelems, /* size of the table */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ bool found;
+ size_t size;
+ void *location;
+
+ size = hash_estimate_size(nelems, infoP->entrysize);
+
+ /* look it up in the shmem index or allocate */
+ location = ShmemInitStruct(name, size, &found);
+
+ return shmem_hash_create(location, size, found,
+ name, nelems, infoP, hash_flags);
+}
+
+/*
+ * Initialize or attach to a shared hash table in the given shmem region.
+ *
+ * This is extracted from ShmemInitHash() to allow InitShmemAllocator() to
+ * share the logic for bootstrapping the ShmemIndex hash table.
+ */
+HTAB *
+shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
+{
+ shmem_hash_allocator allocator;
+
+ /*
+ * Hash tables allocated in shared memory have a fixed directory and have
+ * all elements allocated upfront. We don't support growing because we'd
+ * need to grow the underlying shmem region with it.
+ *
+ * The shared memory allocator must be specified too.
+ */
+ infoP->alloc = ShmemHashAlloc;
+ infoP->alloc_arg = NULL;
+ hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_FIXED_SIZE;
+
+ /*
+ * if it already exists, attach to it rather than allocate and initialize
+ * new space
+ */
+ if (!found)
+ {
+ allocator.next = (char *) location;
+ allocator.end = (char *) location + size;
+ infoP->alloc_arg = &allocator;
+ }
+ else
+ {
+ /* Pass location of hashtable header to hash_create */
+ infoP->hctl = (HASHHDR *) location;
+ hash_flags |= HASH_ATTACH;
+ }
+
+ return hash_create(name, nelems, infoP, hash_flags);
+}
+
+/*
+ * ShmemHashAlloc -- alloc callback for shared memory hash tables
+ *
+ * Carve out the allocation from a pre-allocated region. All shared memory
+ * hash tables are initialized with HASH_FIXED_SIZE, so all the allocations
+ * happen upfront during initialization and no locking is required.
+ */
+static void *
+ShmemHashAlloc(Size size, void *alloc_arg)
+{
+ shmem_hash_allocator *allocator = (shmem_hash_allocator *) alloc_arg;
+ void *result;
+
+ size = MAXALIGN(size);
+
+ if (allocator->end - allocator->next < size)
+ return NULL;
+ result = allocator->next;
+ allocator->next += size;
+
+ return result;
+}
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 2fccbdb534c..bbf32523c0b 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -31,15 +31,20 @@ typedef struct PGShmemHeader PGShmemHeader; /* avoid including
extern void InitShmemAllocator(PGShmemHeader *seghdr);
extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
+extern void *ShmemHashAlloc(Size size, void *alloc_arg);
extern bool ShmemAddrIsValid(const void *addr);
-extern HTAB *ShmemInitHash(const char *name, int64 nelems,
- HASHCTL *infoP, int hash_flags);
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
+/* shmem_hash.c */
+extern HTAB *ShmemInitHash(const char *name, int64 nelems,
+ HASHCTL *infoP, int hash_flags);
+extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
+
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
--
2.47.3
[text/x-patch] 0002-Introduce-a-new-mechanism-for-registering-shared-mem.patch (59.1K, 3-0002-Introduce-a-new-mechanism-for-registering-shared-mem.patch)
From fac682ae503a5d03795145328e2bc4bbcae2feb9 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 20:01:39 +0300
Subject: [PATCH 02/14] Introduce a new mechanism for registering shared memory
areas
This merges the separate [Subsystem]ShmemSize() and
[Subsystem]ShmemInit() phases at postmaster startup. Each subsystem is
now called into just once, before the shared memory segment has been
allocated, to register or "request" the subsystem's shared memory
needs. This is more ergonomic, as you only need to calculate the size
once.
This replaces ShmemInitStruct() and ShmemInitHash(), which become just
backwards-compatibility wrappers around the new functions. In future
commits, I plan to replace all ShmemInitStruct() and ShmemInitHash()
calls with the new functions, although we'll still need to keep them
around for extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 4 +-
doc/src/sgml/xfunc.sgml | 162 +++--
src/backend/bootstrap/bootstrap.c | 1 +
src/backend/postmaster/launch_backend.c | 4 +
src/backend/postmaster/postmaster.c | 18 +-
src/backend/storage/ipc/ipci.c | 29 +-
src/backend/storage/ipc/shmem.c | 798 ++++++++++++++++++++----
src/backend/storage/ipc/shmem_hash.c | 77 ++-
src/backend/storage/lmgr/proc.c | 3 +
src/backend/tcop/postgres.c | 9 +-
src/include/storage/shmem.h | 213 ++++++-
src/test/modules/test_aio/test_aio.c | 1 -
src/tools/pgindent/typedefs.list | 9 +-
13 files changed, 1129 insertions(+), 199 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9ee1a2bfc6a..2ebec6928d5 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
Anonymous allocations are allocations that have been made
with <literal>ShmemAlloc()</literal> directly, rather than via
- <literal>ShmemInitStruct()</literal> or
- <literal>ShmemInitHash()</literal>.
+ <literal>ShmemRequestStruct()</literal> or
+ <literal>ShmemRequestHash()</literal>.
</para>
<para>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 70e815b8a2c..aed3f2f0071 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3628,71 +3628,132 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
Add-ins can reserve shared memory on server startup. To do so, the
add-in's shared library must be preloaded by specifying it in
<xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
- The shared library should also register a
- <literal>shmem_request_hook</literal> in its
- <function>_PG_init</function> function. This
- <literal>shmem_request_hook</literal> can reserve shared memory by
- calling:
+ The shared library should register callbacks in
+ its <function>_PG_init</function> function, which then get called at the
+ right stages of the system startup to initialize the shared memory.
+ Here is an example:
<programlisting>
-void RequestAddinShmemSpace(Size size)
-</programlisting>
- Each backend should obtain a pointer to the reserved shared memory by
- calling:
-<programlisting>
-void *ShmemInitStruct(const char *name, Size size, bool *foundPtr)
-</programlisting>
- If this function sets <literal>foundPtr</literal> to
- <literal>false</literal>, the caller should proceed to initialize the
- contents of the reserved shared memory. If <literal>foundPtr</literal>
- is set to <literal>true</literal>, the shared memory was already
- initialized by another backend, and the caller need not initialize
- further.
- </para>
+typedef struct MyShmemData {
+ LWLock lock; /* protects the fields below */
- <para>
- To avoid race conditions, each backend should use the LWLock
- <function>AddinShmemInitLock</function> when initializing its allocation
- of shared memory, as shown here:
-<programlisting>
-static mystruct *ptr = NULL;
-bool found;
+ ... shared memory contents ...
+} MyShmemData;
+
+static MyShmemData *MyShmem; /* pointer to the struct in shared memory */
+
+static void my_shmem_request(void *arg);
+static void my_shmem_init(void *arg);
+
+const ShmemCallbacks my_shmem_callbacks = {
+ .request_fn = my_shmem_request,
+ .init_fn = my_shmem_init,
+};
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+ /*
+ * In order to create our shared memory area, we have to be loaded via
+ * shared_preload_libraries.
+ */
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ /* Register our shared memory needs */
+ RegisterShmemCallbacks(&my_shmem_callbacks);
+}
+
+/* callback to request */
+static void
+my_shmem_request(void *arg)
+{
+ /* A persistent handle to the shared memory area in this backend */
+ static ShmemStructDesc MyShmemDesc;
+
+ ShmemRequestStruct(&MyShmemDesc,
+ .name = "My shmem area",
+ .size = sizeof(MyShmemData),
+ .ptr = (void **) &MyShmem,
+ );
+}
-LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-ptr = ShmemInitStruct("my struct name", size, &found);
-if (!found)
+/* callback to initialize the contents of the MyShmem area at startup */
+static void
+my_shmem_init(void *arg)
{
- ... initialize contents of shared memory ...
- ptr->locks = GetNamedLWLockTranche("my tranche name");
+ int tranche_id;
+
+ /* Initialize the lock */
+ tranche_id = LWLockNewTrancheId("my tranche name");
+ LWLockInitialize(&MyShmem->lock, tranche_id);
+
+ ... initialize the rest of MyShmem fields ...
}
-LWLockRelease(AddinShmemInitLock);
+
</programlisting>
- <literal>shmem_startup_hook</literal> provides a convenient place for the
- initialization code, but it is not strictly required that all such code
- be placed in this hook. On Windows (and anywhere else where
- <literal>EXEC_BACKEND</literal> is defined), each backend executes the
- registered <literal>shmem_startup_hook</literal> shortly after it
- attaches to shared memory, so add-ins should still acquire
- <function>AddinShmemInitLock</function> within this hook, as shown in the
- example above. On other platforms, only the postmaster process executes
- the <literal>shmem_startup_hook</literal>, and each backend automatically
- inherits the pointers to shared memory.
+ The <function>request_fn</function> callback is called during system
+ startup, before the shared memory has been allocated. It should call
+ <function>ShmemRequestStruct()</function> to register the add-in's
+ shared memory needs. Note that <function>ShmemRequestStruct()</function>
+ doesn't immediately allocate or initialize the memory; it merely
+ registers the space to be allocated later in the startup sequence. When
+ the memory is allocated, it is initialized to zero. For any more
+ complex initialization, set the <function>init_fn()</function> callback,
+ which will be called after the memory has been allocated and initialized
+ to zero, but before any other processes are running, and thus no locking
+ is required.
</para>
-
<para>
- An example of a <literal>shmem_request_hook</literal> and
- <literal>shmem_startup_hook</literal> can be found in
+ On Windows, the <function>attach_fn</function> callback, if any, is
+ additionally called at every backend startup. It can be used to
+ initialize additional per-backend state related to the shared memory
+ area that is inherited via <function>fork()</function> on other systems.
+ </para>
+ <para>
+ An example of allocating shared memory can be found in
<filename>contrib/pg_stat_statements/pg_stat_statements.c</filename> in
the <productname>PostgreSQL</productname> source tree.
</para>
</sect3>
<sect3 id="xfunc-shared-addin-after-startup">
- <title>Requesting Shared Memory After Startup</title>
+ <title>Requesting Shared Memory After Startup with <function>ShmemRequestStruct</function></title>
+
+ <para>
+ <function>ShmemRequestStruct()</function> can also be called after
+ system startup, which is useful to allow small allocations in add-in
+ libraries that are not specified in
+ <xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
+ However, after startup the allocation can fail if there is not enough
+ shared memory available. The system reserves some memory for allocations
+ after startup, but that reservation is small.
+ </para>
+ <para>
+ By default, <function>RegisterShmemCallbacks()</function> fails with an
+ error if called after system startup. To use it after startup, you must
+ set the <literal>SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP</literal> flag in
+ the argument <structname>ShmemCallbacks</structname> struct to
+ acknowledge the risk.
+ </para>
+ <para>
+ When <function>RegisterShmemCallbacks()</function> is called after
+ startup, it will immediately call the appropriate callbacks, depending
+ on whether the requested memory areas were already initialized by
+ another backend. The callbacks will be called while holding an internal
+ lock, which prevents two backends from initializing the
+ memory area concurrently.
+ </para>
+ </sect3>
+
+ <sect3 id="xfunc-shared-addin-dynamic">
+ <title>Allocating Dynamic Shared Memory After Startup</title>
<para>
There is another, more flexible method of reserving shared memory that
- can be done after server startup and outside a
- <literal>shmem_request_hook</literal>. To do so, each backend that will
+ can be done after server startup. To do so, each backend that will
use the shared memory should obtain a pointer to it by calling:
<programlisting>
void *GetNamedDSMSegment(const char *name, size_t size,
@@ -3711,10 +3772,7 @@ void *GetNamedDSMSegment(const char *name, size_t size,
</para>
<para>
- Unlike shared memory reserved at server startup, there is no need to
- acquire <function>AddinShmemInitLock</function> or otherwise take action
- to avoid race conditions when reserving shared memory with
- <function>GetNamedDSMSegment</function>. This function ensures that only
+ <function>GetNamedDSMSegment</function> ensures that only
one backend allocates and initializes the segment and that all other
backends receive a pointer to the fully allocated and initialized
segment.
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index c52c0a6023d..26d3717c2cb 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -373,6 +373,7 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
InitializeFastPathLocks();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 434e0643022..75423104be8 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -49,6 +49,7 @@
#include "replication/walreceiver.h"
#include "storage/dsm.h"
#include "storage/io_worker.h"
+#include "storage/ipc.h"
#include "storage/pg_shmem.h"
#include "tcop/backend_startup.h"
#include "utils/memutils.h"
@@ -672,7 +673,10 @@ SubPostmasterMain(int argc, char *argv[])
/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ ShmemCallRequestCallbacks();
+ }
/*
* Run the appropriate Main function
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index eb4f3eb72d4..01b064d62ea 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -951,7 +951,14 @@ PostmasterMain(int argc, char *argv[])
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Ask all subsystems, including preloaded libraries, to register their
+ * shared memory needs.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might've been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
@@ -3232,7 +3239,14 @@ PostmasterStateMachine(void)
/* re-read control file into local memory */
LocalProcessControlFile(true);
- /* re-create shared memory and semaphores */
+ /*
+ * Re-initialize shared memory and semaphores. Note: We don't call
+ * RegisterBuiltinShmemCallbacks(), we keep the old registrations. In
+ * order to re-register structs in extensions, we'd need to reload
+ * shared preload libraries, and we don't want to do that.
+ */
+ ResetShmemAllocator();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
UpdatePMState(PM_STARTUP);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 7aab5da3386..5333e528e1f 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -100,8 +100,9 @@ CalculateShmemSize(void)
* during the actual allocation phase.
*/
size = 100000;
- size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
- sizeof(ShmemIndexEnt)));
+ size = add_size(size, ShmemGetRequestedSize());
+
+ /* legacy subsystems */
size = add_size(size, dsm_estimate_size());
size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
@@ -176,6 +177,13 @@ AttachSharedMemoryStructs(void)
*/
InitializeFastPathLocks();
+ /*
+ * Attach to LWLocks first. They are needed by most other subsystems.
+ */
+ LWLockShmemInit();
+
+ /* Establish pointers to all shared memory areas in this backend */
+ ShmemAttachRequested();
CreateOrAttachShmemStructs();
/*
@@ -220,7 +228,17 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
- /* Initialize subsystems */
+ /*
+ * Initialize LWLocks first, in case any of the shmem init functions use
+ * LWLocks. (Nothing else can be running during startup, so they don't
+ * need to do any locking yet, but we nevertheless allow it.)
+ */
+ LWLockShmemInit();
+
+ /* Initialize all shmem areas */
+ ShmemInitRequested();
+
+ /* Initialize legacy subsystems */
CreateOrAttachShmemStructs();
/* Initialize dynamic shared memory facilities. */
@@ -251,11 +269,6 @@ CreateSharedMemoryAndSemaphores(void)
static void
CreateOrAttachShmemStructs(void)
{
- /*
- * Set up LWLocks. They are needed by most other subsystems.
- */
- LWLockShmemInit();
-
dsm_shmem_init();
DSMRegistryShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index c994f7674ec..606b545e8fe 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,43 +19,115 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * NOTES:
- * (a) There are three kinds of shared memory data structures
- * available to POSTGRES: fixed-size structures, queues and hash
- * tables. Fixed-size structures contain things like global variables
- * for a module and should never be allocated after the shared memory
- * initialization phase. Hash tables have a fixed maximum size and
- * cannot grow beyond that. Queues link data structures
- * that have been allocated either within fixed-size structures or as hash
- * buckets. Each shared data structure has a string name to identify
- * it (assigned in the module that declares it).
- *
- * (b) During initialization, each module looks for its
- * shared data structures in a hash table called the "Shmem Index".
- * If the data structure is not present, the caller can allocate
- * a new one and initialize it. If the data structure is present,
- * the caller "attaches" to the structure by initializing a pointer
- * in the local address space.
- * The shmem index has two purposes: first, it gives us
- * a simple model of how the world looks when a backend process
- * initializes. If something is present in the shmem index,
- * it is initialized. If it is not, it is uninitialized. Second,
- * the shmem index allows us to allocate shared memory on demand
- * instead of trying to preallocate structures and hard-wire the
- * sizes and locations in header files. If you are using a lot
- * of shared memory in a lot of different places (and changing
- * things during development), this is important.
- *
- * (c) In standard Unix-ish environments, individual backends do not
- * need to re-establish their local pointers into shared memory, because
- * they inherit correct values of those variables via fork() from the
- * postmaster. However, this does not work in the EXEC_BACKEND case.
- * In ports using EXEC_BACKEND, new backends have to set up their local
- * pointers using the method described in (b) above.
- *
- * (d) memory allocation model: shared memory can never be
- * freed, once allocated. Each hash table has its own free list,
- * so hash buckets can be reused when an item is deleted.
+ * This module provides facilities to allocate fixed-size structures in shared
+ * memory, for things like variables shared between all backend processes.
+ * Each such structure has a string name to identify it, specified when it is
+ * requested. shmem_hash.c provides a shared hash table implementation on top
+ * of that.
+ *
+ * Shared memory areas should usually not be allocated after postmaster
+ * startup, although we do allow small allocations later for the benefit of
+ * extension modules that are loaded after startup. Despite that allowance,
+ * extensions that need shared memory should be added in
+ * shared_preload_libraries, because the allowance is quite small and there is
+ * no guarantee that any memory is available after startup.
+ *
+ * Nowadays, there is also a third way to allocate shared memory called
+ * Dynamic Shared Memory. See dsm.c for that facility. One big difference
+ * between traditional shared memory handled by shmem.c and dynamic shared
+ * memory is that traditional shared memory areas are mapped to the same
+ * address in all processes, so you can use normal pointers in shared memory
+ * structs. With Dynamic Shared Memory, you must use offsets or DSA pointers
+ * instead.
+ *
+ * Shared memory managed by shmem.c can never be freed, once allocated. Each
+ * hash table has its own free list, so hash buckets can be reused when an
+ * item is deleted. However, if one hash table grows very large and then
+ * shrinks, its space cannot be redistributed to other tables. We could build
+ * a simple hash bucket garbage collector if need be. Right now, it seems
+ * unnecessary.
+ *
+ * Usage
+ * -----
+ *
+ * To allocate shared memory, you need to register a set of callback functions
+ * which handle the lifecycle of the allocation. In the request_fn callback,
+ * fill in a ShmemRequestStructOpts struct with the name, size, and any other
+ * options, and call ShmemRequestStruct(). Leave any unused fields as zeros.
+ *
+ * typedef struct MyShmemData {
+ * ...
+ * } MyShmemData;
+ *
+ * static MyShmemData *MyShmem;
+ *
+ * static void my_shmem_request(void *arg);
+ * static void my_shmem_init(void *arg);
+ *
+ * const ShmemCallbacks MyShmemCallbacks = {
+ * .request_fn = my_shmem_request,
+ * .init_fn = my_shmem_init,
+ * };
+ *
+ * static void
+ * my_shmem_request(void *arg)
+ * {
+ * static ShmemStructDesc MyShmemDesc;
+ *
+ * ShmemRequestStruct(&MyShmemDesc, &(ShmemRequestStructOpts) {
+ * .name = "My shmem area",
+ * .size = sizeof(MyShmemData),
+ * .ptr = (void **) &MyShmem,
+ * });
+ * }
+ *
+ * In builtin PostgreSQL code, add the callbacks to the list in
+ * src/include/storage/subsystemlist.h. In an add-in module, you can register
+ * the callbacks by calling RegisterShmemCallbacks(&MyShmemCallbacks) in the
+ * extension's _PG_init() function.
+ *
+ * Lifecycle
+ * ---------
+ *
+ * Initializing shared memory happens in multiple phases. In the first phase,
+ * during postmaster startup, all the request_fn callbacks are called. Only
+ * after all the request_fn callbacks have been called and all the shmem areas
+ * have been requested by the ShmemRequestStruct() calls do we know how much
+ * shared memory we need in total. After that, the postmaster allocates the
+ * global shared memory segment and calls all the init_fn callbacks to
+ * initialize all the requested shmem areas.
+ *
+ * In standard Unix-ish environments, individual backends do not need to
+ * re-establish their local pointers into shared memory, because they inherit
+ * correct values of those variables via fork() from the postmaster. However,
+ * this does not work in the EXEC_BACKEND case. In ports using EXEC_BACKEND,
+ * backend startup also calls the shmem_request callbacks to re-establish the
+ * knowledge about each shared memory area, sets the pointer variables
+ * (*ShmemStructDesc->ptr), and calls the attach_fn callback, if any, for
+ * additional per-backend setup.
+ *
+ * Legacy ShmemInitStruct()/ShmemInitHash() functions
+ * --------------------------------------------------
+ *
+ * ShmemInitStruct()/ShmemInitHash() is another way of registering shmem
+ * areas. It pre-dates the ShmemRequestStruct()/ShmemRequestHash() functions,
+ * and should not be used in new code, but as of this writing it is still
+ * widely used in extensions.
+ *
+ * To allocate a shmem area with ShmemInitStruct(), you need to separately
+ * register the size needed for the area by calling RequestAddinShmemSpace()
+ * from the extension's shmem_request_hook, and allocate the area by calling
+ * ShmemInitStruct() from the extension's shmem_startup_hook. There are no
+ * init/attach callbacks. Instead, the caller of ShmemInitStruct() must check
+ * the return status of ShmemInitStruct() and initialize the struct if it was
+ * not previously initialized.
+ *
+ * Calling ShmemAlloc() directly
+ * -----------------------------
+ *
+ * There's a lower-level way of allocating shared memory too: you can call
+ * ShmemAlloc() directly. It's used to implement the higher level mechanisms,
+ * and should generally not be called directly.
*/
#include "postgres.h"
@@ -74,6 +146,76 @@
#include "utils/builtins.h"
#include "utils/tuplestore.h"
+/*
+ * Registered callbacks.
+ *
+ * During postmaster startup, we accumulate the callbacks from all subsystems
+ * in this list.
+ *
+ * This is in process private memory, although on Unix-like systems, we expect
+ * all the registrations to happen at postmaster startup time and be inherited
+ * by all the child processes via fork().
+ */
+static List *registered_shmem_callbacks;
+
+/*
+ * In the shmem request phase, all the shmem areas requested with the
+ * ShmemRequest*() functions are accumulated here.
+ */
+typedef struct
+{
+ ShmemStructDesc *desc;
+ ShmemStructOpts *options;
+ ShmemAreaKind kind;
+} ShmemRequest;
+
+static List *pending_shmem_requests;
+
+/*
+ * Per-process state machine, for sanity checking that we do things in the
+ * right order.
+ *
+ * Postmaster:
+ * INITIAL -> REQUESTING -> INITIALIZING -> DONE
+ *
+ * Backends in EXEC_BACKEND mode:
+ * INITIAL -> REQUESTING -> ATTACHING -> DONE
+ *
+ * Late request:
+ * DONE -> REQUESTING -> AFTER_STARTUP_ATTACH_OR_INIT -> DONE
+ */
+enum shmem_request_state
+{
+ /* Initial state */
+ SRS_INITIAL,
+
+ /*
+ * When we start calling the shmem_request callbacks, we enter the
+ * SRS_REQUESTING phase. All ShmemRequestStruct calls happen in this
+ * state.
+ */
+ SRS_REQUESTING,
+
+ /*
+ * Postmaster has finished all shmem requests, and is now initializing the
+ * shared memory segment. init_fn callbacks are called in this state.
+ */
+ SRS_INITIALIZING,
+
+ /*
+ * A postmaster child process is starting up. attach_fn callbacks are
+ * called in this state.
+ */
+ SRS_ATTACHING,
+
+ /* An after-startup allocation or attachment is in progress. */
+ SRS_AFTER_STARTUP_ATTACH_OR_INIT,
+
+ /* Normal state after shmem initialization / attachment */
+ SRS_DONE,
+};
+static enum shmem_request_state shmem_request_state = SRS_INITIAL;
+
/*
* This is the first data structure stored in the shared memory segment, at
* the offset that PGShmemHeader->content_offset points to. Allocations by
@@ -105,35 +247,393 @@ static void *ShmemBase; /* start address of shared memory */
static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
-static HTAB *ShmemIndex = NULL; /* primary index hashtable for shmem */
+
+/*
+ * ShmemIndex is a global directory of shmem areas, itself also stored in the
+ * shared memory.
+ */
+static HTAB *ShmemIndex;
+
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE (48)
+
+/*
+ * # of additional entries to reserve in the shmem index table, for
+ * allocations after postmaster startup. (This is not a hard limit, the hash
+ * table can grow larger than that if there is shared memory available)
+ */
+#define SHMEM_INDEX_ADDITIONAL_SIZE (64)
+
+/* this is a hash bucket in the shmem index table */
+typedef struct
+{
+ char key[SHMEM_INDEX_KEYSIZE]; /* string name */
+ void *location; /* location in shared mem */
+ Size size; /* # bytes requested for the structure */
+ Size allocated_size; /* # bytes actually allocated */
+} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
+static bool AttachOrInitShmemIndexEntry(ShmemRequest *request,
+ bool may_init, bool may_attach);
+
Datum pg_numa_available(PG_FUNCTION_ARGS);
/*
- * A very simple allocator used to carve out different parts of a hash table
- * from a previously allocated contiguous shared memory area.
+ * ShmemRequestStruct() --- request a named shared memory area
+ *
+ * Subsystems call this to register their shared memory needs. This is
+ * usually done early in postmaster startup, before the shared memory segment
+ * has been created, so that the size can be included in the estimate for
+ * total amount of shared memory needed. We set aside a small amount of
+ * memory for allocations that happen later, for the benefit of non-preloaded
+ * extensions, but that should not be relied upon.
+ *
+ * This does not yet allocate the memory, but merely registers the need for
+ * it. The actual allocation happens later in the postmaster startup sequence.
+ *
+ * This must be called from a shmem_request callback function, registered with
+ * RegisterShmemCallbacks(). This enforces a coding pattern that works the
+ * same in normal Unix systems and with EXEC_BACKEND. On Unix systems, the
+ * shmem_request callbacks are called once, early in postmaster startup, and
+ * the child processes inherit the struct descriptors and any other
+ * per-process state from the postmaster. In EXEC_BACKEND mode, shmem_request
+ * callbacks are *also* called in each backend, at backend startup, to
+ * re-establish the struct descriptors. By calling the same function in both
+ * cases, we ensure that all the shmem areas are registered the same way in
+ * all processes.
+ *
+ * 'desc' is a backend-private handle for the shared memory area.
+ *
+ * 'options' defines the name and size of the area, and any other optional
+ * features. Leave unused options as zeros. The options are copied to
+ * longer-lived memory, so the struct doesn't need to live after the
+ * ShmemRequestStruct() call and can be a local variable in the calling
+ * function. The 'name' must point to a long-lived string, though, because
+ * only the pointer to it is copied.
+ */
+void
+ShmemRequestStructWithOpts(ShmemStructDesc *desc, const ShmemStructOpts *options)
+{
+ ShmemStructOpts *options_copy;
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemStructOpts));
+ memcpy(options_copy, options, sizeof(ShmemStructOpts));
+
+ ShmemRequestInternal(desc, options_copy, SHMEM_KIND_STRUCT);
+}
+
+/*
+ * Internal workhorse of ShmemRequestStruct() and ShmemRequestHash().
+ *
+ * Note: 'desc' and 'options' must live until the init/attach callbacks have
+ * been called. Unlike in the public ShmemRequestStruct() and
+ * ShmemRequestHash() functions, 'options' is *not* copied. This allows
+ * ShmemRequestHash() to pass a pointer to the extended ShmemHashOpts
+ * struct instead.
+ */
+void
+ShmemRequestInternal(ShmemStructDesc *desc, ShmemStructOpts *options,
+ ShmemAreaKind kind)
+{
+ ShmemRequest *request;
+
+ if (options->name == NULL)
+ elog(ERROR, "shared memory request is missing 'name' option");
+
+ if (IsUnderPostmaster)
+ {
+ if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+ else
+ {
+ if (options->size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->size <= 0)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+
+ if (shmem_request_state != SRS_REQUESTING)
+ elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
+
+ /* Check that it's not already registered in this process */
+ foreach_ptr(ShmemRequest, existing, pending_shmem_requests)
+ {
+ if (strcmp(existing->options->name, options->name) == 0)
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" is already registered",
+ options->name)));
+ }
+
+ request = palloc(sizeof(ShmemRequest));
+ request->options = options;
+ request->desc = desc;
+ request->kind = kind;
+ pending_shmem_requests = lappend(pending_shmem_requests, request);
+}
+
+/*
+ * ShmemGetRequestedSize() --- estimate the total size of all registered shared
+ * memory structures.
+ *
+ * This is called once at postmaster startup, before the shared memory segment
+ * has been created.
+ */
+size_t
+ShmemGetRequestedSize(void)
+{
+ size_t size;
+
+ /* memory needed for the ShmemIndex */
+ size = hash_estimate_size(list_length(pending_shmem_requests) + SHMEM_INDEX_ADDITIONAL_SIZE,
+ sizeof(ShmemIndexEnt));
+
+ /* memory needed for all the requested areas */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ size = add_size(size, request->options->size);
+ }
+
+ return size;
+}
+
+/*
+ * ShmemInitRequested() --- allocate and initialize requested shared memory
+ * structures.
+ *
+ * This is called once at postmaster startup, after the shared memory segment
+ * has been created.
+ */
+void
+ShmemInitRequested(void)
+{
+ /* Should be called only by the postmaster or a standalone backend. */
+ Assert(!IsUnderPostmaster);
+ Assert(shmem_request_state == SRS_INITIALIZING);
+
+ /*
+ * Initialize the ShmemIndex entries and perform basic initialization of
+ * all the requested memory areas. There are no concurrent processes yet,
+ * so no need for locking.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ AttachOrInitShmemIndexEntry(request, true, false);
+ }
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
+
+ /*
+ * Call the subsystem-specific init callbacks to finish initialization of
+ * all the areas.
+ */
+ foreach_ptr(const ShmemCallbacks, callbacks, registered_shmem_callbacks)
+ {
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->init_fn_arg);
+ }
+
+ shmem_request_state = SRS_DONE;
+}
+
+/*
+ * Re-establish process private state related to shmem areas.
+ *
+ * This is called at backend startup in EXEC_BACKEND mode, in every backend.
+ */
+#ifdef EXEC_BACKEND
+void
+ShmemAttachRequested(void)
+{
+ ListCell *lc;
+
+ /* Must be initializing a (non-standalone) backend */
+ Assert(IsUnderPostmaster);
+ Assert(ShmemAllocator->index != NULL);
+ Assert(shmem_request_state == SRS_REQUESTING);
+ shmem_request_state = SRS_ATTACHING;
+
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+
+ /*
+ * Attach to all the requested memory areas.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ AttachOrInitShmemIndexEntry(request, false, true);
+ }
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
+
+ /* Call attach callbacks */
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
+
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->attach_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+
+ shmem_request_state = SRS_DONE;
+}
+#endif
+
+/*
+ * Workhorse to insert or look up a named shmem area in the shared memory
+ * index, and initialize or attach to it.
+ *
+ * Note that this only does the basic initialization depending on the
+ * ShmemAreaKind, like setting the global pointer variable to the area for
+ * SHMEM_KIND_STRUCT or setting up the backend-private HTAB control struct.
+ * This does *not* call the callbacks specific to the subsystem that requested
+ * it. That's done later, after all the shmem areas have been initialized or
+ * attached to.
+ *
+ * may_init == true && may_attach == false is used at postmaster startup to
+ * allocate all the areas. An error is thrown if the area already exists.
+ *
+ * may_init == false && may_attach == true is used at backend startup in
+ * EXEC_BACKEND mode to attach to all the areas. The area is expected to
+ * already be initialized; an error is thrown if not.
+ *
+ * may_init == true && may_attach == true is used when a shared memory area is
+ * requested after startup, with SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP.
+ *
+ * (may_init == false && may_attach == false is not used, it would always
+ * raise an error)
*/
-typedef struct shmem_hash_allocator
+static bool
+AttachOrInitShmemIndexEntry(ShmemRequest *request,
+ bool may_init, bool may_attach)
{
- char *next; /* start of free space in the area */
- char *end; /* end of the shmem area */
-} shmem_hash_allocator;
+ /*
+ * If called after postmaster startup, we need to immediately also
+ * initialize or attach to the area.
+ */
+ ShmemStructDesc *desc = request->desc;
+ ShmemIndexEnt *index_entry;
+ bool found;
+
+ /* If both are false, we'll fail no matter what */
+ Assert(may_init || may_attach);
+
+ desc->name = request->options->name;
+ desc->ptr = NULL;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, request->options->name,
+ may_init ? HASH_ENTER_NULL : HASH_FIND, &found);
+ if (found)
+ {
+ /* Already present, just attach to it */
+ if (!may_attach)
+ elog(ERROR, "shared memory struct \"%s\" is already initialized", desc->name);
+
+ if (index_entry->size != request->options->size &&
+ request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ elog(ERROR, "shared memory struct \"%s\" is already registered with different size",
+ desc->name);
+ }
+ desc->ptr = index_entry->location;
+ desc->size = index_entry->size;
+
+ /* Attach depending on the kind of shmem area it is */
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_attach(desc, request->options);
+ break;
+ }
+ }
+ else if (!may_init)
+ {
+ /* attach was requested, but it was not found */
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not find ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+ else if (!index_entry)
+ {
+ /* tried to add it to the hash table, but there was no space */
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ desc->name)));
+ }
+ else
+ {
+ /*
+ * We inserted the entry into the shared memory index. Allocate the
+ * requested amount of shared memory for it, and do basic
+ * initialization.
+ */
+ size_t allocated_size;
+ void *structPtr;
+
+ structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, desc->name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ desc->name, (size_t) request->options->size)));
+ }
+ index_entry->size = request->options->size;
+ index_entry->allocated_size = allocated_size;
+ index_entry->location = structPtr;
+
+ desc->ptr = index_entry->location;
+ desc->size = index_entry->size;
+
+ /*
+ * Set the caller's pointer variable, or do other actions to
+ * initialize depending on the kind of shmem area it is.
+ */
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_init(desc, request->options);
+ break;
+ }
+ }
+
+ return found;
+}
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
* Called at postmaster or stand-alone backend startup, to initialize the
* allocator's data structure in the shared memory segment. In EXEC_BACKEND,
- * this is also called at backend startup, to set up pointers to the shared
- * memory areas.
+ * this is also called at backend startup, to set up pointers to the
+ * already-initialized data structure.
*/
void
InitShmemAllocator(PGShmemHeader *seghdr)
{
Size offset;
+ int64 hash_nelems;
HASHCTL info;
int hash_flags;
@@ -142,6 +642,16 @@ InitShmemAllocator(PGShmemHeader *seghdr)
#endif
Assert(seghdr != NULL);
+ if (IsUnderPostmaster)
+ {
+ Assert(shmem_request_state == SRS_INITIAL);
+ }
+ else
+ {
+ Assert(shmem_request_state == SRS_REQUESTING);
+ shmem_request_state = SRS_INITIALIZING;
+ }
+
/*
* We assume the pointer and offset are MAXALIGN. Not a hard requirement,
* but it's true today and keeps the math below simpler.
@@ -186,19 +696,21 @@ InitShmemAllocator(PGShmemHeader *seghdr)
* use ShmemInitHash() here because it relies on ShmemIndex being already
* initialized.
*/
+ hash_nelems = list_length(pending_shmem_requests) + SHMEM_INDEX_ADDITIONAL_SIZE;
+
info.keysize = SHMEM_INDEX_KEYSIZE;
info.entrysize = sizeof(ShmemIndexEnt);
hash_flags = HASH_ELEM | HASH_STRINGS | HASH_FIXED_SIZE;
if (!IsUnderPostmaster)
{
- ShmemAllocator->index_size = hash_estimate_size(SHMEM_INDEX_SIZE, info.entrysize);
+ ShmemAllocator->index_size = hash_estimate_size(hash_nelems, info.entrysize);
ShmemAllocator->index = (HASHHDR *) ShmemAlloc(ShmemAllocator->index_size);
}
ShmemIndex = shmem_hash_create(ShmemAllocator->index,
ShmemAllocator->index_size,
IsUnderPostmaster,
- "ShmemIndex", SHMEM_INDEX_SIZE,
+ "ShmemIndex", hash_nelems,
&info, hash_flags);
Assert(ShmemIndex != NULL);
@@ -219,6 +731,23 @@ InitShmemAllocator(PGShmemHeader *seghdr)
}
}
+/*
+ * Reset state on postmaster crash restart.
+ */
+void
+ResetShmemAllocator(void)
+{
+ Assert(!IsUnderPostmaster);
+ shmem_request_state = SRS_INITIAL;
+
+ pending_shmem_requests = NIL;
+
+ /*
+ * Note that we don't clear the registered callbacks. We will need to
+ * call them again as we restart.
+ */
+}
+
/*
* ShmemAlloc -- allocate max-aligned chunk from shared memory
*
@@ -316,92 +845,141 @@ ShmemAddrIsValid(const void *addr)
}
/*
- * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ * Register callbacks that define a shared memory area (or multiple areas).
*
- * This is called during initialization to find or allocate
- * a data structure in shared memory. If no other process
- * has created the structure, this routine allocates space
- * for it. If it exists already, a pointer to the existing
- * structure is returned.
+ * The system will call the callbacks at different stages of postmaster or
+ * backend startup, to allocate and initialize the area.
*
- * Returns: pointer to the object. *foundPtr is set true if the object was
- * already in the shmem index (hence, already initialized).
+ * This is normally called early during postmaster startup, but if the
+ * SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP flag is set, this can also be used after
+ * startup, although after startup there's no guarantee that there's enough
+ * shared memory available. When called after startup, this immediately calls
+ * the right callbacks depending on whether another backend had already
+ * initialized the area.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: In EXEC_BACKEND mode, this needs to be called in every backend
+ * process. That's needed because we cannot pass down the callback function
+ * pointers from the postmaster process, because different processes may have
+ * loaded libraries to different addresses.
*/
-void *
-ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+void
+RegisterShmemCallbacks(const ShmemCallbacks *callbacks)
{
- ShmemIndexEnt *result;
- void *structPtr;
+ if (shmem_request_state == SRS_DONE && IsUnderPostmaster)
+ {
+ /* After-startup initialization */
+ bool found = false;
- Assert(ShmemIndex != NULL);
+ if ((callbacks->flags & SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP) == 0)
+ elog(ERROR, "cannot request shared memory at this time");
- LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ Assert(pending_shmem_requests == NIL);
+ Assert(shmem_request_state == SRS_DONE);
+ shmem_request_state = SRS_REQUESTING;
- /* look it up in the shmem index */
- result = (ShmemIndexEnt *)
- hash_search(ShmemIndex, name, HASH_ENTER_NULL, foundPtr);
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->request_fn_arg);
+ shmem_request_state = SRS_AFTER_STARTUP_ATTACH_OR_INIT;
- if (!result)
- {
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("could not create ShmemIndex entry for data structure \"%s\"",
- name)));
- }
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
- if (*foundPtr)
- {
/*
- * Structure is in the shmem index so someone else has allocated it
- * already. The size better be the same as the size we are trying to
- * initialize to, or there is a name conflict (or worse).
+ * Allocate or attach all the shmem areas requested by the request_fn
+ * callback.
*/
- if (result->size != size)
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
{
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errmsg("ShmemIndex entry size is wrong for data structure"
- " \"%s\": expected %zu, actual %zu",
- name, size, result->size)));
+ found = AttachOrInitShmemIndexEntry(request, true, true);
}
- structPtr = result->location;
- }
- else
- {
- Size allocated_size;
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
- /* It isn't in the table yet. allocate and initialize it */
- structPtr = ShmemAllocRaw(size, &allocated_size);
- if (structPtr == NULL)
+ /*
+ * Finish initialization or attaching to the shmem areas by calling the
+ * appropriate callback.
+ *
+ * FIXME: What to do if multiple shmem areas were requested, and some
+ * of them are already initialized but not all? We expect all shmem
+ * areas requested by a single callback to form a coherent unit.
+ */
+ if (found)
{
- /* out of memory; remove the failed ShmemIndex entry */
- hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
- " \"%s\" (%zu bytes requested)",
- name, size)));
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->attach_fn_arg);
+ }
+ else
+ {
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->init_fn_arg);
}
- result->size = size;
- result->allocated_size = allocated_size;
- result->location = structPtr;
+
+ LWLockRelease(ShmemIndexLock);
+ shmem_request_state = SRS_DONE;
+
+ return;
}
- LWLockRelease(ShmemIndexLock);
+ registered_shmem_callbacks = lappend(registered_shmem_callbacks,
+ (void *) callbacks);
+}
+
+/*
+ * Call all shmem request callbacks.
+ */
+void
+ShmemCallRequestCallbacks(void)
+{
+ ListCell *lc;
- Assert(ShmemAddrIsValid(structPtr));
+ Assert(shmem_request_state == SRS_INITIAL);
+ shmem_request_state = SRS_REQUESTING;
- Assert(structPtr == (void *) CACHELINEALIGN(structPtr));
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
- return structPtr;
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->request_fn_arg);
+ }
}
+/*
+ * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ *
+ * This is called during initialization to find or allocate
+ * a data structure in shared memory. If no other process
+ * has created the structure, this routine allocates space
+ * for it. If it exists already, a pointer to the existing
+ * structure is returned.
+ *
+ * Returns: pointer to the object. *foundPtr is set true if the object was
+ * already in the shmem index (hence, already initialized).
+ *
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestStruct() in new code!
+ */
+void *
+ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+{
+ ShmemStructDesc desc;
+ ShmemStructOpts options = {
+ .name = name,
+ .size = size,
+ };
+ ShmemRequest request = {&desc, &options, SHMEM_KIND_STRUCT};
+
+ Assert(shmem_request_state == SRS_DONE ||
+ shmem_request_state == SRS_INITIALIZING ||
+ shmem_request_state == SRS_REQUESTING);
+
+ /* look it up immediately */
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ *foundPtr = AttachOrInitShmemIndexEntry(&request, true, true);
+ LWLockRelease(ShmemIndexLock);
+
+ Assert(desc.ptr != NULL);
+ return desc.ptr;
+}
/*
* Add two Size values, checking for overflow
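
A note on the may_init / may_attach combinations documented above for AttachOrInitShmemIndexEntry(): the decision table can be modelled outside PostgreSQL. What follows is only a toy sketch under stated assumptions. Every toy_* and TOY_* name is hypothetical, a small static array stands in for the ShmemIndex hash table, and result codes are returned where the patch raises errors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: all toy_* and TOY_* names are hypothetical. */
typedef enum
{
    TOY_CREATED,        /* entry was added, caller should initialize */
    TOY_ATTACHED,       /* entry existed, caller should attach */
    TOY_ERR_EXISTS,     /* found, but attaching was not allowed */
    TOY_ERR_MISSING,    /* not found, but creating was not allowed */
    TOY_ERR_SIZE,       /* found with a different size */
    TOY_ERR_FULL        /* no room left in the index */
} ToyResult;

typedef struct
{
    const char *name;   /* only the pointer is stored */
    size_t      size;
} ToyIndexEntry;

#define TOY_INDEX_CAPACITY 8

static ToyIndexEntry toy_index[TOY_INDEX_CAPACITY];
static int toy_index_used = 0;

/* Look up 'name'; create it or attach to it as the two flags permit. */
static ToyResult
toy_attach_or_init(const char *name, size_t size,
                   bool may_init, bool may_attach)
{
    /* If both are false, the call can only fail. */
    assert(may_init || may_attach);

    for (int i = 0; i < toy_index_used; i++)
    {
        if (strcmp(toy_index[i].name, name) == 0)
        {
            if (!may_attach)
                return TOY_ERR_EXISTS;
            if (toy_index[i].size != size)
                return TOY_ERR_SIZE;
            return TOY_ATTACHED;
        }
    }

    if (!may_init)
        return TOY_ERR_MISSING;
    if (toy_index_used == TOY_INDEX_CAPACITY)
        return TOY_ERR_FULL;

    toy_index[toy_index_used++] = (ToyIndexEntry) {name, size};
    return TOY_CREATED;
}
```

In this model, postmaster startup corresponds to (true, false), EXEC_BACKEND backend startup to (false, true), and an after-startup request to (true, true), where the first caller creates the entry and later callers attach.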
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
index 0b05730129e..4e5b8a66fde 100644
--- a/src/backend/storage/ipc/shmem_hash.c
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -21,9 +21,81 @@
#include "postgres.h"
#include "storage/shmem.h"
+#include "utils/memutils.h"
+
+/*
+ * A very simple allocator used to carve out different parts of a hash table
+ * from a previously allocated contiguous shared memory area.
+ */
+typedef struct shmem_hash_allocator
+{
+ char *next; /* start of free space in the area */
+ char *end; /* end of the shmem area */
+} shmem_hash_allocator;
static void *ShmemHashAlloc(Size size, void *alloc_arg);
+/*
+ * ShmemRequestHash -- Request a shared memory hash table.
+ *
+ * Similar to ShmemRequestStruct(), but requests a hash table instead of an
+ * opaque area.
+ */
+void
+ShmemRequestHashWithOpts(ShmemHashDesc *desc, const ShmemHashOpts *options)
+{
+ ShmemHashOpts *options_copy;
+
+ Assert(options->name != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemHashOpts));
+ memcpy(options_copy, options, sizeof(ShmemHashOpts));
+
+ /* Set options for the fixed-size area holding the hash table */
+ options_copy->base.name = options->name;
+ options_copy->base.size = hash_estimate_size(options_copy->nelems,
+ options_copy->hash_info.entrysize);
+
+ ShmemRequestInternal(&desc->base, &options_copy->base, SHMEM_KIND_HASH);
+}
+
+void
+shmem_hash_init(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) base_desc;
+ ShmemHashOpts *options = (ShmemHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+
+ options->hash_info.hctl = desc->base.ptr;
+ Assert(options->hash_info.hctl != NULL);
+ desc->ptr = shmem_hash_create(desc->base.ptr, base_desc->size, false,
+ desc->base.name,
+ options->nelems, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = desc->ptr;
+}
+
+void
+shmem_hash_attach(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) base_desc;
+ ShmemHashOpts *options = (ShmemHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+
+ /* attach to it rather than allocate and initialize new space */
+ hash_flags |= HASH_ATTACH;
+ options->hash_info.hctl = desc->base.ptr;
+ Assert(options->hash_info.hctl != NULL);
+ desc->ptr = shmem_hash_create(desc->base.ptr, base_desc->size, true,
+ desc->base.name,
+ options->nelems, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = desc->ptr;
+}
+
/*
* ShmemInitHash -- Create and initialize, or attach to, a
* shared memory hash table.
@@ -40,9 +112,8 @@ static void *ShmemHashAlloc(Size size, void *alloc_arg);
* to shared-memory hash tables are added here, except that callers may
* choose to specify HASH_PARTITION.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestHash() in new code!
*/
HTAB *
ShmemInitHash(const char *name, /* table string name for shmem index */
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 5c47cf13473..9b880a6af65 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -121,6 +121,9 @@ FastPathLockShmemSize(void)
size = add_size(size, mul_size(TotalProcs, (fpLockBitsSize + fpRelIdSize)));
+ Assert(TotalProcs > 0);
+ Assert(size > 0);
+
return size;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 10be60011ad..af7cc86d80a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4155,7 +4155,14 @@ PostgresSingleUserMain(int argc, char *argv[],
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Before computing the total size needed, give all subsystems, including
+ * add-ins, a chance to adjust their requested shmem sizes.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might have been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
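
The ordering this hunk depends on (run every request callback first, only then compute the total and allocate) can be illustrated with a toy model. This is a sketch under stated assumptions, not the patch's API: all toy_* names are hypothetical, calloc() stands in for creating the shared memory segment, and errors are reported with return codes instead of elog().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: all toy_* names are hypothetical, not PostgreSQL APIs. */
enum toy_state { TOY_INITIAL, TOY_REQUESTING, TOY_INITIALIZING, TOY_DONE };

typedef struct ToyRequest
{
    const char *name;   /* only the pointer is kept */
    size_t      size;   /* bytes requested */
    void      **ptr;    /* set to the area's address at init time */
} ToyRequest;

#define TOY_MAX_REQUESTS 16

static enum toy_state toy_state = TOY_INITIAL;
static ToyRequest toy_requests[TOY_MAX_REQUESTS];
static int toy_nrequests = 0;
static char *toy_segment = NULL;    /* stands in for the shmem segment */

/* Round up so that each carved-out area is suitably aligned. */
static size_t
toy_align(size_t size)
{
    size_t a = _Alignof(max_align_t);

    return (size + a - 1) / a * a;
}

static void
toy_begin_requests(void)
{
    assert(toy_state == TOY_INITIAL);
    toy_state = TOY_REQUESTING;
}

/* Request phase: record the need only. Returns 0 on success, -1 on error. */
static int
toy_request(const char *name, size_t size, void **ptr)
{
    if (toy_state != TOY_REQUESTING || size == 0 ||
        toy_nrequests == TOY_MAX_REQUESTS)
        return -1;
    for (int i = 0; i < toy_nrequests; i++)
    {
        if (strcmp(toy_requests[i].name, name) == 0)
            return -1;  /* name already requested */
    }
    toy_requests[toy_nrequests++] = (ToyRequest) {name, size, ptr};
    return 0;
}

/* The total is known only once every request has been made. */
static size_t
toy_total_size(void)
{
    size_t total = 0;

    for (int i = 0; i < toy_nrequests; i++)
        total += toy_align(toy_requests[i].size);
    return total;
}

/* Init phase: allocate once, carve out the areas, set the pointers. */
static int
toy_init_requested(void)
{
    size_t offset = 0;

    if (toy_state != TOY_REQUESTING || toy_nrequests == 0)
        return -1;
    toy_segment = calloc(1, toy_total_size());
    if (toy_segment == NULL)
        return -1;
    toy_state = TOY_INITIALIZING;
    for (int i = 0; i < toy_nrequests; i++)
    {
        if (toy_requests[i].ptr)
            *toy_requests[i].ptr = toy_segment + offset;
        offset += toy_align(toy_requests[i].size);
    }
    toy_state = TOY_DONE;
    return 0;
}
```

A caller would invoke toy_begin_requests(), make its toy_request() calls, and then call toy_init_requested() once. Requests made before the first step or after the last are rejected, which is the same kind of ordering check the shmem_request_state machine performs.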
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index bbf32523c0b..f57672fbc58 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -24,43 +24,218 @@
#include "storage/spin.h"
#include "utils/hsearch.h"
+/* Different kinds of shmem areas. */
+typedef enum
+{
+ SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
+ SHMEM_KIND_HASH, /* a hash table */
+} ShmemAreaKind;
+
+/*
+ * ShmemStructDesc is a backend-private handle for a shared memory area
+ * requested with ShmemRequestStruct().
+ */
+typedef struct ShmemStructDesc
+{
+ /* Name and size of the shared memory area. */
+ const char *name;
+
+ void *ptr;
+ size_t size;
+} ShmemStructDesc;
+
+#define SHMEM_ATTACH_UNKNOWN_SIZE (-1)
+
+/*
+ * Options for ShmemRequestStruct()
+ *
+ * 'name' and 'size' are required. Initialize any optional fields that you
+ * don't use to zeros.
+ *
+ * After registration, the shmem machinery reserves memory for the area, sets
+ * '*ptr' to point to the allocation, and calls the callbacks at the right
+ * moments.
+ */
+typedef struct ShmemStructOpts
+{
+ const char *name;
+
+ ssize_t size;
+
+ /*
+ * When the shmem area is initialized or attached to, a pointer to it is
+ * stored in *ptr. It usually points to a global variable, used to access
+ * the shared memory area later. *ptr is set before the init_fn or
+ * attach_fn callback is called.
+ */
+ void **ptr;
+} ShmemStructOpts;
+
+/*
+ * Backend-private handle for a named shared memory hash table, similar to
+ * ShmemStructDesc.
+ */
+typedef struct ShmemHashDesc
+{
+ /*
+ * Descriptor of the underlying fixed-size allocated area where the hash
+ * table lives.
+ */
+ ShmemStructDesc base;
+
+ HTAB *ptr;
+} ShmemHashDesc;
+
+/*
+ * Options for ShmemRequestHash()
+ *
+ * Each hash table is backed by an allocated area, but if 'max_size' is
+ * greater than 'init_size', it can also grow beyond the initial allocated
+ * area by allocating more hash entries from the global unreserved space.
+ */
+typedef struct ShmemHashOpts
+{
+ ShmemStructOpts base;
+
+ /*
+ * Name of the shared memory area. Required. Must be unique across the
+ * system.
+ */
+ const char *name;
+
+ /*
+ * 'nelems' is the max number of elements for the hash table.
+ */
+ int64 nelems;
+
+ /*
+ * Hash table options passed to hash_create()
+ *
+ * hash_info and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values
+ * specific to shared-memory hash tables are added implicitly in
+ * ShmemRequestHash(), except that callers may choose to specify
+ * HASH_PARTITION and/or HASH_FIXED_SIZE.
+ */
+ HASHCTL hash_info;
+ int hash_flags;
+
+ /*
+ * When the hash table is initialized or attached to, a pointer to its
+ * backend-private handle is stored in *ptr. It usually points to a
+ * global variable, used to access the hash table later.
+ */
+ HTAB **ptr;
+} ShmemHashOpts;
+
+typedef void (*ShmemRequestCallback) (void *arg);
+typedef void (*ShmemInitCallback) (void *arg);
+typedef void (*ShmemAttachCallback) (void *arg);
+
+/*
+ * Shared memory is reserved and allocated in stages at postmaster startup,
+ * and in EXEC_BACKEND mode, there's some extra work done to "attach" to them
+ * at backend startup. ShmemCallbacks holds callback functions that are
+ * called at different stages.
+ */
+typedef struct ShmemCallbacks
+{
+ /* SHMEM_CALLBACKS_* flags */
+ int flags;
+
+ /*
+ * 'request_fn' is called during postmaster startup, before the shared
+ * memory has been allocated. The function should call
+ * ShmemRequestStruct() and ShmemRequestHash() to register the subsystem's
+ * shared memory needs.
+ */
+ ShmemRequestCallback request_fn;
+ void *request_fn_arg;
+
+ /*
+ * Initialization callback function. This is called when the shared
+ * memory area is allocated, usually at postmaster startup.
+ */
+ ShmemInitCallback init_fn;
+ void *init_fn_arg;
+
+ /*
+ * Attachment callback function. In EXEC_BACKEND mode, this is called at
+ * startup of each backend. In !EXEC_BACKEND mode, this is only called if
+ * the shared memory area is registered after postmaster startup (see
+ * SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP).
+ */
+ ShmemAttachCallback attach_fn;
+ void *attach_fn_arg;
+} ShmemCallbacks;
+
+/*
+ * Flags to control the behavior of RegisterShmemCallbacks().
+ *
+ * ALLOW_AFTER_STARTUP: Allow these shared memory usages to be registered
+ * after postmaster startup. Normally, registering a shared memory system
+ * after postmaster startup is not allowed e.g. in an add-in library loaded
+ * on-demand in a backend. If a subsystem sets this flag, the callbacks are
+ * called immediately after registration, to initialize or attach to the
+ * requested shared memory areas. This is not used by any built-in
+ * subsystems, but extensions may find it useful.
+ */
+#define SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP 0x00000001
/* shmem.c */
typedef struct PGShmemHeader PGShmemHeader; /* avoid including
* storage/pg_shmem.h here */
+extern void ResetShmemAllocator(void);
extern void InitShmemAllocator(PGShmemHeader *seghdr);
+#ifdef EXEC_BACKEND
+extern void AttachShmemAllocator(PGShmemHeader *seghdr);
+#endif
extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
-extern void *ShmemHashAlloc(Size size, void *alloc_arg);
extern bool ShmemAddrIsValid(const void *addr);
+
+extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
+
+extern void ShmemRequestInternal(ShmemStructDesc *desc, ShmemStructOpts *options,
+ ShmemAreaKind kind);
+
+/*
+ * These macros provide syntactic sugar for calling the underlying functions
+ * with named-argument-like syntax.
+ */
+#define ShmemRequestStruct(desc, ...) \
+ ShmemRequestStructWithOpts(desc, &(ShmemStructOpts){__VA_ARGS__})
+
+#define ShmemRequestHash(desc, ...) \
+ ShmemRequestHashWithOpts(desc, &(ShmemHashOpts){__VA_ARGS__})
+
+extern void ShmemRequestStructWithOpts(ShmemStructDesc *desc, const ShmemStructOpts *options);
+extern void ShmemRequestHashWithOpts(ShmemHashDesc *desc, const ShmemHashOpts *options);
+extern void ShmemCallRequestCallbacks(void);
+
+/* legacy shmem allocation functions */
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
+extern HTAB *ShmemInitHash(const char *name, int64 nelems,
+ HASHCTL *infoP, int hash_flags);
+
+extern size_t ShmemGetRequestedSize(void);
+extern void ShmemInitRequested(void);
+#ifdef EXEC_BACKEND
+extern void ShmemAttachRequested(void);
+#endif
+
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
/* shmem_hash.c */
-extern HTAB *ShmemInitHash(const char *name, int64 nelems,
- HASHCTL *infoP, int hash_flags);
extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
+extern void shmem_hash_init(ShmemStructDesc *base_desc, ShmemStructOpts *options);
+extern void shmem_hash_attach(ShmemStructDesc *base_desc, ShmemStructOpts *options);
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
-/* size constants for the shmem index table */
- /* max size of data structure string name */
-#define SHMEM_INDEX_KEYSIZE (48)
- /* max number of named shmem structures and hash tables */
-#define SHMEM_INDEX_SIZE (256)
-
-/* this is a hash bucket in the shmem index table */
-typedef struct
-{
- char key[SHMEM_INDEX_KEYSIZE]; /* string name */
- void *location; /* location in shared mem */
- Size size; /* # bytes requested for the structure */
- Size allocated_size; /* # bytes actually allocated */
-} ShmemIndexEnt;
-
#endif /* SHMEM_H */
diff --git a/src/test/modules/test_aio/test_aio.c b/src/test/modules/test_aio/test_aio.c
index d7530681192..34487a05486 100644
--- a/src/test/modules/test_aio/test_aio.c
+++ b/src/test/modules/test_aio/test_aio.c
@@ -77,7 +77,6 @@ static InjIoErrorState *inj_io_error_state;
static shmem_request_hook_type prev_shmem_request_hook = NULL;
static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
-
static PgAioHandle *last_handle;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index c72f6c59573..5894893997c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2863,9 +2863,16 @@ SharedTypmodTableEntry
Sharedsort
ShellTypeInfo
ShippableCacheEntry
-ShmemAllocatorData
ShippableCacheKey
+ShmemAllocatorData
+ShmemAreaKind
+ShmemCallbacks
ShmemIndexEnt
+ShmemHashDesc
+ShmemHashOpts
+ShmemRequest
+ShmemStructDesc
+ShmemStructOpts
ShutdownForeignScan_function
ShutdownInformation
ShutdownMode
--
2.47.3
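The named-argument macros in the header above (ShmemRequestStruct() and ShmemRequestHash()) rely on C99 compound literals with designated initializers. Below is a minimal standalone sketch of that idiom. The names DemoOpts, demo_request() and demo_request_with_opts() are hypothetical stand-ins invented for illustration, not part of the patch, and the sketch only records the options instead of reserving any memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct, loosely modelled on ShmemStructOpts. */
typedef struct DemoOpts
{
    const char *name;           /* required */
    size_t      size;           /* required */
    void      **ptr;            /* optional; NULL when not specified */
} DemoOpts;

/* Remembers the most recent request so that it can be inspected. */
static DemoOpts last_request;

/* Plain function taking a pointer to the options struct. */
static void
demo_request_with_opts(const DemoOpts *opts)
{
    last_request = *opts;
}

/*
 * Syntactic sugar: build a temporary DemoOpts from designated initializers
 * (a C99 compound literal) and pass its address.  Fields that the caller
 * does not name are zero-initialized.
 */
#define demo_request(...) \
    demo_request_with_opts(&(DemoOpts){__VA_ARGS__})

/* Hypothetical global that would point to the area once allocated. */
static void *demo_area;

static void
demo_register(void)
{
    demo_request(.name = "demo area",
                 .size = 128,
                 .ptr = &demo_area);
}
```

Because unnamed fields are zero-initialized, a call such as demo_request(.name = "small", .size = 8) leaves ptr as NULL, which matches the header's advice to initialize unused optional fields to zeros.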
[text/x-patch] 0003-Add-test-module-to-test-after-startup-shmem-allocati.patch (10.2K, 4-0003-Add-test-module-to-test-after-startup-shmem-allocati.patch)
download | inline diff:
From ddf6d1b38c01b36b28e98953e680aabf6aefbdbf Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:10:31 +0300
Subject: [PATCH 03/14] Add test module to test after-startup shmem allocations
None of the existing modules could make use of the lazy shmem
allocation after postmaster startup:
- pg_stat_statements needs to load and dump stats file on startup and
shutdown, which doesn't really work if the library is not loaded into
postmaster
- test_aio registers injection points, which reference the library
itself, which creates a weird initialization loop if you try to do
that directly from _PG_init() in a backend. The initialization
really needs to happen after _PG_init()
- injection_points would be a candidate, but it already knows to use
DSM when it's not loaded from shared_preload_libraries.
---
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_shmem/Makefile | 24 ++++
src/test/modules/test_shmem/meson.build | 33 ++++++
.../test_shmem/t/001_late_shmem_alloc.pl | 49 ++++++++
.../modules/test_shmem/test_shmem--1.0.sql | 9 ++
src/test/modules/test_shmem/test_shmem.c | 105 ++++++++++++++++++
.../modules/test_shmem/test_shmem.control | 3 +
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 226 insertions(+)
create mode 100644 src/test/modules/test_shmem/Makefile
create mode 100644 src/test/modules/test_shmem/meson.build
create mode 100644 src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
create mode 100644 src/test/modules/test_shmem/test_shmem--1.0.sql
create mode 100644 src/test/modules/test_shmem/test_shmem.c
create mode 100644 src/test/modules/test_shmem/test_shmem.control
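The test module added by this patch checks that, in each process, exactly one of the init or attach callbacks runs, and that the attach counter grows as more backends attach. The following single-process simulation sketches that invariant under simplifying assumptions: DemoShared, demo_segment and demo_simulate_backend() are hypothetical names, a static variable plays the role of the shared segment, and resetting a flag stands in for starting a new backend process.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a struct that lives in shared memory. */
typedef struct DemoShared
{
    bool initialized;
    int  attach_count;
} DemoShared;

static DemoShared  demo_segment;            /* plays the shared segment */
static DemoShared *demo_ptr;                /* per-process pointer */
static bool        attached_or_initialized; /* per-process flag */

/* Runs once, in whichever process creates the area. */
static void
demo_init(void)
{
    demo_ptr->initialized = true;
    attached_or_initialized = true;
}

/* Runs in every later process that finds the area already set up. */
static void
demo_attach(void)
{
    demo_ptr->attach_count++;
    attached_or_initialized = true;
}

/*
 * Simulate one backend registering the area after startup: the first
 * caller initializes it, every later caller attaches to it.
 */
static void
demo_simulate_backend(void)
{
    attached_or_initialized = false;    /* fresh process-local state */
    demo_ptr = &demo_segment;

    if (!demo_segment.initialized)
        demo_init();
    else
        demo_attach();
}
```

Calling demo_simulate_backend() three times initializes the area once and attaches twice, which is the kind of counter growth the TAP test looks for across connections.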
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 864b407abcf..f1b04c99969 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -48,6 +48,7 @@ SUBDIRS = \
test_resowner \
test_rls_hooks \
test_saslprep \
+ test_shmem \
test_shm_mq \
test_slru \
test_tidstore \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index e5acacd5083..fc99552d9ab 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -49,6 +49,7 @@ subdir('test_regex')
subdir('test_resowner')
subdir('test_rls_hooks')
subdir('test_saslprep')
+subdir('test_shmem')
subdir('test_shm_mq')
subdir('test_slru')
subdir('test_tidstore')
diff --git a/src/test/modules/test_shmem/Makefile b/src/test/modules/test_shmem/Makefile
new file mode 100644
index 00000000000..2407f7462fe
--- /dev/null
+++ b/src/test/modules/test_shmem/Makefile
@@ -0,0 +1,24 @@
+# src/test/modules/test_shmem/Makefile
+
+PGFILEDESC = "test_shmem - test code for shmem allocations"
+
+MODULE_big = test_shmem
+OBJS = \
+ $(WIN32RES) \
+ test_shmem.o
+
+EXTENSION = test_shmem
+DATA = test_shmem--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_shmem/meson.build b/src/test/modules/test_shmem/meson.build
new file mode 100644
index 00000000000..fb4bf328b8f
--- /dev/null
+++ b/src/test/modules/test_shmem/meson.build
@@ -0,0 +1,33 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+test_shmem_sources = files(
+ 'test_shmem.c',
+)
+
+if host_system == 'windows'
+ test_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_shmem',
+ '--FILEDESC', 'test_shmem - test code for shmem allocations',])
+endif
+
+test_shmem = shared_module('test_shmem',
+ test_shmem_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_shmem
+
+test_install_data += files(
+ 'test_shmem.control',
+ 'test_shmem--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_shmem',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_late_shmem_alloc.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
new file mode 100644
index 00000000000..c154f57682a
--- /dev/null
+++ b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
@@ -0,0 +1,49 @@
+# Copyright (c) 2025-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+###
+# Test allocating memory after startup, i.e. when the library is not
+# in shared_preload_libraries
+###
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+
+$node->safe_psql("postgres", "CREATE EXTENSION test_shmem;");
+
+# Check that the attach counter is incremented on a new connection
+my $attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+my $attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+cmp_ok($attach_count2, '>', $attach_count1, "attach callback is called in each backend");
+$node->stop;
+
+###
+# Test that loading via shared_preload_libraries also works
+###
+$node->append_conf('postgresql.conf', "shared_preload_libraries = 'test_shmem'");
+$node->start;
+
+# When loaded via shared_preload_libraries, the attach callback is
+# called or not, depending on whether this is an EXEC_BACKEND build.
+my $exec_backend = $node->safe_psql("postgres", "SHOW debug_exec_backend;") eq 'on';
+$attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+$attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+
+if ($exec_backend)
+{
+ cmp_ok($attach_count2, '>', $attach_count1, "attach callback is called in each backend when loaded via shared_preload_libraries");
+}
+else
+{
+ ok($attach_count1 == 0 && $attach_count2 == 0, "attach callback is not called when loaded via shared_preload_libraries");
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_shmem/test_shmem--1.0.sql b/src/test/modules/test_shmem/test_shmem--1.0.sql
new file mode 100644
index 00000000000..2d01fd9256c
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem--1.0.sql
@@ -0,0 +1,9 @@
+/* src/test/modules/test_shmem/test_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_shmem" to load this file. \quit
+
+
+CREATE FUNCTION get_test_shmem_attach_count()
+RETURNS pg_catalog.int4 STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
new file mode 100644
index 00000000000..31c735be570
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -0,0 +1,105 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_shmem.c
+ * Helpers to test shmem allocation routines
+ *
+ * Test basic memory allocation in an extension module. One notable feature
+ * that is not exercised by any other module in the repository is the
+ * allocation of (non-DSM) shared memory after postmaster startup.
+ *
+ * Copyright (c) 2020-2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_shmem/test_shmem.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+
+
+PG_MODULE_MAGIC;
+
+typedef struct TestShmemData
+{
+ int value;
+ bool initialized;
+ int attach_count;
+} TestShmemData;
+
+static TestShmemData *TestShmem;
+
+static bool attached_or_initialized = false;
+
+static void test_shmem_request(void *arg);
+static void test_shmem_init(void *arg);
+static void test_shmem_attach(void *arg);
+
+static const ShmemCallbacks TestShmemCallbacks = {
+ .flags = SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP,
+ .request_fn = test_shmem_request,
+ .init_fn = test_shmem_init,
+ .attach_fn = test_shmem_attach,
+};
+
+static void
+test_shmem_request(void *arg)
+{
+ static ShmemStructDesc TestShmemDesc;
+
+ elog(LOG, "test_shmem_request callback called");
+
+ ShmemRequestStruct(&TestShmemDesc,
+ .name = "test_shmem area",
+ .size = sizeof(TestShmemData),
+ .ptr = (void **) &TestShmem,
+ );
+}
+
+static void
+test_shmem_init(void *arg)
+{
+ elog(LOG, "init callback called");
+ if (TestShmem->initialized)
+ elog(ERROR, "shmem area already initialized");
+ TestShmem->initialized = true;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+static void
+test_shmem_attach(void *arg)
+{
+ elog(LOG, "test_shmem_attach callback called");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ TestShmem->attach_count++;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+void
+_PG_init(void)
+{
+ elog(LOG, "test_shmem module's _PG_init called");
+ RegisterShmemCallbacks(&TestShmemCallbacks);
+}
+
+PG_FUNCTION_INFO_V1(get_test_shmem_attach_count);
+Datum
+get_test_shmem_attach_count(PG_FUNCTION_ARGS)
+{
+ if (!attached_or_initialized)
+ elog(ERROR, "shmem area not attached or initialized in this process");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ PG_RETURN_INT32(TestShmem->attach_count);
+}
diff --git a/src/test/modules/test_shmem/test_shmem.control b/src/test/modules/test_shmem/test_shmem.control
new file mode 100644
index 00000000000..f2f26f4537a
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.control
@@ -0,0 +1,3 @@
+comment = 'Test code for shmem allocations'
+default_version = '1.0'
+module_pathname = '$libdir/test_shmem'
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 5894893997c..5358229b73f 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3146,6 +3146,7 @@ TestDSMRegistryHashEntry
TestDSMRegistryStruct
TestDecodingData
TestDecodingTxnData
+TestShmemData
TestSpec
TestValueType
TextFreq
--
2.47.3
[text/x-patch] 0004-Convert-pg_stat_statements-to-use-the-new-interface.patch (11.4K, 5-0004-Convert-pg_stat_statements-to-use-the-new-interface.patch)
download | inline diff:
From 5282dfa395f14dd89fe7e34413bc92c68a9f9a7e Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:21:24 +0300
Subject: [PATCH 04/14] Convert pg_stat_statements to use the new interface
As part of this, embed the LWLock it needs in the shared memory struct
itself, so that we don't need to use RequestNamedLWLockTranche()
anymore. LWLockNewTrancheId+LWLockInitialize is more convenient to use
in extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
.../pg_stat_statements/pg_stat_statements.c | 178 ++++++++----------
1 file changed, 82 insertions(+), 96 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 5494d41dca1..2384212288e 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -249,7 +249,7 @@ typedef struct pgssEntry
*/
typedef struct pgssSharedState
{
- LWLock *lock; /* protects hashtable search/modification */
+ LWLockPadded lock; /* protects hashtable search/modification */
double cur_median_usage; /* current median usage in hashtable */
Size mean_query_len; /* current mean entry text length */
slock_t mutex; /* protects following fields only: */
@@ -259,14 +259,24 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+/* Links to shared memory state */
+static pgssSharedState *pgss;
+static HTAB *pgss_hash;
+
+static void pgss_shmem_request(void *arg);
+static void pgss_shmem_init(void *arg);
+
+static const ShmemCallbacks pgss_shmem_callbacks = {
+ .request_fn = pgss_shmem_request,
+ .init_fn = pgss_shmem_init,
+};
+
/*---- Local variables ----*/
/* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
static int nesting_level = 0;
/* Saved hook values */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
static planner_hook_type prev_planner_hook = NULL;
static ExecutorStart_hook_type prev_ExecutorStart = NULL;
@@ -275,10 +285,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static ProcessUtility_hook_type prev_ProcessUtility = NULL;
-/* Links to shared memory state */
-static pgssSharedState *pgss = NULL;
-static HTAB *pgss_hash = NULL;
-
/*---- GUC variables ----*/
typedef enum
@@ -331,8 +337,6 @@ PG_FUNCTION_INFO_V1(pg_stat_statements_1_13);
PG_FUNCTION_INFO_V1(pg_stat_statements);
PG_FUNCTION_INFO_V1(pg_stat_statements_info);
-static void pgss_shmem_request(void);
-static void pgss_shmem_startup(void);
static void pgss_shmem_shutdown(int code, Datum arg);
static void pgss_post_parse_analyze(ParseState *pstate, Query *query,
JumbleState *jstate);
@@ -366,7 +370,6 @@ static void pgss_store(const char *query, int64 queryId,
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
bool showtext);
-static Size pgss_memsize(void);
static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
int encoding, bool sticky);
static void entry_dealloc(void);
@@ -471,13 +474,14 @@ _PG_init(void)
MarkGUCPrefixReserved("pg_stat_statements");
+ /*
+ * Register our shared memory needs.
+ */
+ RegisterShmemCallbacks(&pgss_shmem_callbacks);
+
/*
* Install hooks.
*/
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = pgss_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = pgss_shmem_startup;
prev_post_parse_analyze_hook = post_parse_analyze_hook;
post_parse_analyze_hook = pgss_post_parse_analyze;
prev_planner_hook = planner_hook;
@@ -495,30 +499,47 @@ _PG_init(void)
}
/*
- * shmem_request hook: request additional shared resources. We'll allocate or
- * attach to the shared resources in pgss_shmem_startup().
+ * shmem request callback: Request shared memory resources.
+ *
+ * This is called at postmaster startup. Note that the shared memory isn't
+ * allocated here yet, this merely registers our needs.
+ *
+ * In EXEC_BACKEND mode, this is also called in each backend, to re-attach to
+ * the shared memory area that was already initialized.
*/
static void
-pgss_shmem_request(void)
+pgss_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- RequestAddinShmemSpace(pgss_memsize());
- RequestNamedLWLockTranche("pg_stat_statements", 1);
+ static ShmemHashDesc pgssSharedHashDesc;
+ static ShmemStructDesc pgssSharedStateDesc;
+
+ ShmemRequestHash(&pgssSharedHashDesc,
+ .name = "pg_stat_statements hash",
+ .ptr = &pgss_hash,
+ .nelems = pgss_max,
+ .hash_info.keysize = sizeof(pgssHashKey),
+ .hash_info.entrysize = sizeof(pgssEntry),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ );
+ ShmemRequestStruct(&pgssSharedStateDesc,
+ .name = "pg_stat_statements",
+ .size = sizeof(pgssSharedState),
+ .ptr = (void **) &pgss,
+ );
}
/*
- * shmem_startup hook: allocate or attach to shared memory,
- * then load any pre-existing statistics from file.
- * Also create and load the query-texts file, which is expected to exist
- * (even if empty) while the module is enabled.
+ * shmem init callback: Initialize our shared memory data structures at
+ * postmaster startup.
+ *
+ * Load any pre-existing statistics from file. Also create and load the
+ * query-texts file, which is expected to exist (even if empty) while the
+ * module is enabled.
*/
static void
-pgss_shmem_startup(void)
+pgss_shmem_init(void *arg)
{
- bool found;
- HASHCTL info;
+ int tranche_id;
FILE *file = NULL;
FILE *qfile = NULL;
uint32 header;
@@ -528,59 +549,38 @@ pgss_shmem_startup(void)
int buffer_size;
char *buffer = NULL;
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* reset in case this is a restart within the postmaster */
- pgss = NULL;
- pgss_hash = NULL;
-
/*
- * Create or attach to the shared memory state, including hash table
+ * We already checked that we're loaded from shared_preload_libraries in
+ * _PG_init(), so we should not get here after postmaster startup.
*/
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- pgss = ShmemInitStruct("pg_stat_statements",
- sizeof(pgssSharedState),
- &found);
-
- if (!found)
- {
- /* First time through ... */
- pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
- pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
- SpinLockInit(&pgss->mutex);
- pgss->extent = 0;
- pgss->n_writers = 0;
- pgss->gc_count = 0;
- pgss->stats.dealloc = 0;
- pgss->stats.stats_reset = GetCurrentTimestamp();
- }
-
- info.keysize = sizeof(pgssHashKey);
- info.entrysize = sizeof(pgssEntry);
- pgss_hash = ShmemInitHash("pg_stat_statements hash",
- pgss_max,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- LWLockRelease(AddinShmemInitLock);
+ Assert(!IsUnderPostmaster);
/*
- * If we're in the postmaster (or a standalone backend...), set up a shmem
- * exit hook to dump the statistics to disk.
+ * Initialize the shmem area with no statistics.
*/
- if (!IsUnderPostmaster)
- on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
+ tranche_id = LWLockNewTrancheId("pg_stat_statements");
+ LWLockInitialize(&pgss->lock.lock, tranche_id);
+ pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
+ pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ SpinLockInit(&pgss->mutex);
+ pgss->extent = 0;
+ pgss->n_writers = 0;
+ pgss->gc_count = 0;
+ pgss->stats.dealloc = 0;
+ pgss->stats.stats_reset = GetCurrentTimestamp();
+
+ /* The hash table must've also been initialized by now */
+ Assert(pgss_hash != NULL);
/*
- * Done if some other process already completed our initialization.
+ * Set up a shmem exit hook to dump the statistics to disk on postmaster
+ * (or standalone backend) exit.
*/
- if (found)
- return;
+ on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
/*
+ * Load any pre-existing statistics from file.
+ *
* Note: we don't bother with locks here, because there should be no other
* processes running when this code is reached.
*/
@@ -1339,7 +1339,7 @@ pgss_store(const char *query, int64 queryId,
key.toplevel = (nesting_level == 0);
/* Lookup the hash table entry with shared lock. */
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
@@ -1360,11 +1360,11 @@ pgss_store(const char *query, int64 queryId,
*/
if (jstate)
{
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
norm_query = generate_normalized_query(jstate, query,
query_location,
&query_len);
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
}
/* Append new query text to file with only shared lock held */
@@ -1379,8 +1379,8 @@ pgss_store(const char *query, int64 queryId,
do_gc = need_gc_qtexts();
/* Need exclusive lock to make a new hashtable entry - promote */
- LWLockRelease(pgss->lock);
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockRelease(&pgss->lock.lock);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
/*
* A garbage collection may have occurred while we weren't holding the
@@ -1519,7 +1519,7 @@ pgss_store(const char *query, int64 queryId,
}
done:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
/* We postpone this clean-up until we're out of the lock */
if (norm_query)
@@ -1808,7 +1808,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
* we need to partition the hash table to limit the time spent holding any
* one lock.
*/
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
if (showtext)
{
@@ -2046,7 +2046,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
}
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
if (qbuffer)
pfree(qbuffer);
@@ -2086,20 +2086,6 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
-/*
- * Estimate shared memory space needed.
- */
-static Size
-pgss_memsize(void)
-{
- Size size;
-
- size = MAXALIGN(sizeof(pgssSharedState));
- size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
-
- return size;
-}
-
/*
* Allocate a new hashtable entry.
* caller must hold an exclusive lock on pgss->lock
@@ -2730,7 +2716,7 @@ entry_reset(Oid userid, Oid dbid, int64 queryid, bool minmax_only)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("pg_stat_statements must be loaded via \"shared_preload_libraries\"")));
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
num_entries = hash_get_num_entries(pgss_hash);
stats_reset = GetCurrentTimestamp();
@@ -2824,7 +2810,7 @@ done:
record_gc_qtexts();
release_lock:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
return stats_reset;
}
--
2.47.3
[text/x-patch] 0005-Introduce-registry-of-built-in-subsystems.patch (7.3K, 6-0005-Introduce-registry-of-built-in-subsystems.patch)
download | inline diff:
From 61715b45ca2d0206b8eda692e757ea2b32ddd96b Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:21:02 +0300
Subject: [PATCH 05/14] Introduce registry of built-in subsystems
To add a new built-in subsystem, add it to subsystemlist.h. That
hooks up its callbacks so that they get called at the right times
during postmaster startup. For now this is unused, but will replace
the current SubsystemShmemSize() and SubsystemShmemInit() calls in
the next commits.
---
src/backend/bootstrap/bootstrap.c | 2 ++
src/backend/postmaster/launch_backend.c | 2 ++
src/backend/postmaster/postmaster.c | 5 +++++
src/backend/storage/ipc/ipci.c | 21 +++++++++++++++++
src/backend/tcop/postgres.c | 3 +++
src/include/storage/ipc.h | 1 +
src/include/storage/subsystemlist.h | 23 +++++++++++++++++++
src/include/storage/subsystems.h | 30 +++++++++++++++++++++++++
src/tools/pginclude/headerscheck | 1 +
9 files changed, 88 insertions(+)
create mode 100644 src/include/storage/subsystemlist.h
create mode 100644 src/include/storage/subsystems.h
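The registry described in this commit message is an instance of the "X-macro" pattern: a list of entries is expanded with a macro that the consumer defines just before use. A minimal standalone sketch follows. To stay self-contained it keeps the list in a macro (DEMO_SUBSYSTEM_LIST) rather than in a separately included header as the patch does, and DemoCallbacks, DemoBufferCallbacks, DemoLockCallbacks and the demo_* functions are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down analogue of ShmemCallbacks. */
typedef struct DemoCallbacks
{
    void (*request_fn) (void *arg);
} DemoCallbacks;

static int request_calls = 0;

static void
count_request(void *arg)
{
    (void) arg;
    request_calls++;
}

static const DemoCallbacks DemoBufferCallbacks = {.request_fn = count_request};
static const DemoCallbacks DemoLockCallbacks = {.request_fn = count_request};

/* Stand-in for the list header: one DEMO_SUBSYSTEM() line per subsystem. */
#define DEMO_SUBSYSTEM_LIST \
    DEMO_SUBSYSTEM(DemoBufferCallbacks) \
    DEMO_SUBSYSTEM(DemoLockCallbacks)

#define DEMO_MAX_SUBSYSTEMS 8
static const DemoCallbacks *registered[DEMO_MAX_SUBSYSTEMS];
static int nregistered = 0;

static void
demo_register_callbacks(const DemoCallbacks *cb)
{
    if (nregistered < DEMO_MAX_SUBSYSTEMS)
        registered[nregistered++] = cb;
}

/* Expand the list once, turning each entry into a registration call. */
static void
demo_register_builtin(void)
{
#define DEMO_SUBSYSTEM(cb) demo_register_callbacks(&(cb));
    DEMO_SUBSYSTEM_LIST
#undef DEMO_SUBSYSTEM
}

/* Later stage: invoke the request callback of every registered entry. */
static void
demo_call_request_callbacks(void)
{
    for (int i = 0; i < nregistered; i++)
    {
        if (registered[i]->request_fn)
            registered[i]->request_fn(NULL);
    }
}
```

Keeping the list separate from the macro definition lets the same list be expanded in other ways later (for example, to emit extern declarations), which appears to be the reason the patch keeps it in its own header.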
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 26d3717c2cb..22672ffec1a 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -362,6 +362,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
SetProcessingMode(BootstrapProcessing);
IgnoreSystemIndexes = true;
+ RegisterBuiltinShmemCallbacks();
+
InitializeMaxBackends();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 75423104be8..7b81200d3c2 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -663,6 +663,8 @@ SubPostmasterMain(int argc, char *argv[])
*/
LocalProcessControlFile(false);
+ RegisterBuiltinShmemCallbacks();
+
/*
* Reload any libraries that were preloaded by the postmaster. Since we
* exec'd this process, those libraries didn't come along with us; but we
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 01b064d62ea..3fd6a5b9a9b 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -921,6 +921,11 @@ PostmasterMain(int argc, char *argv[])
*/
ApplyLauncherRegister();
+ /*
+ * Register the shared memory needs of all core subsystems.
+ */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 5333e528e1f..67ab4a78192 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -51,6 +51,7 @@
#include "storage/procarray.h"
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
#include "utils/wait_event.h"
@@ -251,6 +252,26 @@ CreateSharedMemoryAndSemaphores(void)
shmem_startup_hook();
}
+/*
+ * Early initialization of various subsystems, giving them a chance to
+ * register their shared memory needs before the shared memory segment is
+ * allocated.
+ */
+void
+RegisterBuiltinShmemCallbacks(void)
+{
+ /*
+ * Call RegisterShmemCallbacks(...) on each subsystem listed in
+ * subsystemlist.h
+ */
+#define PG_SHMEM_SUBSYSTEM(subsystem_callbacks) \
+ RegisterShmemCallbacks(&(subsystem_callbacks));
+
+#include "storage/subsystemlist.h"
+
+#undef PG_SHMEM_SUBSYSTEM
+}
+
/*
* Initialize various subsystems, setting up their data structures in
* shared memory.
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index af7cc86d80a..6c6c2243d9e 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4137,6 +4137,9 @@ PostgresSingleUserMain(int argc, char *argv[],
/* read control file (error checking and contains config ) */
LocalProcessControlFile(false);
+ /* Register the shared memory needs of all core subsystems. */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index da32787ab51..b205b00e7a1 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
/* ipci.c */
extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
+extern void RegisterBuiltinShmemCallbacks(void);
extern Size CalculateShmemSize(void);
extern void CreateSharedMemoryAndSemaphores(void);
#ifdef EXEC_BACKEND
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
new file mode 100644
index 00000000000..ed43c90bcc3
--- /dev/null
+++ b/src/include/storage/subsystemlist.h
@@ -0,0 +1,23 @@
+/*---------------------------------------------------------------------------
+ * subsystemlist.h
+ *
+ * List of initialization callbacks of built-in subsystems. This is kept in
+ * its own source file for possible use by automatic tools.
+ * PG_SHMEM_SUBSYSTEM is defined in the callers depending on how the list is
+ * used.
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystemlist.h
+ *---------------------------------------------------------------------------
+ */
+
+/* there is deliberately not an #ifndef SUBSYSTEMLIST_H here */
+
+/*
+ * Note: there are some inter-dependencies between these, so the order of some
+ * of these matter.
+ */
+
+/* TODO: empty for now */
diff --git a/src/include/storage/subsystems.h b/src/include/storage/subsystems.h
new file mode 100644
index 00000000000..38b735bec67
--- /dev/null
+++ b/src/include/storage/subsystems.h
@@ -0,0 +1,30 @@
+/*-------------------------------------------------------------------------
+ *
+ * subsystems.h
+ * Provide extern declarations for all the built-in subsystem callbacks
+ *
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystems.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SUBSYSTEMS_H
+#define SUBSYSTEMS_H
+
+#include "storage/shmem.h"
+
+/*
+ * Extern declarations of all the built-in subsystem callbacks
+ *
+ * The actual list is in subsystemlist.h, so that the same list can be used
+ * for other purposes.
+ */
+#define PG_SHMEM_SUBSYSTEM(callbacks) \
+ extern const ShmemCallbacks callbacks;
+#include "storage/subsystemlist.h"
+#undef PG_SHMEM_SUBSYSTEM
+
+#endif /* SUBSYSTEMS_H */
diff --git a/src/tools/pginclude/headerscheck b/src/tools/pginclude/headerscheck
index 14c466cc237..24f7416185e 100755
--- a/src/tools/pginclude/headerscheck
+++ b/src/tools/pginclude/headerscheck
@@ -131,6 +131,7 @@ do
test "$f" = src/include/postmaster/proctypelist.h && continue
test "$f" = src/include/regex/regerrs.h && continue
test "$f" = src/include/storage/lwlocklist.h && continue
+ test "$f" = src/include/storage/subsystemlist.h && continue
test "$f" = src/include/tcop/cmdtaglist.h && continue
test "$f" = src/interfaces/ecpg/preproc/c_kwlist.h && continue
test "$f" = src/interfaces/ecpg/preproc/ecpg_kwlist.h && continue
--
2.47.3
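
The list-header pattern that subsystemlist.h relies on (a single list, expanded more than once with different PG_SHMEM_SUBSYSTEM definitions) can be illustrated with a small standalone sketch. This is only an illustration, not code from the patch: to keep it in one file it passes the per-entry macro as an argument to a list macro instead of re-including a header, and DemoCallbacks, DEMO_SUBSYSTEM_LIST, demo_register() and the two entry names are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ShmemCallbacks; it only carries a name. */
typedef struct DemoCallbacks
{
	const char *name;
} DemoCallbacks;

/* The single list of subsystems. Order is significant. */
#define DEMO_SUBSYSTEM_LIST(X) \
	X(DemoLWLockCallbacks) \
	X(DemoProcArrayCallbacks)

/* Expansion 1: define one callbacks object per list entry. */
#define DEMO_DEFINE(cb) static const DemoCallbacks cb = { #cb };
DEMO_SUBSYSTEM_LIST(DEMO_DEFINE)
#undef DEMO_DEFINE

#define DEMO_MAX_REGISTERED 8
static const DemoCallbacks *demo_registered[DEMO_MAX_REGISTERED];
static int	demo_num_registered = 0;

/* Hypothetical registration function: remembers entries in call order. */
static void
demo_register(const DemoCallbacks *cb)
{
	assert(demo_num_registered < DEMO_MAX_REGISTERED);
	demo_registered[demo_num_registered++] = cb;
}

/* Expansion 2: emit one registration call per list entry, in list order. */
static void
demo_register_builtin(void)
{
#define DEMO_REGISTER(cb) demo_register(&(cb));
	DEMO_SUBSYSTEM_LIST(DEMO_REGISTER)
#undef DEMO_REGISTER
}
```

Calling demo_register_builtin() once fills demo_registered[] in list order, which is why the order of entries in such a list matters when subsystems depend on each other.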
[text/x-patch] 0006-Convert-lwlock.c-to-use-the-new-interface.patch (6.5K, 7-0006-Convert-lwlock.c-to-use-the-new-interface.patch)
From 76e18bfbd1a7ef566c5ef70d12a8f483d8afabaa Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:18:05 +0300
Subject: [PATCH 06/14] Convert lwlock.c to use the new interface
It seems like a good candidate to convert first because it needs to be
initialized before any other subsystem, but other than that it's
nothing special.
---
src/backend/storage/ipc/ipci.c | 13 -----
src/backend/storage/lmgr/lwlock.c | 74 ++++++++++++++++-------------
src/include/storage/lwlock.h | 2 -
src/include/storage/subsystemlist.h | 9 +++-
4 files changed, 48 insertions(+), 50 deletions(-)
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 67ab4a78192..5d639e6bada 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -120,7 +120,6 @@ CalculateShmemSize(void)
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
- size = add_size(size, LWLockShmemSize());
size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
size = add_size(size, SharedInvalShmemSize());
@@ -178,11 +177,6 @@ AttachSharedMemoryStructs(void)
*/
InitializeFastPathLocks();
- /*
- * Attach to LWLocks first. They are needed by most other subsystems.
- */
- LWLockShmemInit();
-
/* Establish pointers to all shared memory areas in this backend */
ShmemAttachRequested();
CreateOrAttachShmemStructs();
@@ -229,13 +223,6 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
- /*
- * Initialize LWLocks first, in case any of the shmem init function use
- * LWLocks. (Nothing else can be running during startup, so they don't
- * need to do any locking yet, but we nevertheless allow it.)
- */
- LWLockShmemInit();
-
/* Initialize all shmem areas */
ShmemInitRequested();
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 5cb696490d6..b7e2c4ae540 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -84,6 +84,7 @@
#include "storage/proclist.h"
#include "storage/procnumber.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -212,6 +213,15 @@ typedef struct NamedLWLockTrancheRequest
static List *NamedLWLockTrancheRequests = NIL;
+static void LWLockShmemRequest(void *arg);
+static void LWLockShmemInit(void *arg);
+
+const ShmemCallbacks LWLockCallbacks = {
+ .request_fn = LWLockShmemRequest,
+ .init_fn = LWLockShmemInit,
+};
+
+
static void InitializeLWLocks(int numLocks);
static inline void LWLockReportWaitStart(LWLock *lock);
static inline void LWLockReportWaitEnd(void);
@@ -401,58 +411,54 @@ NumLWLocksForNamedTranches(void)
}
/*
- * Compute shmem space needed for user-defined tranches and the main LWLock
- * array.
+ * Request shmem space for user-defined tranches and the main LWLock array.
*/
-Size
-LWLockShmemSize(void)
+static void
+LWLockShmemRequest(void *arg)
{
- Size size;
+ static ShmemStructDesc LWLockTrancheShmemDesc;
int numLocks;
+ Size size;
+
+ numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
/* Space for user-defined tranches */
size = sizeof(LWLockTrancheShmemData);
-
- /* Space for the LWLock array */
- numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
size = add_size(size, mul_size(numLocks, sizeof(LWLockPadded)));
+ ShmemRequestStruct(&LWLockTrancheShmemDesc,
+ .name = "LWLock tranches",
+ .size = size,
+ .ptr = (void **) &LWLockTranches,
+ );
- return size;
+ /* Space for the LWLock array */
+ ShmemRequestStruct(&LWLockTrancheShmemDesc,
+ .name = "Main LWLock array",
+ .size = numLocks * sizeof(LWLockPadded),
+ .ptr = (void **) &MainLWLockArray,
+ );
}
/*
- * Allocate shmem space for user-defined tranches and the main LWLock array,
- * and initialize it.
+ * Initialize shmem space for user-defined tranches and the main LWLock array.
*/
-void
-LWLockShmemInit(void)
+static void
+LWLockShmemInit(void *arg)
{
int numLocks;
- bool found;
- LWLockTranches = (LWLockTrancheShmemData *)
- ShmemInitStruct("LWLock tranches", sizeof(LWLockTrancheShmemData), &found);
- if (!found)
- {
- /* Calculate total number of locks needed in the main array */
- LWLockTranches->num_main_array_locks =
- NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
+ numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
- /* Initialize the dynamic-allocation counter for tranches */
- LWLockTranches->num_user_defined = 0;
+ /* Remember total number of locks needed in the main array */
+ LWLockTranches->num_main_array_locks = numLocks;
- SpinLockInit(&LWLockTranches->lock);
- }
+ /* Initialize the dynamic-allocation counter for tranches */
+ LWLockTranches->num_user_defined = 0;
- /* Allocate and initialize the main array */
- numLocks = LWLockTranches->num_main_array_locks;
- MainLWLockArray = (LWLockPadded *)
- ShmemInitStruct("Main LWLock array", numLocks * sizeof(LWLockPadded), &found);
- if (!found)
- {
- /* Initialize all LWLocks */
- InitializeLWLocks(numLocks);
- }
+ SpinLockInit(&LWLockTranches->lock);
+
+ /* Initialize all LWLocks in the main array */
+ InitializeLWLocks(numLocks);
}
/*
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index 61f0dbe749a..efa5b427e9f 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -126,8 +126,6 @@ extern bool LWLockHeldByMeInMode(LWLock *lock, LWLockMode mode);
extern bool LWLockWaitForVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 oldval, uint64 *newval);
extern void LWLockUpdateVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val);
-extern Size LWLockShmemSize(void);
-extern void LWLockShmemInit(void);
extern void InitLWLockAccess(void);
extern const char *GetLWLockIdentifier(uint32 classId, uint16 eventId);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index ed43c90bcc3..f0cf01f5a85 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -20,4 +20,11 @@
* of these matter.
*/
-/* TODO: empty for now */
+/*
+ * LWLocks first, in case any of the other shmem init functions use LWLocks.
+ * (Nothing else can be running during startup, so they don't need to do any
+ * locking yet, but we nevertheless allow it.)
+ */
+PG_SHMEM_SUBSYSTEM(LWLockCallbacks)
+
+/* TODO: nothing else for now */
--
2.47.3
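
The conversion above splits the old "compute size" and "allocate and initialize" functions into a request callback and an init callback. A toy model of that two-phase flow, under stated assumptions, might look like the sketch below. It is not the shmem.c implementation: the demo_* names are hypothetical, the "segment" is an ordinary calloc() block, the 8-byte alignment is arbitrary, and the zero-filling of the requested areas is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One recorded request: a name, a size, and where to publish the address. */
typedef struct DemoRequest
{
	const char *name;
	size_t		size;
	void	  **ptr;
} DemoRequest;

#define DEMO_MAX_REQUESTS 16
static DemoRequest demo_requests[DEMO_MAX_REQUESTS];
static int	demo_num_requests = 0;

/* Phase 1: only record what is needed; nothing is allocated yet. */
static void
demo_request_struct(const char *name, size_t size, void **ptr)
{
	assert(demo_num_requests < DEMO_MAX_REQUESTS);
	demo_requests[demo_num_requests++] = (DemoRequest) {name, size, ptr};
}

/* Round up to a multiple of 8 bytes (arbitrary choice for the sketch). */
static size_t
demo_align(size_t size)
{
	return (size + 7) & ~(size_t) 7;
}

/* Total space needed by all recorded requests. */
static size_t
demo_requested_size(void)
{
	size_t		total = 0;

	for (int i = 0; i < demo_num_requests; i++)
		total += demo_align(demo_requests[i].size);
	return total;
}

/*
 * Phase 2: allocate one zeroed block, carve it up in request order, and
 * store each area's address through the pointer given in the request.
 * Returns the block (caller frees it), or NULL on allocation failure.
 */
static void *
demo_allocate_all(void)
{
	size_t		total = demo_requested_size();
	char	   *base = calloc(1, total > 0 ? total : 1);
	size_t		offset = 0;

	if (base == NULL)
		return NULL;
	for (int i = 0; i < demo_num_requests; i++)
	{
		*demo_requests[i].ptr = base + offset;
		offset += demo_align(demo_requests[i].size);
	}
	return base;
}

/* A hypothetical subsystem using the two phases. */
typedef struct DemoCounter
{
	int			value;
} DemoCounter;

static void *demo_counter_area = NULL;

static void
demo_counter_request(void)
{
	demo_request_struct("demo counter", sizeof(DemoCounter),
						&demo_counter_area);
}

/* Runs after allocation, so the pointer is already valid here. */
static void
demo_counter_init(void)
{
	DemoCounter *counter = demo_counter_area;

	counter->value = 42;
}
```

In this model the init callback no longer needs a "found" flag: by the time it runs, the area has been allocated and its address published, which mirrors how LWLockShmemInit() above can dereference LWLockTranches directly.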
[text/x-patch] 0007-Use-the-new-mechanism-in-a-few-core-subsystems.patch (47.2K, 8-0007-Use-the-new-mechanism-in-a-few-core-subsystems.patch)
From b00e09ff1713f78ea7a65b8c290ad552a60563cb Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:21:17 +0300
Subject: [PATCH 07/14] Use the new mechanism in a few core subsystems
I chose these subsystems specifically because they have some
complicating properties, making them slightly harder to convert than
most:
- The initialization callbacks of some of these subsystems have
dependencies, i.e. they need to be initialized in the right order.
- The ProcGlobal pointer still needs to be inherited by the
BackendParameters mechanism on EXEC_BACKEND builds, because
ProcGlobal is required by InitProcess() to get a PGPROC entry, and
the PGPROC entry is required to use LWLocks, and usually attaching
to shared memory areas requires the use of LWLocks.
- Similarly, ProcSignal pointer still needs to be handled by
BackendParameters, because query cancellation connections access it
without calling InitProcess.
I believe converting the rest of the subsystems after this will
be pretty mechanical.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/backend/access/transam/twophase.c | 2 +-
src/backend/access/transam/varsup.c | 36 ++---
src/backend/port/posix_sema.c | 25 ++--
src/backend/port/sysv_sema.c | 24 ++--
src/backend/port/win32_sema.c | 11 +-
src/backend/storage/ipc/dsm.c | 65 +++++----
src/backend/storage/ipc/dsm_registry.c | 39 ++---
src/backend/storage/ipc/ipci.c | 27 ----
src/backend/storage/ipc/latch.c | 8 +-
src/backend/storage/ipc/pmsignal.c | 57 ++++----
src/backend/storage/ipc/procarray.c | 119 ++++++++--------
src/backend/storage/ipc/procsignal.c | 66 ++++-----
src/backend/storage/ipc/sinvaladt.c | 40 +++---
src/backend/storage/lmgr/proc.c | 190 +++++++++++++------------
src/backend/utils/hash/dynahash.c | 3 +-
src/include/access/transam.h | 2 -
src/include/storage/dsm.h | 3 -
src/include/storage/dsm_registry.h | 2 -
src/include/storage/pg_sema.h | 6 +-
src/include/storage/pmsignal.h | 2 -
src/include/storage/proc.h | 2 -
src/include/storage/procarray.h | 2 -
src/include/storage/procsignal.h | 3 -
src/include/storage/sinvaladt.h | 2 -
src/include/storage/subsystemlist.h | 17 ++-
25 files changed, 375 insertions(+), 378 deletions(-)
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index d468c9774b3..ab1cbd67bac 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -282,7 +282,7 @@ TwoPhaseShmemInit(void)
gxacts[i].next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = &gxacts[i];
- /* associate it with a PGPROC assigned by InitProcGlobal */
+ /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
}
}
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 1441a051773..2ea8d088c0e 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -23,6 +23,7 @@
#include "postmaster/autovacuum.h"
#include "storage/pmsignal.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
@@ -30,35 +31,28 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+static void VarsupShmemRequest(void *arg);
+
/* pointer to variables struct in shared memory */
TransamVariablesData *TransamVariables = NULL;
+const ShmemCallbacks VarsupShmemCallbacks = {
+ .request_fn = VarsupShmemRequest,
+};
/*
- * Initialization of shared memory for TransamVariables.
+ * Request shared memory for TransamVariables.
*/
-Size
-VarsupShmemSize(void)
-{
- return sizeof(TransamVariablesData);
-}
-
-void
-VarsupShmemInit(void)
+static void
+VarsupShmemRequest(void *arg)
{
- bool found;
+ static ShmemStructDesc TransamVariablesShmemDesc;
- /* Initialize our shared state struct */
- TransamVariables = ShmemInitStruct("TransamVariables",
- sizeof(TransamVariablesData),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- memset(TransamVariables, 0, sizeof(TransamVariablesData));
- }
- else
- Assert(found);
+ ShmemRequestStruct(&TransamVariablesShmemDesc,
+ .name = "TransamVariables",
+ .size = sizeof(TransamVariablesData),
+ .ptr = (void **) &TransamVariables,
+ );
}
/*
diff --git a/src/backend/port/posix_sema.c b/src/backend/port/posix_sema.c
index 40205b7d400..97148da9b63 100644
--- a/src/backend/port/posix_sema.c
+++ b/src/backend/port/posix_sema.c
@@ -159,22 +159,27 @@ PosixSemaphoreKill(sem_t *sem)
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
#ifdef USE_NAMED_POSIX_SEMAPHORES
/* No shared memory needed in this case */
- return 0;
#else
+ static ShmemStructDesc sharedSemasShmemDesc;
+
/* Need a PGSemaphoreData per semaphore */
- return mul_size(maxSemas, sizeof(PGSemaphoreData));
+ ShmemRequestStruct(&sharedSemasShmemDesc,
+ .name = "Semaphores",
+ .size = mul_size(maxSemas, sizeof(PGSemaphoreData)),
+ .ptr = (void **) &sharedSemas,
+ );
#endif
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* This is called during postmaster start or shared memory reinitialization.
* It should do whatever is needed to be able to support up to maxSemas
@@ -193,10 +198,9 @@ PGSemaphoreShmemSize(int maxSemas)
* we don't have to expose the counters to other processes.)
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
struct stat statbuf;
- bool found;
/*
* We use the data directory's inode number to seed the search for free
@@ -214,11 +218,6 @@ PGReserveSemaphores(int maxSemas)
mySemPointers = (sem_t **) malloc(maxSemas * sizeof(sem_t *));
if (mySemPointers == NULL)
elog(PANIC, "out of memory");
-#else
-
- sharedSemas = (PGSemaphore)
- ShmemInitStruct("Semaphores", PGSemaphoreShmemSize(maxSemas), &found);
- Assert(!found);
#endif
numSems = 0;
diff --git a/src/backend/port/sysv_sema.c b/src/backend/port/sysv_sema.c
index 4b2bf84072f..f927026442c 100644
--- a/src/backend/port/sysv_sema.c
+++ b/src/backend/port/sysv_sema.c
@@ -301,16 +301,23 @@ IpcSemaphoreCreate(int numSems)
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
- return mul_size(maxSemas, sizeof(PGSemaphoreData));
+ static ShmemStructDesc sharedSemasShmemDesc;
+
+ /* Need a PGSemaphoreData per semaphore */
+ ShmemRequestStruct(&sharedSemasShmemDesc,
+ .name = "Semaphores",
+ .size = mul_size(maxSemas, sizeof(PGSemaphoreData)),
+ .ptr = (void **) &sharedSemas,
+ );
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* This is called during postmaster start or shared memory reinitialization.
* It should do whatever is needed to be able to support up to maxSemas
@@ -327,10 +334,9 @@ PGSemaphoreShmemSize(int maxSemas)
* have clobbered.)
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
struct stat statbuf;
- bool found;
/*
* We use the data directory's inode number to seed the search for free
@@ -344,10 +350,6 @@ PGReserveSemaphores(int maxSemas)
errmsg("could not stat data directory \"%s\": %m",
DataDir)));
- sharedSemas = (PGSemaphore)
- ShmemInitStruct("Semaphores", PGSemaphoreShmemSize(maxSemas), &found);
- Assert(!found);
-
numSharedSemas = 0;
maxSharedSemas = maxSemas;
diff --git a/src/backend/port/win32_sema.c b/src/backend/port/win32_sema.c
index ba97c9b2d64..a3202554769 100644
--- a/src/backend/port/win32_sema.c
+++ b/src/backend/port/win32_sema.c
@@ -25,17 +25,16 @@ static void ReleaseSemaphores(int code, Datum arg);
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
/* No shared memory needed on Windows */
- return 0;
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* In the Win32 implementation, we acquire semaphores on-demand; the
* maxSemas parameter is just used to size the array that keeps track of
@@ -44,7 +43,7 @@ PGSemaphoreShmemSize(int maxSemas)
* process exits.
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
mySemSet = (HANDLE *) malloc(maxSemas * sizeof(HANDLE));
if (mySemSet == NULL)
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 6a5b16392f7..923593d140d 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -43,6 +43,7 @@
#include "storage/lwlock.h"
#include "storage/pg_shmem.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/freepage.h"
#include "utils/memutils.h"
#include "utils/resowner.h"
@@ -110,6 +111,14 @@ static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
static void *dsm_main_space_begin = NULL;
+static void dsm_main_space_request(void *arg);
+static void dsm_main_space_init(void *arg);
+
+const ShmemCallbacks dsm_shmem_callbacks = {
+ .request_fn = dsm_main_space_request,
+ .init_fn = dsm_main_space_init,
+};
+
/*
* List of dynamic shared memory segments used by this backend.
*
@@ -463,43 +472,45 @@ dsm_set_control_handle(dsm_handle h)
}
#endif
+static ShmemStructDesc dsm_main_space_shmem_desc;
+
/*
- * Reserve some space in the main shared memory segment for DSM segments.
+ * Reserve space in the main shared memory segment for DSM segments.
*/
-size_t
-dsm_estimate_size(void)
+static void
+dsm_main_space_request(void *arg)
{
- return 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+ size_t size = 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+
+ if (size == 0)
+ return;
+
+ ShmemRequestStruct(&dsm_main_space_shmem_desc,
+ .name = "Preallocated DSM",
+ .size = size,
+ .ptr = &dsm_main_space_begin,
+ );
}
-/*
- * Initialize space in the main shared memory segment for DSM segments.
- */
-void
-dsm_shmem_init(void)
+static void
+dsm_main_space_init(void *arg)
{
- size_t size = dsm_estimate_size();
- bool found;
+ size_t size = dsm_main_space_shmem_desc.size;
+ FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
+ size_t first_page = 0;
+ size_t pages;
if (size == 0)
return;
- dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, &found);
- if (!found)
- {
- FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
- size_t first_page = 0;
- size_t pages;
-
- /* Reserve space for the FreePageManager. */
- while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
- ++first_page;
-
- /* Initialize it and give it all the rest of the space. */
- FreePageManagerInitialize(fpm, dsm_main_space_begin);
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ /* Reserve space for the FreePageManager. */
+ while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
+ ++first_page;
+
+ /* Initialize it and give it all the rest of the space. */
+ FreePageManagerInitialize(fpm, dsm_main_space_begin);
+ pages = (size / FPM_PAGE_SIZE) - first_page;
+ FreePageManagerPut(fpm, first_page, pages);
}
/*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 9bfcd616827..3e9c0ba2947 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -45,6 +45,7 @@
#include "storage/dsm_registry.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/tuplestore.h"
@@ -57,6 +58,14 @@ typedef struct DSMRegistryCtxStruct
static DSMRegistryCtxStruct *DSMRegistryCtx;
+static void DSMRegistryShmemRequest(void *arg);
+static void DSMRegistryShmemInit(void *arg);
+
+const ShmemCallbacks DSMRegistryShmemCallbacks = {
+ .request_fn = DSMRegistryShmemRequest,
+ .init_fn = DSMRegistryShmemInit,
+};
+
typedef struct NamedDSMState
{
dsm_handle handle;
@@ -114,27 +123,23 @@ static const dshash_parameters dsh_params = {
static dsa_area *dsm_registry_dsa;
static dshash_table *dsm_registry_table;
-Size
-DSMRegistryShmemSize(void)
+static void
+DSMRegistryShmemRequest(void *arg)
{
- return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+ static ShmemStructDesc DSMRegistryCtxShmemDesc;
+
+ ShmemRequestStruct(&DSMRegistryCtxShmemDesc,
+ .name = "DSM Registry Data",
+ .size = sizeof(DSMRegistryCtxStruct),
+ .ptr = (void **) &DSMRegistryCtx,
+ );
}
-void
-DSMRegistryShmemInit(void)
+static void
+DSMRegistryShmemInit(void *arg)
{
- bool found;
-
- DSMRegistryCtx = (DSMRegistryCtxStruct *)
- ShmemInitStruct("DSM Registry Data",
- DSMRegistryShmemSize(),
- &found);
-
- if (!found)
- {
- DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
- DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
- }
+ DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+ DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 5d639e6bada..b8ce4de9ab5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -20,7 +20,6 @@
#include "access/nbtree.h"
#include "access/subtrans.h"
#include "access/syncscan.h"
-#include "access/transam.h"
#include "access/twophase.h"
#include "access/xlogprefetcher.h"
#include "access/xlogrecovery.h"
@@ -42,15 +41,11 @@
#include "storage/aio_subsys.h"
#include "storage/bufmgr.h"
#include "storage/dsm.h"
-#include "storage/dsm_registry.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
#include "storage/pmsignal.h"
#include "storage/predicate.h"
#include "storage/proc.h"
-#include "storage/procarray.h"
-#include "storage/procsignal.h"
-#include "storage/sinvaladt.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
@@ -104,14 +99,10 @@ CalculateShmemSize(void)
size = add_size(size, ShmemGetRequestedSize());
/* legacy subsystems */
- size = add_size(size, dsm_estimate_size());
- size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
- size = add_size(size, ProcGlobalShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, VarsupShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
size = add_size(size, CLOGShmemSize());
@@ -120,11 +111,7 @@ CalculateShmemSize(void)
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
- size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, SharedInvalShmemSize());
- size = add_size(size, PMSignalShmemSize());
- size = add_size(size, ProcSignalShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
size = add_size(size, ReplicationSlotsShmemSize());
@@ -277,13 +264,9 @@ RegisterBuiltinShmemCallbacks(void)
static void
CreateOrAttachShmemStructs(void)
{
- dsm_shmem_init();
- DSMRegistryShmemInit();
-
/*
* Set up xlog, clog, and buffers
*/
- VarsupShmemInit();
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
@@ -306,23 +289,13 @@ CreateOrAttachShmemStructs(void)
/*
* Set up process table
*/
- if (!IsUnderPostmaster)
- InitProcGlobal();
- ProcArrayShmemInit();
BackendStatusShmemInit();
TwoPhaseShmemInit();
BackgroundWorkerShmemInit();
- /*
- * Set up shared-inval messaging
- */
- SharedInvalShmemInit();
-
/*
* Set up interprocess signaling mechanisms
*/
- PMSignalShmemInit();
- ProcSignalShmemInit();
CheckpointerShmemInit();
AutoVacuumShmemInit();
ReplicationSlotsShmemInit();
diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 8537e9fef2d..7d4f4cf32bb 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -80,10 +80,10 @@ InitLatch(Latch *latch)
* current process.
*
* InitSharedLatch needs to be called in postmaster before forking child
- * processes, usually right after allocating the shared memory block
- * containing the latch with ShmemInitStruct. (The Unix implementation
- * doesn't actually require that, but the Windows one does.) Because of
- * this restriction, we have no concurrency issues to worry about here.
+ * processes, usually right after initializing the shared memory block
+ * containing the latch. (The Unix implementation doesn't actually require
+ * that, but the Windows one does.) Because of this restriction, we have no
+ * concurrency issues to worry about here.
*
* Note that other handles created in this module are never marked as
* inheritable. Thus we do not need to worry about cleaning up child
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 4618820b337..00588664885 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -27,6 +27,7 @@
#include "storage/ipc.h"
#include "storage/pmsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
@@ -83,6 +84,14 @@ struct PMSignalData
/* PMSignalState pointer is valid in both postmaster and child processes */
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+static void PMSignalShmemRequest(void *);
+static void PMSignalShmemInit(void *);
+
+const ShmemCallbacks PMSignalShmemCallbacks = {
+ .request_fn = PMSignalShmemRequest,
+ .init_fn = PMSignalShmemInit,
+};
+
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
* postmaster. Postmaster keeps a local copy so that it doesn't need to
@@ -123,39 +132,31 @@ postmaster_death_handler(SIGNAL_ARGS)
static void MarkPostmasterChildInactive(int code, Datum arg);
/*
- * PMSignalShmemSize
- * Compute space needed for pmsignal.c's shared memory
+ * PMSignalShmemRequest - Register pmsignal.c's shared memory needs
*/
-Size
-PMSignalShmemSize(void)
+static void
+PMSignalShmemRequest(void *arg)
{
- Size size;
-
- size = offsetof(PMSignalData, PMChildFlags);
- size = add_size(size, mul_size(MaxLivePostmasterChildren(),
- sizeof(sig_atomic_t)));
-
- return size;
+ static ShmemStructDesc PMSignalShmemDesc;
+ size_t size;
+
+ num_child_flags = MaxLivePostmasterChildren();
+
+ size = add_size(offsetof(PMSignalData, PMChildFlags),
+ mul_size(num_child_flags, sizeof(sig_atomic_t)));
+ ShmemRequestStruct(&PMSignalShmemDesc,
+ .name = "PMSignalState",
+ .size = size,
+ .ptr = (void **) &PMSignalState,
+ );
}
-/*
- * PMSignalShmemInit - initialize during shared-memory creation
- */
-void
-PMSignalShmemInit(void)
+static void
+PMSignalShmemInit(void *arg)
{
- bool found;
-
- PMSignalState = (PMSignalData *)
- ShmemInitStruct("PMSignalState", PMSignalShmemSize(), &found);
-
- if (!found)
- {
- /* initialize all flags to zeroes */
- MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemSize());
- num_child_flags = MaxLivePostmasterChildren();
- PMSignalState->num_child_flags = num_child_flags;
- }
+ Assert(PMSignalState);
+ Assert(num_child_flags > 0);
+ PMSignalState->num_child_flags = num_child_flags;
}
/*
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index cc207cb56e3..c8e09f4f0b6 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -61,6 +61,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
@@ -103,6 +104,20 @@ typedef struct ProcArrayStruct
int pgprocnos[FLEXIBLE_ARRAY_MEMBER];
} ProcArrayStruct;
+static void ProcArrayShmemRequest(void *arg);
+static void ProcArrayShmemInit(void *arg);
+static void ProcArrayShmemAttach(void *arg);
+
+static ProcArrayStruct *procArray;
+
+const struct ShmemCallbacks ProcArrayShmemCallbacks = {
+ .request_fn = ProcArrayShmemRequest,
+ .init_fn = ProcArrayShmemInit,
+ .attach_fn = ProcArrayShmemAttach,
+};
+
+static ShmemStructDesc ProcArrayShmemDesc;
+
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -269,9 +284,6 @@ typedef enum KAXCompressReason
KAX_STARTUP_PROCESS_IDLE, /* startup process is about to sleep */
} KAXCompressReason;
-
-static ProcArrayStruct *procArray;
-
static PGPROC *allProcs;
/*
@@ -282,8 +294,15 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
+
static TransactionId *KnownAssignedXids;
+
+static ShmemStructDesc KnownAssignedXidsShmemDesc;
+
static bool *KnownAssignedXidsValid;
+
+static ShmemStructDesc KnownAssignedXidsValidShmemDesc;
+
static TransactionId latestObservedXid = InvalidTransactionId;
/*
@@ -374,19 +393,13 @@ static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
static void GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons);
/*
- * Report shared-memory space needed by ProcArrayShmemInit
+ * Register the shared PGPROC array during postmaster startup.
*/
-Size
-ProcArrayShmemSize(void)
+static void
+ProcArrayShmemRequest(void *arg)
{
- Size size;
-
- /* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, pgprocnos);
- size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
-
/*
* During Hot Standby processing we have a data structure called
* KnownAssignedXids, created in shared memory. Local data structures are
@@ -405,64 +418,52 @@ ProcArrayShmemSize(void)
if (EnableHotStandby)
{
- size = add_size(size,
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS));
- size = add_size(size,
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS));
+ ShmemRequestStruct(&KnownAssignedXidsShmemDesc,
+ .name = "KnownAssignedXids",
+ .size = mul_size(sizeof(TransactionId), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXids,
+ );
+
+ ShmemRequestStruct(&KnownAssignedXidsValidShmemDesc,
+ .name = "KnownAssignedXidsValid",
+ .size = mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXidsValid,
+ );
}
- return size;
+ /* Register the ProcArray shared structure */
+ ShmemRequestStruct(&ProcArrayShmemDesc,
+ .name = "Proc Array",
+ .size = add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int), PROCARRAY_MAXPROCS)),
+ .ptr = (void **) &procArray,
+ );
}
/*
* Initialize the shared PGPROC array during postmaster startup.
*/
-void
-ProcArrayShmemInit(void)
+static void
+ProcArrayShmemInit(void *arg)
{
- bool found;
-
- /* Create or attach to the ProcArray shared structure */
- procArray = (ProcArrayStruct *)
- ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, pgprocnos),
- mul_size(sizeof(int),
- PROCARRAY_MAXPROCS)),
- &found);
-
- if (!found)
- {
- /*
- * We're the first - initialize.
- */
- procArray->numProcs = 0;
- procArray->maxProcs = PROCARRAY_MAXPROCS;
- procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
- procArray->numKnownAssignedXids = 0;
- procArray->tailKnownAssignedXids = 0;
- procArray->headKnownAssignedXids = 0;
- procArray->lastOverflowedXid = InvalidTransactionId;
- procArray->replication_slot_xmin = InvalidTransactionId;
- procArray->replication_slot_catalog_xmin = InvalidTransactionId;
- TransamVariables->xactCompletionCount = 1;
- }
+ procArray->numProcs = 0;
+ procArray->maxProcs = PROCARRAY_MAXPROCS;
+ procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
+ procArray->numKnownAssignedXids = 0;
+ procArray->tailKnownAssignedXids = 0;
+ procArray->headKnownAssignedXids = 0;
+ procArray->lastOverflowedXid = InvalidTransactionId;
+ procArray->replication_slot_xmin = InvalidTransactionId;
+ procArray->replication_slot_catalog_xmin = InvalidTransactionId;
+ TransamVariables->xactCompletionCount = 1;
allProcs = ProcGlobal->allProcs;
+}
- /* Create or attach to the KnownAssignedXids arrays too, if needed */
- if (EnableHotStandby)
- {
- KnownAssignedXids = (TransactionId *)
- ShmemInitStruct("KnownAssignedXids",
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- KnownAssignedXidsValid = (bool *)
- ShmemInitStruct("KnownAssignedXidsValid",
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- }
+static void
+ProcArrayShmemAttach(void *arg)
+{
+ allProcs = ProcGlobal->allProcs;
}
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index f1ab3aa3fe0..dc2e325076a 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -33,6 +33,7 @@
#include "storage/shmem.h"
#include "storage/sinval.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -106,7 +107,16 @@ struct ProcSignalHeader
#define BARRIER_CLEAR_BIT(flags, type) \
((flags) &= ~(((uint32) 1) << (uint32) (type)))
+static void ProcSignalShmemRequest(void *arg);
+static void ProcSignalShmemInit(void *arg);
+
+const ShmemCallbacks ProcSignalShmemCallbacks = {
+ .request_fn = ProcSignalShmemRequest,
+ .init_fn = ProcSignalShmemInit,
+};
+
NON_EXEC_STATIC ProcSignalHeader *ProcSignal = NULL;
+
static ProcSignalSlot *MyProcSignalSlot = NULL;
static bool CheckProcSignal(ProcSignalReason reason);
@@ -114,51 +124,41 @@ static void CleanupProcSignalState(int status, Datum arg);
static void ResetProcSignalBarrierBits(uint32 flags);
/*
- * ProcSignalShmemSize
- * Compute space needed for ProcSignal's shared memory
+ * ProcSignalShmemRequest
+ * Register ProcSignal's shared memory needs at postmaster startup
*/
-Size
-ProcSignalShmemSize(void)
+static void
+ProcSignalShmemRequest(void *arg)
{
+ static ShmemStructDesc ProcSignalShmemDesc;
Size size;
size = mul_size(NumProcSignalSlots, sizeof(ProcSignalSlot));
size = add_size(size, offsetof(ProcSignalHeader, psh_slot));
- return size;
+
+ ShmemRequestStruct(&ProcSignalShmemDesc,
+ .name = "ProcSignal",
+ .size = size,
+ .ptr = (void **) &ProcSignal,
+ );
}
-/*
- * ProcSignalShmemInit
- * Allocate and initialize ProcSignal's shared memory
- */
-void
-ProcSignalShmemInit(void)
+static void
+ProcSignalShmemInit(void *arg)
{
- Size size = ProcSignalShmemSize();
- bool found;
+ pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
- ProcSignal = (ProcSignalHeader *)
- ShmemInitStruct("ProcSignal", size, &found);
-
- /* If we're first, initialize. */
- if (!found)
+ for (int i = 0; i < NumProcSignalSlots; ++i)
{
- int i;
-
- pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
+ ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
- for (i = 0; i < NumProcSignalSlots; ++i)
- {
- ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
-
- SpinLockInit(&slot->pss_mutex);
- pg_atomic_init_u32(&slot->pss_pid, 0);
- slot->pss_cancel_key_len = 0;
- MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
- pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
- pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
- ConditionVariableInit(&slot->pss_barrierCV);
- }
+ SpinLockInit(&slot->pss_mutex);
+ pg_atomic_init_u32(&slot->pss_pid, 0);
+ slot->pss_cancel_key_len = 0;
+ MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
+ pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
+ pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
+ ConditionVariableInit(&slot->pss_barrierCV);
}
}
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index a7a7cc4f0a9..34860d474bc 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -25,6 +25,7 @@
#include "storage/shmem.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
/*
* Conceptually, the shared cache invalidation messages are stored in an
@@ -205,6 +206,14 @@ typedef struct SISeg
static SISeg *shmInvalBuffer; /* pointer to the shared inval buffer */
+static void SharedInvalShmemRequest(void *arg);
+static void SharedInvalShmemInit(void *arg);
+
+const ShmemCallbacks SharedInvalShmemCallbacks = {
+ .request_fn = SharedInvalShmemRequest,
+ .init_fn = SharedInvalShmemInit,
+};
+
static LocalTransactionId nextLocalTransactionId;
@@ -212,37 +221,32 @@ static void CleanupInvalidationState(int status, Datum arg);
/*
- * SharedInvalShmemSize --- return shared-memory space needed
+ * SharedInvalShmemRequest
+ * Register shared memory needs for the SI message buffer
*/
-Size
-SharedInvalShmemSize(void)
+static void
+SharedInvalShmemRequest(void *arg)
{
+ static ShmemStructDesc SharedInvalShmemDesc;
Size size;
size = offsetof(SISeg, procState);
size = add_size(size, mul_size(sizeof(ProcState), NumProcStateSlots)); /* procState */
size = add_size(size, mul_size(sizeof(int), NumProcStateSlots)); /* pgprocnos */
- return size;
+ ShmemRequestStruct(&SharedInvalShmemDesc,
+ .name = "shmInvalBuffer",
+ .size = size,
+ .ptr = (void **) &shmInvalBuffer,
+ );
}
-/*
- * SharedInvalShmemInit
- * Create and initialize the SI message buffer
- */
-void
-SharedInvalShmemInit(void)
+static void
+SharedInvalShmemInit(void *arg)
{
int i;
- bool found;
-
- /* Allocate space in shared memory */
- shmInvalBuffer = (SISeg *)
- ShmemInitStruct("shmInvalBuffer", SharedInvalShmemSize(), &found);
- if (found)
- return;
- /* Clear message counters, save size of procState array, init spinlock */
+ /* Clear message counters, init spinlock */
shmInvalBuffer->minMsgNum = 0;
shmInvalBuffer->maxMsgNum = 0;
shmInvalBuffer->nextThreshold = CLEANUP_MIN;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 9b880a6af65..b6e4515475c 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -52,6 +52,7 @@
#include "storage/procsignal.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
@@ -70,9 +71,25 @@ PGPROC *MyProc = NULL;
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
+static void *tmpAllProcs;
+static void *tmpFastPathLockArray;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
PGPROC *PreparedXactProcs = NULL;
+static void ProcGlobalShmemRequest(void *arg);
+static void ProcGlobalShmemInit(void *arg);
+
+const ShmemCallbacks ProcGlobalShmemCallbacks = {
+ .request_fn = ProcGlobalShmemRequest,
+ .init_fn = ProcGlobalShmemInit,
+};
+
+static ShmemStructDesc ProcGlobalShmemDesc;
+static ShmemStructDesc ProcGlobalAllProcsShmemDesc;
+static ShmemStructDesc FastPathLockArrayShmemDesc;
+
+static uint32 TotalProcs;
+
/* Is a deadlock check pending? */
static volatile sig_atomic_t got_deadlock_timeout;
@@ -82,24 +99,6 @@ static void AuxiliaryProcKill(int code, Datum arg);
static DeadLockState CheckDeadLock(void);
-/*
- * Report shared-memory space needed by PGPROC.
- */
-static Size
-PGProcShmemSize(void)
-{
- Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
-
- size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
-
- return size;
-}
-
/*
* Report shared-memory space needed by Fast-Path locks.
*/
@@ -107,8 +106,6 @@ static Size
FastPathLockShmemSize(void)
{
Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
Size fpLockBitsSize,
fpRelIdSize;
@@ -128,26 +125,7 @@ FastPathLockShmemSize(void)
}
/*
- * Report shared-memory space needed by InitProcGlobal.
- */
-Size
-ProcGlobalShmemSize(void)
-{
- Size size = 0;
-
- /* ProcGlobal */
- size = add_size(size, sizeof(PROC_HDR));
- size = add_size(size, sizeof(slock_t));
-
- size = add_size(size, PGSemaphoreShmemSize(ProcGlobalSemas()));
- size = add_size(size, PGProcShmemSize());
- size = add_size(size, FastPathLockShmemSize());
-
- return size;
-}
-
-/*
- * Report number of semaphores needed by InitProcGlobal.
+ * Report number of semaphores needed by ProcGlobalShmemInit.
*/
int
ProcGlobalSemas(void)
@@ -160,7 +138,66 @@ ProcGlobalSemas(void)
}
/*
- * InitProcGlobal -
+ * ProcGlobalShmemRequest
+ * Register shared memory needs.
+ *
+ * This is called during postmaster or standalone backend startup, and also
+ * during backend startup in EXEC_BACKEND mode.
+ */
+static void
+ProcGlobalShmemRequest(void *arg)
+{
+ Size size;
+
+ /*
+ * Reserve all the PGPROC structures we'll need. There are six separate
+ * consumers: (1) normal backends, (2) autovacuum workers and special
+ * workers, (3) background workers, (4) walsenders, (5) auxiliary
+ * processes, and (6) prepared transactions. (For largely-historical
+ * reasons, we combine autovacuum and special workers into one category
+ * with a single freelist.) Each PGPROC structure is dedicated to exactly
+ * one of these purposes, and they do not move between groups.
+ */
+ TotalProcs =
+ add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
+
+ size = 0;
+ size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
+ ShmemRequestStruct(&ProcGlobalAllProcsShmemDesc,
+ .name = "PGPROC structures",
+ .size = size,
+ .ptr = (void **) &tmpAllProcs,
+ );
+
+ ShmemRequestStruct(&FastPathLockArrayShmemDesc,
+ .name = "Fast-Path Lock Array",
+ .size = IsUnderPostmaster ? SHMEM_ATTACH_UNKNOWN_SIZE : FastPathLockShmemSize(),
+ .ptr = (void **) &tmpFastPathLockArray,
+ );
+
+ ShmemRequestStruct(&ProcGlobalShmemDesc,
+ .name = "Proc Header",
+ .size = sizeof(PROC_HDR),
+
+ /*
+ * ProcGlobal is registered here in .ptr as usual, but it needs to be
+ * propagated specially in EXEC_BACKEND mode, because ProcGlobal needs to
+ * be accessed early at backend startup, before ShmemAttachRequested() has
+ * been called.
+ */
+ .ptr = (void **) &ProcGlobal,
+ );
+
+ /* Let the semaphore implementation register its shared memory needs */
+ PGSemaphoreShmemRequest(ProcGlobalSemas());
+}
+
+
+/*
+ * ProcGlobalShmemInit -
* Initialize the global process table during postmaster or standalone
* backend startup.
*
@@ -179,36 +216,23 @@ ProcGlobalSemas(void)
* Another reason for creating semaphores here is that the semaphore
* implementation typically requires us to create semaphores in the
* postmaster, not in backends.
- *
- * Note: this is NOT called by individual backends under a postmaster,
- * not even in the EXEC_BACKEND case. The ProcGlobal and AuxiliaryProcs
- * pointers must be propagated specially for EXEC_BACKEND operation.
*/
-void
-InitProcGlobal(void)
+static void
+ProcGlobalShmemInit(void *arg)
{
+ char *ptr;
+ size_t requestSize;
PGPROC *procs;
int i,
j;
- bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
/* Used for setup of per-backend fast-path slots. */
char *fpPtr,
*fpEndPtr PG_USED_FOR_ASSERTS_ONLY;
Size fpLockBitsSize,
fpRelIdSize;
- Size requestSize;
- char *ptr;
- /* Create the ProcGlobal shared structure */
- ProcGlobal = (PROC_HDR *)
- ShmemInitStruct("Proc Header", sizeof(PROC_HDR), &found);
- Assert(!found);
-
- /*
- * Initialize the data structures.
- */
+ Assert(ProcGlobal);
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
SpinLockInit(&ProcGlobal->freeProcsLock);
dlist_init(&ProcGlobal->freeProcs);
@@ -221,23 +245,12 @@ InitProcGlobal(void)
pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PROC_NUMBER);
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
- /*
- * Create and initialize all the PGPROC structures we'll need. There are
- * six separate consumers: (1) normal backends, (2) autovacuum workers and
- * special workers, (3) background workers, (4) walsenders, (5) auxiliary
- * processes, and (6) prepared transactions. (For largely-historical
- * reasons, we combine autovacuum and special workers into one category
- * with a single freelist.) Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
- */
- requestSize = PGProcShmemSize();
-
- ptr = ShmemInitStruct("PGPROC structures",
- requestSize,
- &found);
-
+ Assert(tmpAllProcs);
+ ptr = tmpAllProcs;
+ requestSize = ProcGlobalAllProcsShmemDesc.size;
MemSet(ptr, 0, requestSize);
+ /* Carve out the allProcs array from the shared memory area */
procs = (PGPROC *) ptr;
ptr = ptr + TotalProcs * sizeof(PGPROC);
@@ -246,7 +259,7 @@ InitProcGlobal(void)
ProcGlobal->allProcCount = MaxBackends + NUM_AUXILIARY_PROCS;
/*
- * Allocate arrays mirroring PGPROC fields in a dense manner. See
+ * Carve out arrays mirroring PGPROC fields in a dense manner. See
* PROC_HDR.
*
* XXX: It might make sense to increase padding for these arrays, given
@@ -261,30 +274,27 @@ InitProcGlobal(void)
ProcGlobal->statusFlags = (uint8 *) ptr;
ptr = ptr + (TotalProcs * sizeof(*ProcGlobal->statusFlags));
- /* make sure wer didn't overflow */
+ /* make sure we didn't overflow */
Assert((ptr > (char *) procs) && (ptr <= (char *) procs + requestSize));
/*
- * Allocate arrays for fast-path locks. Those are variable-length, so
+ * Initialize arrays for fast-path locks. Those are variable-length, so
* can't be included in PGPROC directly. We allocate a separate piece of
* shared memory and then divide that between backends.
*/
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- requestSize = FastPathLockShmemSize();
-
- fpPtr = ShmemInitStruct("Fast-Path Lock Array",
- requestSize,
- &found);
-
- MemSet(fpPtr, 0, requestSize);
+ Assert(tmpFastPathLockArray);
+ fpPtr = tmpFastPathLockArray;
+ requestSize = FastPathLockArrayShmemDesc.size;
+ memset(fpPtr, 0, requestSize);
/* For asserts checking we did not overflow. */
fpEndPtr = fpPtr + requestSize;
- /* Reserve space for semaphores. */
- PGReserveSemaphores(ProcGlobalSemas());
+ /* Initialize semaphores */
+ PGSemaphoreInit(ProcGlobalSemas());
for (i = 0; i < TotalProcs; i++)
{
@@ -405,7 +415,7 @@ InitProcess(void)
/*
* Decide which list should supply our PGPROC. This logic must match the
- * way the freelists were constructed in InitProcGlobal().
+ * way the freelists were constructed in ProcGlobalShmemInit().
*/
if (AmAutoVacuumWorkerProcess() || AmSpecialWorkerProcess())
procgloballist = &ProcGlobal->autovacFreeProcs;
@@ -460,7 +470,7 @@ InitProcess(void)
/*
* Initialize all fields of MyProc, except for those previously
- * initialized by InitProcGlobal.
+ * initialized by ProcGlobalShmemInit.
*/
dlist_node_init(&MyProc->freeProcsLink);
MyProc->waitStatus = PROC_WAIT_STATUS_OK;
@@ -593,7 +603,7 @@ InitProcessPhase2(void)
* This is called by bgwriter and similar processes so that they will have a
* MyProc value that's real enough to let them wait for LWLocks. The PGPROC
* and sema that are assigned are one of the extra ones created during
- * InitProcGlobal.
+ * ProcGlobalShmemInit.
*
* Auxiliary processes are presently not expected to wait for real (lockmgr)
* locks, so we need not set up the deadlock checker. They are never added
@@ -662,7 +672,7 @@ InitAuxiliaryProcess(void)
/*
* Initialize all fields of MyProc, except for those previously
- * initialized by InitProcGlobal.
+ * initialized by ProcGlobalShmemInit.
*/
dlist_node_init(&MyProc->freeProcsLink);
MyProc->waitStatus = PROC_WAIT_STATUS_OK;
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index d49a7a92c64..81199edca86 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -338,7 +338,8 @@ string_compare(const char *key1, const char *key2, Size keysize)
* under info->hcxt rather than under TopMemoryContext; the default
* behavior is only suitable for session-lifespan hash tables.
* Other flags bits are special-purpose and seldom used, except for those
- * associated with shared-memory hash tables, for which see ShmemInitHash().
+ * associated with shared-memory hash tables, for which see
+ * ShmemRequestHash().
*
* Fields in *info are read only when the associated flags bit is set.
* It is not necessary to initialize other fields of *info.
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6fa91bfcdc0..55a4ab26b34 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -345,8 +345,6 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
-extern Size VarsupShmemSize(void);
-extern void VarsupShmemInit(void);
extern FullTransactionId GetNewTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
diff --git a/src/include/storage/dsm.h b/src/include/storage/dsm.h
index 407657df3ff..1bde71b4406 100644
--- a/src/include/storage/dsm.h
+++ b/src/include/storage/dsm.h
@@ -26,9 +26,6 @@ extern void dsm_postmaster_startup(PGShmemHeader *);
extern void dsm_backend_shutdown(void);
extern void dsm_detach_all(void);
-extern size_t dsm_estimate_size(void);
-extern void dsm_shmem_init(void);
-
#ifdef EXEC_BACKEND
extern void dsm_set_control_handle(dsm_handle h);
#endif
diff --git a/src/include/storage/dsm_registry.h b/src/include/storage/dsm_registry.h
index 506fae2c9ca..a2269c89f01 100644
--- a/src/include/storage/dsm_registry.h
+++ b/src/include/storage/dsm_registry.h
@@ -22,7 +22,5 @@ extern dsa_area *GetNamedDSA(const char *name, bool *found);
extern dshash_table *GetNamedDSHash(const char *name,
const dshash_parameters *params,
bool *found);
-extern Size DSMRegistryShmemSize(void);
-extern void DSMRegistryShmemInit(void);
#endif /* DSM_REGISTRY_H */
diff --git a/src/include/storage/pg_sema.h b/src/include/storage/pg_sema.h
index 66facc6907a..fe50ee505ba 100644
--- a/src/include/storage/pg_sema.h
+++ b/src/include/storage/pg_sema.h
@@ -37,11 +37,11 @@ typedef HANDLE PGSemaphore;
#endif
-/* Report amount of shared memory needed */
-extern Size PGSemaphoreShmemSize(int maxSemas);
+/* Request shared memory needed for semaphores */
+extern void PGSemaphoreShmemRequest(int maxSemas);
/* Module initialization (called during postmaster start or shmem reinit) */
-extern void PGReserveSemaphores(int maxSemas);
+extern void PGSemaphoreInit(int maxSemas);
/* Allocate a PGSemaphore structure with initial count 1 */
extern PGSemaphore PGSemaphoreCreate(void);
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index 206fb78f8a5..001e6eea61c 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -66,8 +66,6 @@ extern PGDLLIMPORT volatile PMSignalData *PMSignalState;
/*
* prototypes for functions in pmsignal.c
*/
-extern Size PMSignalShmemSize(void);
-extern void PMSignalShmemInit(void);
extern void SendPostmasterSignal(PMSignalReason reason);
extern bool CheckPostmasterSignal(PMSignalReason reason);
extern void SetQuitSignalReason(QuitSignalReason reason);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 1dad125706e..60732ccb33a 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -551,8 +551,6 @@ extern PGDLLIMPORT PGPROC *AuxiliaryProcs;
* Function Prototypes
*/
extern int ProcGlobalSemas(void);
-extern Size ProcGlobalShmemSize(void);
-extern void InitProcGlobal(void);
extern void InitProcess(void);
extern void InitProcessPhase2(void);
extern void InitAuxiliaryProcess(void);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index abdf021e66e..d718a5b542f 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -19,8 +19,6 @@
#include "utils/snapshot.h"
-extern Size ProcArrayShmemSize(void);
-extern void ProcArrayShmemInit(void);
extern void ProcArrayAdd(PGPROC *proc);
extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index cc4f26aa33d..7f855971b5a 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -67,9 +67,6 @@ typedef enum
/*
* prototypes for functions in procsignal.c
*/
-extern Size ProcSignalShmemSize(void);
-extern void ProcSignalShmemInit(void);
-
extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
ProcNumber procNumber);
diff --git a/src/include/storage/sinvaladt.h b/src/include/storage/sinvaladt.h
index 122dbcdf19f..208ea9d051e 100644
--- a/src/include/storage/sinvaladt.h
+++ b/src/include/storage/sinvaladt.h
@@ -27,8 +27,6 @@
/*
* prototypes for functions in sinvaladt.c
*/
-extern Size SharedInvalShmemSize(void);
-extern void SharedInvalShmemInit(void);
extern void SharedInvalBackendInit(bool sendOnly);
extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index f0cf01f5a85..d62c29f1361 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -27,4 +27,19 @@
*/
PG_SHMEM_SUBSYSTEM(LWLockCallbacks)
-/* TODO: nothing else for now */
+PG_SHMEM_SUBSYSTEM(dsm_shmem_callbacks)
+PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
+
+/* xlog, clog, and buffers */
+PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+
+/* process table */
+PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+
+/* shared-inval messaging */
+PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
+
+/* interprocess signaling mechanisms */
+PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
--
2.47.3
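
For readers skimming the patch above, here is a minimal sketch of the two-phase pattern it moves these subsystems to: a request callback declares a named area, its size and the pointer to fill in, and a separate init callback runs once the memory exists. Everything in it (ToyDesc, ToyCallbacks, toy_request_struct, toy_startup, the 16-byte alignment, and calloc standing in for the shared memory segment) is hypothetical and chosen for illustration only. It is not the patch's ShmemStructDesc, ShmemCallbacks or ShmemRequestStruct, whose real definitions are not shown in this part of the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a struct descriptor; not the patch's type. */
typedef struct ToyDesc
{
    const char *name;   /* label for the requested area */
    size_t      size;   /* number of bytes requested */
    void      **ptr;    /* where the assigned address gets stored */
} ToyDesc;

/* Hypothetical stand-in for a per-subsystem callback table. */
typedef struct ToyCallbacks
{
    void (*request_fn)(void);   /* phase 1: declare memory needs */
    void (*init_fn)(void);      /* phase 2: initialize the assigned memory */
} ToyCallbacks;

#define TOY_MAX_DESCS 8
#define TOY_ALIGN(sz) (((sz) + (size_t) 15) & ~(size_t) 15)

static ToyDesc toy_descs[TOY_MAX_DESCS];
static int     toy_ndescs = 0;

/* Record a request; nothing is allocated yet. */
static void
toy_request_struct(const char *name, size_t size, void **ptr)
{
    assert(toy_ndescs < TOY_MAX_DESCS);
    toy_descs[toy_ndescs].name = name;
    toy_descs[toy_ndescs].size = size;
    toy_descs[toy_ndescs].ptr = ptr;
    toy_ndescs++;
}

/* An example subsystem with one struct in the (pretend) shared area. */
typedef struct ToyCounter
{
    int numProcs;
    int maxProcs;
} ToyCounter;

static ToyCounter *toyCounter;

static void
toy_counter_request(void)
{
    toy_request_struct("Toy Counter", sizeof(ToyCounter),
                       (void **) &toyCounter);
}

static void
toy_counter_init(void)
{
    /* The pointer was filled in by toy_startup() before this runs. */
    toyCounter->numProcs = 0;
    toyCounter->maxProcs = 64;
}

static const ToyCallbacks toy_counter_callbacks = {
    .request_fn = toy_counter_request,
    .init_fn = toy_counter_init,
};

/*
 * Run every request callback, allocate one block covering all requests,
 * hand out addresses, then run every init callback.  Returns the block
 * (to be freed by the caller), or NULL if allocation failed.
 */
static void *
toy_startup(const ToyCallbacks *const *subsystems, int nsubsystems)
{
    size_t  total = 0;
    size_t  offset = 0;
    char   *base;
    int     i;

    for (i = 0; i < nsubsystems; i++)
        subsystems[i]->request_fn();

    for (i = 0; i < toy_ndescs; i++)
        total += TOY_ALIGN(toy_descs[i].size);

    base = calloc(1, total > 0 ? total : 1);
    if (base == NULL)
        return NULL;

    for (i = 0; i < toy_ndescs; i++)
    {
        *toy_descs[i].ptr = base + offset;
        offset += TOY_ALIGN(toy_descs[i].size);
    }

    for (i = 0; i < nsubsystems; i++)
        if (subsystems[i]->init_fn != NULL)
            subsystems[i]->init_fn();

    return base;
}
```

Calling toy_startup() with an array holding &toy_counter_callbacks leaves toyCounter pointing into the returned block with maxProcs set to 64. The point of the split is that an init function no longer needs a "found" flag: sizing and naming happen in one place, and initialization only runs once the memory is known to exist.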
[text/x-patch] 0008-refactor-predicate.c-inline-SerialInit-to-the-caller.patch (3.6K, 9-0008-refactor-predicate.c-inline-SerialInit-to-the-caller.patch)
From 21730d27c45ea9720a03e56a8f5a60f4244e4a85 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 19 Mar 2026 17:21:30 +0200
Subject: [PATCH 08/14] refactor predicate.c: inline SerialInit to the caller
PredicateLockShmemInit() is currently very complicated. These
refactorings restructure it so that it fits more naturally with the
new shmem callbacks.
---
src/backend/storage/lmgr/predicate.c | 73 +++++++++++-----------------
1 file changed, 29 insertions(+), 44 deletions(-)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index e003fa5b107..13a6a4b93a6 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -444,7 +444,6 @@ static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
static int serial_errdetail_for_io_error(const void *opaque_data);
-static void SerialInit(void);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
@@ -809,48 +808,6 @@ SerialPagePrecedesLogicallyUnitTests(void)
}
#endif
-/*
- * Initialize for the tracking of old serializable committed xids.
- */
-static void
-SerialInit(void)
-{
- bool found;
-
- /*
- * Set up SLRU management of the pg_serial data.
- */
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
-#ifdef USE_ASSERT_CHECKING
- SerialPagePrecedesLogicallyUnitTests();
-#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
-
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
-
- Assert(found == IsUnderPostmaster);
- if (!found)
- {
- /*
- * Set control information to reflect empty SLRU.
- */
- LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
- serialControl->headPage = -1;
- serialControl->headXid = InvalidTransactionId;
- serialControl->tailXid = InvalidTransactionId;
- LWLockRelease(SerialControlLock);
- }
-}
-
/*
* GUC check_hook for serializable_buffers
*/
@@ -1355,7 +1312,35 @@ PredicateLockShmemInit(void)
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialInit();
+ SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
+ SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
+ SimpleLruInit(SerialSlruCtl, "serializable",
+ serializable_buffers, 0, "pg_serial",
+ LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
+ SYNC_HANDLER_NONE, false);
+#ifdef USE_ASSERT_CHECKING
+ SerialPagePrecedesLogicallyUnitTests();
+#endif
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
+
+ /*
+ * Create or attach to the SerialControl structure.
+ */
+ serialControl = (SerialControl)
+ ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
+
+ Assert(found == IsUnderPostmaster);
+ if (!found)
+ {
+ /*
+ * Set control information to reflect empty SLRU.
+ */
+ LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
+ serialControl->headPage = -1;
+ serialControl->headXid = InvalidTransactionId;
+ serialControl->tailXid = InvalidTransactionId;
+ LWLockRelease(SerialControlLock);
+ }
}
/*
--
2.47.3
[text/x-patch] 0009-refactor-predicate.c-Move-all-the-initialization-tog.patch (8.3K, 10-0009-refactor-predicate.c-Move-all-the-initialization-tog.patch)
From c7b2a67a976ccb4cf3b3cb38e0e171b78b40f0f5 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 20 Mar 2026 20:27:50 +0200
Subject: [PATCH 09/14] refactor predicate.c: Move all the initialization
together
PredicateLockShmemInit() is currently very complicated. These
refactorings restructure it so that it fits more naturally with the
new shmem callbacks.
---
src/backend/storage/lmgr/predicate.c | 164 +++++++++++++--------------
1 file changed, 79 insertions(+), 85 deletions(-)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 13a6a4b93a6..af03071a71f 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1144,19 +1144,6 @@ PredicateLockShmemInit(void)
HASH_ELEM | HASH_BLOBS |
HASH_PARTITION | HASH_FIXED_SIZE);
- /*
- * Reserve a dummy entry in the hash table; we use it to make sure there's
- * always one entry available when we need to split or combine a page,
- * because running out of space there could mean aborting a
- * non-serializable transaction.
- */
- if (!IsUnderPostmaster)
- {
- (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
- HASH_ENTER, &found);
- Assert(!found);
- }
-
/* Pre-calculate the hash and partition lock of the scratch entry */
ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
@@ -1200,49 +1187,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, both the header and the element */
- memset(PredXact, 0, requestSize);
-
- dlist_init(&PredXact->availableList);
- dlist_init(&PredXact->activeList);
- PredXact->SxactGlobalXmin = InvalidTransactionId;
- PredXact->SxactGlobalXminCount = 0;
- PredXact->WritableSxactCount = 0;
- PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
- PredXact->CanPartialClearThrough = 0;
- PredXact->HavePartialClearedThrough = 0;
- PredXact->element
- = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_serializable_xacts; i++)
- {
- LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
- LWTRANCHE_PER_XACT_PREDICATE_LIST);
- dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
- }
- PredXact->OldCommittedSxact = CreatePredXact();
- SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
- PredXact->OldCommittedSxact->prepareSeqNo = 0;
- PredXact->OldCommittedSxact->commitSeqNo = 0;
- PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
- dlist_init(&PredXact->OldCommittedSxact->outConflicts);
- dlist_init(&PredXact->OldCommittedSxact->inConflicts);
- dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
- dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
- dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
- PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
- PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
- PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
- PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
- PredXact->OldCommittedSxact->pid = 0;
- PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
- }
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
/*
* Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
@@ -1278,23 +1222,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, including the elements */
- memset(RWConflictPool, 0, requestSize);
-
- dlist_init(&RWConflictPool->availableList);
- RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
- RWConflictPoolHeaderDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_rw_conflicts; i++)
- {
- dlist_push_tail(&RWConflictPool->availableList,
- &RWConflictPool->element[i].outLink);
- }
- }
/*
* Create or attach to the header for the list of finished serializable
@@ -1305,8 +1232,6 @@ PredicateLockShmemInit(void)
sizeof(dlist_head),
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- dlist_init(FinishedSerializableTransactions);
/*
* Initialize the SLRU storage for old committed serializable
@@ -1328,19 +1253,88 @@ PredicateLockShmemInit(void)
*/
serialControl = (SerialControl)
ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
-
Assert(found == IsUnderPostmaster);
- if (!found)
+
+ /*
+ * If we just attached to existing shared memory (EXEC_BACKEND), we're all
+ * done. Otherwise, during postmaster startup proceed to initialize the
+ * shared memory.
+ */
+ if (IsUnderPostmaster)
{
- /*
- * Set control information to reflect empty SLRU.
- */
- LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
- serialControl->headPage = -1;
- serialControl->headXid = InvalidTransactionId;
- serialControl->tailXid = InvalidTransactionId;
- LWLockRelease(SerialControlLock);
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+ return;
+ }
+
+ /*
+ * Reserve a dummy entry in the hash table; we use it to make sure there's
+ * always one entry available when we need to split or combine a page,
+ * because running out of space there could mean aborting a
+ * non-serializable transaction.
+ */
+ (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
+ HASH_ENTER, &found);
+ Assert(!found);
+
+ /* Initialize PredXact list */
+ dlist_init(&PredXact->availableList);
+ dlist_init(&PredXact->activeList);
+ PredXact->SxactGlobalXmin = InvalidTransactionId;
+ PredXact->SxactGlobalXminCount = 0;
+ PredXact->WritableSxactCount = 0;
+ PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
+ PredXact->CanPartialClearThrough = 0;
+ PredXact->HavePartialClearedThrough = 0;
+ PredXact->element
+ = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_serializable_xacts; i++)
+ {
+ LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
+ LWTRANCHE_PER_XACT_PREDICATE_LIST);
+ dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
}
+ PredXact->OldCommittedSxact = CreatePredXact();
+ SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
+ PredXact->OldCommittedSxact->prepareSeqNo = 0;
+ PredXact->OldCommittedSxact->commitSeqNo = 0;
+ PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
+ dlist_init(&PredXact->OldCommittedSxact->outConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->inConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
+ dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
+ dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
+ PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
+ PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
+ PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
+ PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
+ PredXact->OldCommittedSxact->pid = 0;
+ PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
+
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+
+ /* Initialize the rw-conflict pool */
+ dlist_init(&RWConflictPool->availableList);
+ RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
+ RWConflictPoolHeaderDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_rw_conflicts; i++)
+ {
+ dlist_push_tail(&RWConflictPool->availableList,
+ &RWConflictPool->element[i].outLink);
+ }
+
+ /* Initialize the list of finished serializable transactions */
+ dlist_init(FinishedSerializableTransactions);
+
+ /* Initialize SerialControl to reflect empty SLRU. */
+ LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
+ serialControl->headPage = -1;
+ serialControl->headXid = InvalidTransactionId;
+ serialControl->tailXid = InvalidTransactionId;
+ LWLockRelease(SerialControlLock);
}
/*
--
2.47.3
[text/x-patch] 0010-Convert-SLRUs-to-use-the-new-interface.patch (84.7K, 11-0010-Convert-SLRUs-to-use-the-new-interface.patch)
From f7e9203d364543b151ed5a69d8d84776e18efdc1 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:32:45 +0300
Subject: [PATCH 10/14] Convert SLRUs to use the new interface
I replaced the old SimpleLruInit() function without a backwards
compatibility wrapper, because few extensions define their own SLRUs.
---
 src/backend/access/transam/clog.c        |  55 ++--
 src/backend/access/transam/commit_ts.c   |  92 +++---
 src/backend/access/transam/multixact.c   | 140 +++++----
 src/backend/access/transam/slru.c        | 364 ++++++++++++-----------
 src/backend/access/transam/subtrans.c    |  57 ++--
 src/backend/commands/async.c             | 117 ++++----
 src/backend/storage/ipc/ipci.c           |  16 -
 src/backend/storage/ipc/shmem.c          |   7 +
 src/backend/storage/lmgr/predicate.c     | 288 +++++++++---------
 src/backend/utils/activity/pgstat_slru.c |   1 +
 src/include/access/clog.h                |   2 -
 src/include/access/commit_ts.h           |   2 -
 src/include/access/multixact.h           |   2 -
 src/include/access/slru.h                | 108 ++++---
 src/include/access/subtrans.h            |   2 -
 src/include/commands/async.h             |   3 -
 src/include/storage/predicate.h          |   5 -
 src/include/storage/shmem.h              |   1 +
 src/include/storage/subsystemlist.h      |  10 +
 src/test/modules/test_slru/test_slru.c   | 110 +++----
 src/tools/pgindent/typedefs.list         |   4 +-
21 files changed, 718 insertions(+), 668 deletions(-)
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index c654e0929b3..87f7f5707de 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -43,6 +43,7 @@
#include "pg_trace.h"
#include "pgstat.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/wait_event.h"
@@ -106,13 +107,21 @@ TransactionIdToPage(TransactionId xid)
/*
* Link to shared-memory data structures for CLOG control
*/
-static SlruCtlData XactCtlData;
+static void CLOGShmemRequest(void *arg);
+static void CLOGShmemInit(void *arg);
+static bool CLOGPagePrecedes(int64 page1, int64 page2);
+static int clog_errdetail_for_io_error(const void *opaque_data);
-#define XactCtl (&XactCtlData)
+const ShmemCallbacks CLOGShmemCallbacks = {
+ .request_fn = CLOGShmemRequest,
+ .init_fn = CLOGShmemInit,
+};
+
+static SlruDesc XactSlruDesc;
+
+#define XactCtl (&XactSlruDesc)
-static bool CLOGPagePrecedes(int64 page1, int64 page2);
-static int clog_errdetail_for_io_error(const void *opaque_data);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXact,
Oid oldestXactDb);
static void TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
@@ -775,16 +784,10 @@ CLOGShmemBuffers(void)
}
/*
- * Initialization of shared memory for CLOG
+ * Register shared memory for CLOG
*/
-Size
-CLOGShmemSize(void)
-{
- return SimpleLruShmemSize(CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE);
-}
-
-void
-CLOGShmemInit(void)
+static void
+CLOGShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (transaction_buffers == 0)
@@ -806,12 +809,26 @@ CLOGShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(transaction_buffers != 0);
+ SimpleLruRequest(&XactSlruDesc,
+ .name = "transaction",
+ .Dir = "pg_xact",
+ .long_segment_names = false,
+
+ .nslots = CLOGShmemBuffers(),
+ .nlsns = CLOG_LSNS_PER_PAGE,
+
+ .sync_handler = SYNC_HANDLER_CLOG,
+ .PagePrecedes = CLOGPagePrecedes,
+ .errdetail_for_io_error = clog_errdetail_for_io_error,
- XactCtl->PagePrecedes = CLOGPagePrecedes;
- XactCtl->errdetail_for_io_error = clog_errdetail_for_io_error;
- SimpleLruInit(XactCtl, "transaction", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
- "pg_xact", LWTRANCHE_XACT_BUFFER,
- LWTRANCHE_XACT_SLRU, SYNC_HANDLER_CLOG, false);
+ .buffer_tranche_id = LWTRANCHE_XACT_BUFFER,
+ .bank_tranche_id = LWTRANCHE_XACT_SLRU,
+ );
+}
+
+static void
+CLOGShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(XactCtl, CLOG_XACTS_PER_PAGE);
}
@@ -827,7 +844,7 @@ check_transaction_buffers(int *newval, void **extra, GucSource source)
/*
* This func must be called ONCE on system install. It creates
* the initial CLOG segment. (The CLOG directory is assumed to
- * have been created by initdb, and CLOGShmemInit must have been
+ * have been created by initdb, and CLOGShmemInit must have been XXX
* called already.)
*/
void
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 36219dd13cc..236d8fb4baa 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -30,6 +30,7 @@
#include "funcapi.h"
#include "miscadmin.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/timestamp.h"
@@ -80,9 +81,19 @@ TransactionIdToCTsPage(TransactionId xid)
/*
* Link to shared-memory data structures for CommitTs control
*/
-static SlruCtlData CommitTsCtlData;
+static void CommitTsShmemRequest(void *arg);
+static void CommitTsShmemInit(void *arg);
+static bool CommitTsPagePrecedes(int64 page1, int64 page2);
+static int commit_ts_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks CommitTsShmemCallbacks = {
+ .request_fn = CommitTsShmemRequest,
+ .init_fn = CommitTsShmemInit,
+};
+
+static SlruDesc CommitTsSlruDesc;
-#define CommitTsCtl (&CommitTsCtlData)
+#define CommitTsCtl (&CommitTsSlruDesc)
/*
* We keep a cache of the last value set in shared memory.
@@ -104,6 +115,9 @@ typedef struct CommitTimestampShared
static CommitTimestampShared *commitTsShared;
+static void CommitTsShmemInit(void *arg);
+
+static ShmemStructDesc CommitTsShmemDesc;
/* GUC variable */
bool track_commit_timestamp;
@@ -114,8 +128,6 @@ static void SetXidCommitTsInPage(TransactionId xid, int nsubxids,
static void TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
ReplOriginId nodeid, int slotno);
static void error_commit_ts_disabled(void);
-static bool CommitTsPagePrecedes(int64 page1, int64 page2);
-static int commit_ts_errdetail_for_io_error(const void *opaque_data);
static void ActivateCommitTs(void);
static void DeactivateCommitTs(void);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXid);
@@ -512,24 +524,12 @@ CommitTsShmemBuffers(void)
}
/*
- * Shared memory sizing for CommitTs
+ * Register CommitTs shared memory needs at system startup (postmaster start
+ * or standalone backend)
*/
-Size
-CommitTsShmemSize(void)
-{
- return SimpleLruShmemSize(CommitTsShmemBuffers(), 0) +
- sizeof(CommitTimestampShared);
-}
-
-/*
- * Initialize CommitTs at system startup (postmaster start or standalone
- * backend)
- */
-void
-CommitTsShmemInit(void)
+static void
+CommitTsShmemRequest(void *arg)
{
- bool found;
-
/* If auto-tuning is requested, now is the time to do it */
if (commit_timestamp_buffers == 0)
{
@@ -550,31 +550,37 @@ CommitTsShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(commit_timestamp_buffers != 0);
+ SimpleLruRequest(&CommitTsSlruDesc,
+ .name = "commit_timestamp",
+ .Dir = "pg_commit_ts",
+ .long_segment_names = false,
+
+ .nslots = CommitTsShmemBuffers(),
+
+ .PagePrecedes = CommitTsPagePrecedes,
+ .errdetail_for_io_error = commit_ts_errdetail_for_io_error,
+
+ .sync_handler = SYNC_HANDLER_COMMIT_TS,
+ .buffer_tranche_id = LWTRANCHE_COMMITTS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_COMMITTS_SLRU,
+ );
+
+ ShmemRequestStruct(&CommitTsShmemDesc,
+ .name = "CommitTs shared",
+ .size = sizeof(CommitTimestampShared),
+ .ptr = (void **) &commitTsShared,
+ );
+}
- CommitTsCtl->PagePrecedes = CommitTsPagePrecedes;
- CommitTsCtl->errdetail_for_io_error = commit_ts_errdetail_for_io_error;
- SimpleLruInit(CommitTsCtl, "commit_timestamp", CommitTsShmemBuffers(), 0,
- "pg_commit_ts", LWTRANCHE_COMMITTS_BUFFER,
- LWTRANCHE_COMMITTS_SLRU,
- SYNC_HANDLER_COMMIT_TS,
- false);
- SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
-
- commitTsShared = ShmemInitStruct("CommitTs shared",
- sizeof(CommitTimestampShared),
- &found);
-
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+static void
+CommitTsShmemInit(void *arg)
+{
+ commitTsShared->xidLastCommit = InvalidTransactionId;
+ TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
+ commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
+ commitTsShared->commitTsActive = false;
- commitTsShared->xidLastCommit = InvalidTransactionId;
- TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
- commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
- commitTsShared->commitTsActive = false;
- }
- else
- Assert(found);
+ SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
}
/*
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 9f8d542c098..940ac5a78d6 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -83,6 +83,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
#include "utils/lsyscache.h"
@@ -113,11 +114,16 @@ PreviousMultiXactId(MultiXactId multi)
/*
* Links to shared-memory data structures for MultiXact control
*/
-static SlruCtlData MultiXactOffsetCtlData;
-static SlruCtlData MultiXactMemberCtlData;
+static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
+static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
+static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
+static int MultiXactMemberIoErrorDetail(const void *opaque_data);
+
+static SlruDesc MultiXactOffsetSlruDesc;
+static SlruDesc MultiXactMemberSlruDesc;
-#define MultiXactOffsetCtl (&MultiXactOffsetCtlData)
-#define MultiXactMemberCtl (&MultiXactMemberCtlData)
+#define MultiXactOffsetCtl (&MultiXactOffsetSlruDesc)
+#define MultiXactMemberCtl (&MultiXactMemberSlruDesc)
/*
* MultiXact state shared across all backends. All this state is protected
@@ -220,6 +226,15 @@ static MultiXactStateData *MultiXactState;
static MultiXactId *OldestMemberMXactId;
static MultiXactId *OldestVisibleMXactId;
+static void MultiXactShmemRequest(void *arg);
+static void MultiXactShmemInit(void *arg);
+static void MultiXactShmemAttach(void *arg);
+
+const ShmemCallbacks MultiXactShmemCallbacks = {
+ .request_fn = MultiXactShmemRequest,
+ .init_fn = MultiXactShmemInit,
+ .attach_fn = MultiXactShmemAttach,
+};
static inline MultiXactId *
MyOldestMemberMXactIdSlot(void)
@@ -321,10 +336,6 @@ typedef struct MultiXactMemberSlruReadContext
MultiXactOffset offset;
} MultiXactMemberSlruReadContext;
-static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
-static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
-static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
-static int MultiXactMemberIoErrorDetail(const void *opaque_data);
static void ExtendMultiXactOffset(MultiXactId multi);
static void ExtendMultiXactMember(MultiXactOffset offset, int nmembers);
static void SetOldestOffset(void);
@@ -1747,80 +1758,83 @@ multixact_twophase_postabort(FullTransactionId fxid, uint16 info,
multixact_twophase_postcommit(fxid, info, recdata, len);
}
+
/*
- * Initialization of shared memory for MultiXact.
- *
- * MultiXactSharedStateShmemSize() calculates the size of the MultiXactState
- * struct, and the two per-backend MultiXactId arrays. They are carved out of
- * the same allocation. MultiXactShmemSize() additionally includes the memory
- * needed for the two SLRU areas.
+ * Register shared memory needs for MultiXact.
*/
-static Size
-MultiXactSharedStateShmemSize(void)
+static void
+MultiXactShmemRequest(void *arg)
{
+ static ShmemStructDesc MultiXactShmemDesc;
Size size;
+ /*
+ * Calculate the size of the MultiXactState struct, and the two
+ * per-backend MultiXactId arrays. They are carved out of the same
+ * allocation.
+ */
size = offsetof(MultiXactStateData, perBackendXactIds);
size = add_size(size,
mul_size(sizeof(MultiXactId), NumMemberSlots));
size = add_size(size,
mul_size(sizeof(MultiXactId), NumVisibleSlots));
- return size;
-}
+ ShmemRequestStruct(&MultiXactShmemDesc,
+ .name = "Shared MultiXact State",
+ .size = size,
+ .ptr = (void **) &MultiXactState,
+ );
-Size
-MultiXactShmemSize(void)
-{
- Size size;
+ SimpleLruRequest(&MultiXactOffsetSlruDesc,
+ .name = "multixact_offset",
+ .Dir = "pg_multixact/offsets",
+ .long_segment_names = false,
- size = MultiXactSharedStateShmemSize();
- size = add_size(size, SimpleLruShmemSize(multixact_offset_buffers, 0));
- size = add_size(size, SimpleLruShmemSize(multixact_member_buffers, 0));
+ .nslots = multixact_offset_buffers,
- return size;
-}
+ .sync_handler = SYNC_HANDLER_MULTIXACT_OFFSET,
+ .PagePrecedes = MultiXactOffsetPagePrecedes,
+ .errdetail_for_io_error = MultiXactOffsetIoErrorDetail,
-void
-MultiXactShmemInit(void)
-{
- bool found;
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTOFFSET_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTOFFSET_SLRU,
+ );
- debug_elog2(DEBUG2, "Shared Memory Init for MultiXact");
+ SimpleLruRequest(&MultiXactMemberSlruDesc,
+ .name = "multixact_member",
+ .Dir = "pg_multixact/members",
+ .long_segment_names = true,
- MultiXactOffsetCtl->PagePrecedes = MultiXactOffsetPagePrecedes;
- MultiXactMemberCtl->PagePrecedes = MultiXactMemberPagePrecedes;
- MultiXactOffsetCtl->errdetail_for_io_error = MultiXactOffsetIoErrorDetail;
- MultiXactMemberCtl->errdetail_for_io_error = MultiXactMemberIoErrorDetail;
+ .nslots = multixact_member_buffers,
- SimpleLruInit(MultiXactOffsetCtl,
- "multixact_offset", multixact_offset_buffers, 0,
- "pg_multixact/offsets", LWTRANCHE_MULTIXACTOFFSET_BUFFER,
- LWTRANCHE_MULTIXACTOFFSET_SLRU,
- SYNC_HANDLER_MULTIXACT_OFFSET,
- false);
- SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
- SimpleLruInit(MultiXactMemberCtl,
- "multixact_member", multixact_member_buffers, 0,
- "pg_multixact/members", LWTRANCHE_MULTIXACTMEMBER_BUFFER,
- LWTRANCHE_MULTIXACTMEMBER_SLRU,
- SYNC_HANDLER_MULTIXACT_MEMBER,
- true);
- /* doesn't call SimpleLruTruncate() or meet criteria for unit tests */
-
- /* Initialize our shared state struct */
- MultiXactState = ShmemInitStruct("Shared MultiXact State",
- MultiXactSharedStateShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+ .sync_handler = SYNC_HANDLER_MULTIXACT_MEMBER,
+ .PagePrecedes = MultiXactMemberPagePrecedes,
+ .errdetail_for_io_error = MultiXactMemberIoErrorDetail,
- /* Make sure we zero out the per-backend state */
- MemSet(MultiXactState, 0, MultiXactSharedStateShmemSize());
- }
- else
- Assert(found);
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTMEMBER_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTMEMBER_SLRU,
+ );
+ /*
+ * members SLRU doesn't call SimpleLruTruncate() or meet criteria for unit
+ * tests
+ */
+}
+
+static void
+MultiXactShmemInit(void *arg)
+{
+ /*
+ * Set up array pointers.
+ */
+ OldestMemberMXactId = MultiXactState->perBackendXactIds;
+ OldestVisibleMXactId = OldestMemberMXactId + NumMemberSlots;
+
+ SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
+}
+
+static void
+MultiXactShmemAttach(void *arg)
+{
/*
* Set up array pointers.
*/
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index a2bb8fa8033..3fe60c5804b 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -71,6 +71,7 @@
#include "storage/fd.h"
#include "storage/shmem.h"
#include "utils/guc.h"
+#include "utils/memutils.h"
#include "utils/wait_event.h"
/*
@@ -89,9 +90,9 @@
* dir/123456 for [2^20, 2^24-1]
*/
static inline int
-SlruFileName(SlruCtl ctl, char *path, int64 segno)
+SlruFileName(SlruDesc *ctl, char *path, int64 segno)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
{
/*
* We could use 16 characters here but the disadvantage would be that
@@ -101,7 +102,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* that in the future we can't decrease SLRU_PAGES_PER_SEGMENT easily.
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFFFFFFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->Dir, segno);
+ return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->options.Dir, segno);
}
else
{
@@ -110,7 +111,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* integers are allowed. See SlruCorrectSegmentFilenameLength()
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
+ return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->options.Dir,
(unsigned int) segno);
}
}
@@ -176,19 +177,19 @@ static SlruErrorCause slru_errcause;
static int slru_errno;
-static void SimpleLruZeroLSNs(SlruCtl ctl, int slotno);
-static void SimpleLruWaitIO(SlruCtl ctl, int slotno);
-static void SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata);
-static bool SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno);
-static bool SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno,
+static void SimpleLruZeroLSNs(SlruDesc *ctl, int slotno);
+static void SimpleLruWaitIO(SlruDesc *ctl, int slotno);
+static void SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata);
+static bool SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno);
+static bool SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno,
SlruWriteAll fdata);
-static void SlruReportIOError(SlruCtl ctl, int64 pageno,
+static void SlruReportIOError(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-static int SlruSelectLRUPage(SlruCtl ctl, int64 pageno);
+static int SlruSelectLRUPage(SlruDesc *ctl, int64 pageno);
-static bool SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename,
+static bool SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-static void SlruInternalDeleteSegment(SlruCtl ctl, int64 segno);
+static void SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno);
static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
@@ -196,7 +197,7 @@ static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
* Initialization of shared memory
*/
-Size
+static Size
SimpleLruShmemSize(int nslots, int nlsns)
{
int nbanks = nslots / SLRU_BANK_SIZE;
@@ -238,120 +239,134 @@ SimpleLruAutotuneBuffers(int divisor, int max)
}
/*
- * Initialize, or attach to, a simple LRU cache in shared memory.
- *
- * ctl: address of local (unshared) control structure.
- * name: name of SLRU. (This is user-visible, pick with care!)
- * nslots: number of page slots to use.
- * nlsns: number of LSN groups per page (set to zero if not relevant).
- * subdir: PGDATA-relative subdirectory that will contain the files.
- * buffer_tranche_id: tranche ID to use for the SLRU's per-buffer LWLocks.
- * bank_tranche_id: tranche ID to use for the bank LWLocks.
- * sync_handler: which set of functions to use to handle sync requests
- * long_segment_names: use short or long segment names
+ * Register a simple LRU cache in shared memory.
*/
void
-SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id, int bank_tranche_id,
- SyncRequestHandler sync_handler, bool long_segment_names)
+SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts *options)
{
+ SlruOpts *options_copy;
+
+ Assert(options->name != NULL);
+ Assert(options->nslots > 0);
+ Assert(options->PagePrecedes != NULL);
+ Assert(options->errdetail_for_io_error != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(SlruOpts));
+ memcpy(options_copy, options, sizeof(SlruOpts));
+
+ options_copy->base.name = options->name;
+ options_copy->base.size = SimpleLruShmemSize(options_copy->nslots, options_copy->nlsns);
+
+ ShmemRequestInternal(&desc->base, &options_copy->base, SHMEM_KIND_SLRU);
+}
+
+/* Initialize locks and shared memory area */
+void
+shmem_slru_init(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
+{
+ SlruOpts *options = (SlruOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) base_desc;
+ char namebuf[NAMEDATALEN];
SlruShared shared;
- bool found;
+ int nslots = options->nslots;
int nbanks = nslots / SLRU_BANK_SIZE;
+ int nlsns = options->nlsns;
+ char *ptr;
+ Size offset;
+
+ shared = desc->shared = (SlruShared) desc->base.ptr;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruOpts));
+
+ /* assign new tranche IDs, if not given */
+ if (desc->options.buffer_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s buffer", desc->options.name);
+ desc->options.buffer_tranche_id = LWLockNewTrancheId(namebuf);
+ }
+ if (desc->options.bank_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s bank", desc->options.name);
+ desc->options.bank_tranche_id = LWLockNewTrancheId(namebuf);
+ }
Assert(nslots <= SLRU_MAX_ALLOWED_BUFFERS);
- Assert(ctl->PagePrecedes != NULL);
- Assert(ctl->errdetail_for_io_error != NULL);
+ memset(shared, 0, sizeof(SlruSharedData));
- shared = (SlruShared) ShmemInitStruct(name,
- SimpleLruShmemSize(nslots, nlsns),
- &found);
+ shared->num_slots = nslots;
+ shared->lsn_groups_per_page = nlsns;
- if (!IsUnderPostmaster)
- {
- /* Initialize locks and shared memory area */
- char *ptr;
- Size offset;
-
- Assert(!found);
-
- memset(shared, 0, sizeof(SlruSharedData));
-
- shared->num_slots = nslots;
- shared->lsn_groups_per_page = nlsns;
-
- pg_atomic_init_u64(&shared->latest_page_number, 0);
-
- shared->slru_stats_idx = pgstat_get_slru_index(name);
-
- ptr = (char *) shared;
- offset = MAXALIGN(sizeof(SlruSharedData));
- shared->page_buffer = (char **) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(char *));
- shared->page_status = (SlruPageStatus *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
- shared->page_dirty = (bool *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(bool));
- shared->page_number = (int64 *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int64));
- shared->page_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int));
-
- /* Initialize LWLocks */
- shared->buffer_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(LWLockPadded));
- shared->bank_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
- shared->bank_cur_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(int));
-
- if (nlsns > 0)
- {
- shared->group_lsn = (XLogRecPtr *) (ptr + offset);
- offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
- }
+ pg_atomic_init_u64(&shared->latest_page_number, 0);
- ptr += BUFFERALIGN(offset);
- for (int slotno = 0; slotno < nslots; slotno++)
- {
- LWLockInitialize(&shared->buffer_locks[slotno].lock,
- buffer_tranche_id);
+ shared->slru_stats_idx = pgstat_get_slru_index(desc->options.name);
- shared->page_buffer[slotno] = ptr;
- shared->page_status[slotno] = SLRU_PAGE_EMPTY;
- shared->page_dirty[slotno] = false;
- shared->page_lru_count[slotno] = 0;
- ptr += BLCKSZ;
- }
+ ptr = (char *) shared;
+ offset = MAXALIGN(sizeof(SlruSharedData));
+ shared->page_buffer = (char **) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(char *));
+ shared->page_status = (SlruPageStatus *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
+ shared->page_dirty = (bool *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(bool));
+ shared->page_number = (int64 *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int64));
+ shared->page_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int));
- /* Initialize the slot banks. */
- for (int bankno = 0; bankno < nbanks; bankno++)
- {
- LWLockInitialize(&shared->bank_locks[bankno].lock, bank_tranche_id);
- shared->bank_cur_lru_count[bankno] = 0;
- }
+ /* Initialize LWLocks */
+ shared->buffer_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(LWLockPadded));
+ shared->bank_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
+ shared->bank_cur_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(int));
- /* Should fit to estimated shmem size */
- Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+ if (nlsns > 0)
+ {
+ shared->group_lsn = (XLogRecPtr *) (ptr + offset);
+ offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
}
- else
+
+ ptr += BUFFERALIGN(offset);
+ for (int slotno = 0; slotno < nslots; slotno++)
{
- Assert(found);
- Assert(shared->num_slots == nslots);
+ LWLockInitialize(&shared->buffer_locks[slotno].lock,
+ desc->options.buffer_tranche_id);
+
+ shared->page_buffer[slotno] = ptr;
+ shared->page_status[slotno] = SLRU_PAGE_EMPTY;
+ shared->page_dirty[slotno] = false;
+ shared->page_lru_count[slotno] = 0;
+ ptr += BLCKSZ;
}
- /*
- * Initialize the unshared control struct, including directory path. We
- * assume caller set PagePrecedes.
- */
- ctl->shared = shared;
- ctl->sync_handler = sync_handler;
- ctl->long_segment_names = long_segment_names;
- ctl->nbanks = nbanks;
- strlcpy(ctl->Dir, subdir, sizeof(ctl->Dir));
+ /* Initialize the slot banks. */
+ for (int bankno = 0; bankno < nbanks; bankno++)
+ {
+ LWLockInitialize(&shared->bank_locks[bankno].lock, desc->options.bank_tranche_id);
+ shared->bank_cur_lru_count[bankno] = 0;
+ }
+
+ /* Should fit to estimated shmem size */
+ Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+}
+
+void
+shmem_slru_attach(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
+{
+ SlruOpts *options = (SlruOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) base_desc;
+ int nslots = options->nslots;
+ int nbanks = nslots / SLRU_BANK_SIZE;
+
+ desc->shared = (SlruShared) desc->base.ptr;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruOpts));
}
+
/*
* Helper function for GUC check_hook to check whether slru buffers are in
* multiples of SLRU_BANK_SIZE.
@@ -377,7 +392,7 @@ check_slru_buffers(const char *name, int *newval)
* Bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
int slotno;
@@ -430,7 +445,7 @@ SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
* This assumes that InvalidXLogRecPtr is bitwise-all-0.
*/
static void
-SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
+SimpleLruZeroLSNs(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
@@ -446,7 +461,7 @@ SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
* SLRU bank lock is acquired and released here.
*/
void
-SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno)
{
int slotno;
LWLock *lock;
@@ -472,7 +487,7 @@ SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SimpleLruWaitIO(SlruCtl ctl, int slotno)
+SimpleLruWaitIO(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
int bankno = SlotGetBankNumber(slotno);
@@ -530,7 +545,7 @@ SimpleLruWaitIO(SlruCtl ctl, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data)
{
SlruShared shared = ctl->shared;
@@ -634,7 +649,7 @@ SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
* It is unspecified whether the lock will be shared or exclusive.
*/
int
-SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
SlruShared shared = ctl->shared;
LWLock *banklock = SimpleLruGetBankLock(ctl, pageno);
@@ -681,7 +696,7 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
+SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 pageno = shared->page_number[slotno];
@@ -761,7 +776,7 @@ SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
* fdata is always passed a NULL here.
*/
void
-SimpleLruWritePage(SlruCtl ctl, int slotno)
+SimpleLruWritePage(SlruDesc *ctl, int slotno)
{
Assert(ctl->shared->page_status[slotno] != SLRU_PAGE_EMPTY);
@@ -775,7 +790,7 @@ SimpleLruWritePage(SlruCtl ctl, int slotno)
* large enough to contain the given page.
*/
bool
-SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
+SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -833,7 +848,7 @@ SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
* read/write operations. We could cache one virtual file pointer ...
*/
static bool
-SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
+SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -905,7 +920,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
* SimpleLruWriteAll.
*/
static bool
-SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
+SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -1037,11 +1052,11 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
pgstat_report_wait_end();
/* Queue up a sync request for the checkpointer. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
if (!RegisterSyncRequest(&tag, SYNC_REQUEST, false))
{
/* No space to enqueue sync request. Do it synchronously. */
@@ -1077,7 +1092,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
* SlruPhysicalWritePage. Call this after cleaning up shared-memory state.
*/
static void
-SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SlruReportIOError(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -1092,14 +1107,14 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open file \"%s\": %m", path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_SEEK_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not seek in file \"%s\" to offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_READ_FAILED:
if (errno)
@@ -1107,12 +1122,12 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("could not read from file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("could not read from file \"%s\" at offset %d: read too few bytes",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_WRITE_FAILED:
if (errno)
@@ -1120,26 +1135,26 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("could not write to file \"%s\" at offset %d: wrote too few bytes",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_FSYNC_FAILED:
ereport(data_sync_elevel(ERROR),
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_CLOSE_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not close file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
default:
/* can't get here, we trust */
@@ -1199,7 +1214,7 @@ SlruRecentlyUsed(SlruShared shared, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
static int
-SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
+SlruSelectLRUPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
@@ -1291,8 +1306,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_valid_delta ||
(this_delta == best_valid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_valid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_valid_page_number)))
{
bestvalidslot = slotno;
best_valid_delta = this_delta;
@@ -1303,8 +1318,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_invalid_delta ||
(this_delta == best_invalid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_invalid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_invalid_page_number)))
{
bestinvalidslot = slotno;
best_invalid_delta = this_delta;
@@ -1352,7 +1367,7 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
* entries are on disk.
*/
void
-SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
+SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied)
{
SlruShared shared = ctl->shared;
SlruWriteAllData fdata;
@@ -1422,8 +1437,8 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
SlruReportIOError(ctl, pageno, NULL);
/* Ensure that directory entries for new files are on disk. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
- fsync_fname(ctl->Dir, true);
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
+ fsync_fname(ctl->options.Dir, true);
}
/*
@@ -1438,7 +1453,7 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
* after it has accrued freshly-written data.
*/
void
-SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage)
+SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage)
{
SlruShared shared = ctl->shared;
int prevbank;
@@ -1460,12 +1475,12 @@ restart:
* bugs elsewhere in SLRU handling, so we don't care if we read a slightly
* outdated value; therefore we don't add a memory barrier.
*/
- if (ctl->PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
- cutoffPage))
+ if (ctl->options.PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
+ cutoffPage))
{
ereport(LOG,
(errmsg("could not truncate directory \"%s\": apparent wraparound",
- ctl->Dir)));
+ ctl->options.Dir)));
return;
}
@@ -1488,7 +1503,7 @@ restart:
if (shared->page_status[slotno] == SLRU_PAGE_EMPTY)
continue;
- if (!ctl->PagePrecedes(shared->page_number[slotno], cutoffPage))
+ if (!ctl->options.PagePrecedes(shared->page_number[slotno], cutoffPage))
continue;
/*
@@ -1533,16 +1548,16 @@ restart:
* they either can't yet contain anything, or have already been cleaned out.
*/
static void
-SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
+SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno)
{
char path[MAXPGPATH];
/* Forget any fsync requests queued for this segment. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true);
}
@@ -1556,7 +1571,7 @@ SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
* Delete an individual SLRU segment, identified by the segment number.
*/
void
-SlruDeleteSegment(SlruCtl ctl, int64 segno)
+SlruDeleteSegment(SlruDesc *ctl, int64 segno)
{
SlruShared shared = ctl->shared;
int prevbank = SlotGetBankNumber(0);
@@ -1633,19 +1648,19 @@ restart:
* first>=cutoff && last>=cutoff: no; every page of this segment is too young
*/
static bool
-SlruMayDeleteSegment(SlruCtl ctl, int64 segpage, int64 cutoffPage)
+SlruMayDeleteSegment(SlruDesc *ctl, int64 segpage, int64 cutoffPage)
{
int64 seg_last_page = segpage + SLRU_PAGES_PER_SEGMENT - 1;
Assert(segpage % SLRU_PAGES_PER_SEGMENT == 0);
- return (ctl->PagePrecedes(segpage, cutoffPage) &&
- ctl->PagePrecedes(seg_last_page, cutoffPage));
+ return (ctl->options.PagePrecedes(segpage, cutoffPage) &&
+ ctl->options.PagePrecedes(seg_last_page, cutoffPage));
}
#ifdef USE_ASSERT_CHECKING
static void
-SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
+SlruPagePrecedesTestOffset(SlruDesc *ctl, int per_page, uint32 offset)
{
TransactionId lhs,
rhs;
@@ -1654,6 +1669,9 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
TransactionId newestXact,
oldestXact;
+ /* This must be called after the Slru has been initialized */
+ Assert(ctl->options.PagePrecedes);
+
/*
* Compare an XID pair having undefined order (see RFC 1982), a pair at
* "opposite ends" of the XID space. TransactionIdPrecedes() treats each
@@ -1670,19 +1688,19 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
Assert(!TransactionIdPrecedes(rhs, lhs + 1));
Assert(!TransactionIdFollowsOrEquals(lhs, rhs));
Assert(!TransactionIdFollowsOrEquals(rhs, lhs));
- Assert(!ctl->PagePrecedes(lhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes(lhs / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
|| (1U << 31) % per_page != 0); /* See CommitTsPagePrecedes() */
- Assert(ctl->PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
+ Assert(ctl->options.PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
|| (1U << 31) % per_page != 0);
- Assert(ctl->PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
/*
* GetNewTransactionId() has assigned the last XID it can safely use, and
@@ -1727,7 +1745,7 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
* do not apply to them.)
*/
void
-SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
+SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page)
{
/* Test first, middle and last entries of a page. */
SlruPagePrecedesTestOffset(ctl, per_page, 0);
@@ -1742,7 +1760,7 @@ SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
* one containing the page passed as "data".
*/
bool
-SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1758,7 +1776,7 @@ SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes segments prior to the one passed in as "data".
*/
static bool
-SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1774,7 +1792,7 @@ SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes all segments.
*/
bool
-SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
+SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
SlruInternalDeleteSegment(ctl, segpage / SLRU_PAGES_PER_SEGMENT);
@@ -1788,9 +1806,9 @@ SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
* SLRU segment.
*/
static inline bool
-SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
+SlruCorrectSegmentFilenameLength(SlruDesc *ctl, size_t len)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
return (len == 15); /* see SlruFileName() */
else
@@ -1821,7 +1839,7 @@ SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
* Note that no locking is applied.
*/
bool
-SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
+SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data)
{
bool retval = false;
DIR *cldir;
@@ -1829,8 +1847,8 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
int64 segno;
int64 segpage;
- cldir = AllocateDir(ctl->Dir);
- while ((clde = ReadDir(cldir, ctl->Dir)) != NULL)
+ cldir = AllocateDir(ctl->options.Dir);
+ while ((clde = ReadDir(cldir, ctl->options.Dir)) != NULL)
{
size_t len;
@@ -1843,7 +1861,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
segpage = segno * SLRU_PAGES_PER_SEGMENT;
elog(DEBUG2, "SlruScanDirectory invoking callback on %s/%s",
- ctl->Dir, clde->d_name);
+ ctl->options.Dir, clde->d_name);
retval = callback(ctl, clde->d_name, segpage, data);
if (retval)
break;
@@ -1861,7 +1879,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
* performs the fsync.
*/
int
-SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path)
+SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path)
{
int fd;
int save_errno;
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index c6ce71fc703..ca273fb4680 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -33,6 +33,7 @@
#include "access/transam.h"
#include "miscadmin.h"
#include "pg_trace.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/snapmgr.h"
@@ -66,16 +67,22 @@ TransactionIdToPage(TransactionId xid)
#define TransactionIdToEntry(xid) ((xid) % (TransactionId) SUBTRANS_XACTS_PER_PAGE)
+static void SUBTRANSShmemRequest(void *arg);
+static void SUBTRANSShmemInit(void *arg);
+static bool SubTransPagePrecedes(int64 page1, int64 page2);
+static int subtrans_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks SUBTRANSShmemCallbacks = {
+ .request_fn = SUBTRANSShmemRequest,
+ .init_fn = SUBTRANSShmemInit,
+};
+
/*
* Link to shared-memory data structures for SUBTRANS control
*/
-static SlruCtlData SubTransCtlData;
-
-#define SubTransCtl (&SubTransCtlData)
+static SlruDesc SubTransSlruDesc;
-
-static bool SubTransPagePrecedes(int64 page1, int64 page2);
-static int subtrans_errdetail_for_io_error(const void *opaque_data);
+#define SubTransCtl (&SubTransSlruDesc)
/*
@@ -207,17 +214,13 @@ SUBTRANSShmemBuffers(void)
return Min(Max(16, subtransaction_buffers), SLRU_MAX_ALLOWED_BUFFERS);
}
+
+
/*
- * Initialization of shared memory for SUBTRANS
+ * Register shared memory for SUBTRANS
*/
-Size
-SUBTRANSShmemSize(void)
-{
- return SimpleLruShmemSize(SUBTRANSShmemBuffers(), 0);
-}
-
-void
-SUBTRANSShmemInit(void)
+static void
+SUBTRANSShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (subtransaction_buffers == 0)
@@ -240,11 +243,25 @@ SUBTRANSShmemInit(void)
}
Assert(subtransaction_buffers != 0);
- SubTransCtl->PagePrecedes = SubTransPagePrecedes;
- SubTransCtl->errdetail_for_io_error = subtrans_errdetail_for_io_error;
- SimpleLruInit(SubTransCtl, "subtransaction", SUBTRANSShmemBuffers(), 0,
- "pg_subtrans", LWTRANCHE_SUBTRANS_BUFFER,
- LWTRANCHE_SUBTRANS_SLRU, SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(&SubTransSlruDesc,
+ .name = "subtransaction",
+ .Dir = "pg_subtrans",
+ .long_segment_names = false,
+
+ .nslots = SUBTRANSShmemBuffers(),
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SubTransPagePrecedes,
+ .errdetail_for_io_error = subtrans_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SUBTRANS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SUBTRANS_SLRU,
+ );
+}
+
+static void
+SUBTRANSShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(SubTransCtl, SUBTRANS_XACTS_PER_PAGE);
}
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index e91a62ff42a..9cd27695787 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -179,6 +179,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/dsa.h"
@@ -345,6 +346,15 @@ typedef struct AsyncQueueControl
static AsyncQueueControl *asyncQueueControl;
+static void AsyncShmemRequest(void *arg);
+static void AsyncShmemInit(void *arg);
+
+const ShmemCallbacks AsyncShmemCallbacks = {
+ .request_fn = AsyncShmemRequest,
+ .init_fn = AsyncShmemInit,
+};
+
+
#define QUEUE_HEAD (asyncQueueControl->head)
#define QUEUE_TAIL (asyncQueueControl->tail)
#define QUEUE_STOP_PAGE (asyncQueueControl->stopPage)
@@ -359,9 +369,13 @@ static AsyncQueueControl *asyncQueueControl;
/*
* The SLRU buffer area through which we access the notification queue
*/
-static SlruCtlData NotifyCtlData;
+static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
+static int asyncQueueErrdetailForIoError(const void *opaque_data);
+
+static SlruDesc NotifySlruDesc;
-#define NotifyCtl (&NotifyCtlData)
+
+#define NotifyCtl (&NotifySlruDesc)
#define QUEUE_PAGESIZE BLCKSZ
#define QUEUE_FULL_WARN_INTERVAL 5000 /* warn at most once every 5s */
@@ -570,9 +584,7 @@ bool Trace_notify = false;
int max_notify_queue_pages = 1048576;
/* local function prototypes */
-static int asyncQueueErrdetailForIoError(const void *opaque_data);
static inline int64 asyncQueuePageDiff(int64 p, int64 q);
-static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
static inline void GlobalChannelKeyInit(GlobalChannelKey *key, Oid dboid,
const char *channel);
static dshash_hash globalChannelTableHash(const void *key, size_t size,
@@ -780,78 +792,65 @@ initPendingListenActions(void)
}
/*
- * Report space needed for our shared memory area
+ * Register our shared memory needs
*/
-Size
-AsyncShmemSize(void)
+static void
+AsyncShmemRequest(void *arg)
{
+ static ShmemStructDesc AsyncQueueControlShmemDesc;
Size size;
- /* This had better match AsyncShmemInit */
size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
size = add_size(size, offsetof(AsyncQueueControl, backend));
- size = add_size(size, SimpleLruShmemSize(notify_buffers, 0));
+ ShmemRequestStruct(&AsyncQueueControlShmemDesc,
+ .name = "Async Queue Control",
+ .size = size,
+ .ptr = (void **) &asyncQueueControl,
+ );
- return size;
-}
+ SimpleLruRequest(&NotifySlruDesc,
+ .name = "notify",
+ .Dir = "pg_notify",
-/*
- * Initialize our shared memory area
- */
-void
-AsyncShmemInit(void)
-{
- bool found;
- Size size;
+ /* long segment names are used in order to avoid wraparound */
+ .long_segment_names = true,
- /*
- * Create or attach to the AsyncQueueControl structure.
- */
- size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
- size = add_size(size, offsetof(AsyncQueueControl, backend));
+ .nslots = notify_buffers,
- asyncQueueControl = (AsyncQueueControl *)
- ShmemInitStruct("Async Queue Control", size, &found);
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = asyncQueuePagePrecedes,
+ .errdetail_for_io_error = asyncQueueErrdetailForIoError,
- if (!found)
+ .buffer_tranche_id = LWTRANCHE_NOTIFY_BUFFER,
+ .bank_tranche_id = LWTRANCHE_NOTIFY_SLRU,
+ );
+}
+
+static void
+AsyncShmemInit(void *arg)
+{
+ SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
+ SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
+ QUEUE_STOP_PAGE = 0;
+ QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
+ asyncQueueControl->lastQueueFillWarn = 0;
+ asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
+ asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
+ for (int i = 0; i < MaxBackends; i++)
{
- /* First time through, so initialize it */
- SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
- SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
- QUEUE_STOP_PAGE = 0;
- QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
- asyncQueueControl->lastQueueFillWarn = 0;
- asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
- asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
- for (int i = 0; i < MaxBackends; i++)
- {
- QUEUE_BACKEND_PID(i) = InvalidPid;
- QUEUE_BACKEND_DBOID(i) = InvalidOid;
- QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
- SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
- QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
- QUEUE_BACKEND_IS_ADVANCING(i) = false;
- }
+ QUEUE_BACKEND_PID(i) = InvalidPid;
+ QUEUE_BACKEND_DBOID(i) = InvalidOid;
+ QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
+ SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
+ QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
+ QUEUE_BACKEND_IS_ADVANCING(i) = false;
}
/*
- * Set up SLRU management of the pg_notify data. Note that long segment
- * names are used in order to avoid wraparound.
+ * During start or reboot, clean out the pg_notify directory.
*/
- NotifyCtl->PagePrecedes = asyncQueuePagePrecedes;
- NotifyCtl->errdetail_for_io_error = asyncQueueErrdetailForIoError;
- SimpleLruInit(NotifyCtl, "notify", notify_buffers, 0,
- "pg_notify", LWTRANCHE_NOTIFY_BUFFER, LWTRANCHE_NOTIFY_SLRU,
- SYNC_HANDLER_NONE, true);
-
- if (!found)
- {
- /*
- * During start or reboot, clean out the pg_notify directory.
- */
- (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
- }
+ (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index b8ce4de9ab5..faaf9c471f2 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -101,16 +101,11 @@ CalculateShmemSize(void)
/* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
- size = add_size(size, PredicateLockShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, CLOGShmemSize());
- size = add_size(size, CommitTsShmemSize());
- size = add_size(size, SUBTRANSShmemSize());
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, MultiXactShmemSize());
size = add_size(size, BackendStatusShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
@@ -123,7 +118,6 @@ CalculateShmemSize(void)
size = add_size(size, ApplyLauncherShmemSize());
size = add_size(size, BTreeShmemSize());
size = add_size(size, SyncScanShmemSize());
- size = add_size(size, AsyncShmemSize());
size = add_size(size, StatsShmemSize());
size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, InjectionPointShmemSize());
@@ -270,10 +264,6 @@ CreateOrAttachShmemStructs(void)
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
- CLOGShmemInit();
- CommitTsShmemInit();
- SUBTRANSShmemInit();
- MultiXactShmemInit();
BufferManagerShmemInit();
/*
@@ -281,11 +271,6 @@ CreateOrAttachShmemStructs(void)
*/
LockManagerShmemInit();
- /*
- * Set up predicate lock manager
- */
- PredicateLockShmemInit();
-
/*
* Set up process table
*/
@@ -313,7 +298,6 @@ CreateOrAttachShmemStructs(void)
*/
BTreeShmemInit();
SyncScanShmemInit();
- AsyncShmemInit();
StatsShmemInit();
WaitEventCustomShmemInit();
InjectionPointShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 606b545e8fe..edcfecf24ba 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -134,6 +134,7 @@
#include <unistd.h>
+#include "access/slru.h"
#include "common/int.h"
#include "fmgr.h"
#include "funcapi.h"
@@ -556,6 +557,9 @@ AttachOrInitShmemIndexEntry(ShmemRequest *request,
case SHMEM_KIND_HASH:
shmem_hash_attach(desc, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_attach(desc, request->options);
+ break;
}
}
else if (!may_init)
@@ -615,6 +619,9 @@ AttachOrInitShmemIndexEntry(ShmemRequest *request,
case SHMEM_KIND_HASH:
shmem_hash_init(desc, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_init(desc, request->options);
+ break;
}
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index af03071a71f..02dbbf30950 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -152,10 +152,6 @@
/*
* INTERFACE ROUTINES
*
- * housekeeping for setting up shared memory predicate lock structures
- * PredicateLockShmemInit(void)
- * PredicateLockShmemSize(void)
- *
* predicate lock reporting
* GetPredicateLockStatusData(void)
* PageIsPredicateLocked(Relation relation, BlockNumber blkno)
@@ -211,6 +207,8 @@
#include "storage/predicate_internals.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -322,9 +320,12 @@
/*
* The SLRU buffer area through which we access the old xids.
*/
-static SlruCtlData SerialSlruCtlData;
+static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
+static int serial_errdetail_for_io_error(const void *opaque_data);
-#define SerialSlruCtl (&SerialSlruCtlData)
+static SlruDesc SerialSlruDesc;
+
+#define SerialSlruCtl (&SerialSlruDesc)
#define SERIAL_PAGESIZE BLCKSZ
#define SERIAL_ENTRYSIZE sizeof(SerCommitSeqNo)
@@ -384,6 +385,17 @@ int max_predicate_locks_per_page; /* in guc_tables.c */
*/
static PredXactList PredXact;
+static void PredicateLockShmemRequest(void *arg);
+static void PredicateLockShmemInit(void *arg);
+static void PredicateLockShmemAttach(void *arg);
+
+const ShmemCallbacks PredicateLockShmemCallbacks = {
+ .request_fn = PredicateLockShmemRequest,
+ .init_fn = PredicateLockShmemInit,
+ .attach_fn = PredicateLockShmemAttach,
+};
+
+
/*
* This provides a pool of RWConflict data elements to use in conflict lists
* between transactions.
@@ -431,6 +443,16 @@ static bool MyXactDidWrite = false;
*/
static SERIALIZABLEXACT *SavedSerializableXact = InvalidSerializableXact;
+static ShmemStructDesc PredXactListShmemDesc;
+
+static int64 max_serializable_xacts;
+
+static ShmemStructDesc RWConflictPoolShmemDesc;
+
+static ShmemStructDesc FinishedSerializableShmemDesc;
+
+static ShmemStructDesc SerialControlShmemDesc;
+
/* local functions */
static SERIALIZABLEXACT *CreatePredXact(void);
@@ -442,13 +464,18 @@ static void SetPossibleUnsafeConflict(SERIALIZABLEXACT *roXact, SERIALIZABLEXACT
static void ReleaseRWConflict(RWConflict conflict);
static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
-static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
-static int serial_errdetail_for_io_error(const void *opaque_data);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
static uint32 predicatelock_hash(const void *key, Size keysize);
+
+static ShmemHashDesc SerializableXidHashDesc;
+
+static ShmemHashDesc PredicateLockTargetHashDesc;
+
+static ShmemHashDesc PredicateLockHashDesc;
+
static void SummarizeOldestCommittedSxact(void);
static Snapshot GetSafeSnapshot(Snapshot origSnapshot);
static Snapshot GetSerializableTransactionSnapshotInt(Snapshot snapshot,
@@ -1100,71 +1127,58 @@ CheckPointPredicate(void)
/*------------------------------------------------------------------------*/
/*
- * PredicateLockShmemInit -- Initialize the predicate locking data structures.
- *
- * This is called from CreateSharedMemoryAndSemaphores(), which see for
- * more comments. In the normal postmaster case, the shared hash tables
- * are created here. Backends inherit the pointers
- * to the shared tables via fork(). In the EXEC_BACKEND case, each
- * backend re-executes this code to obtain pointers to the already existing
- * shared hash tables.
+ * PredicateLockShmemRequest -- Register the predicate locking data structures.
*/
-void
-PredicateLockShmemInit(void)
+static void
+PredicateLockShmemRequest(void *arg)
{
- HASHCTL info;
int64 max_predicate_lock_targets;
int64 max_predicate_locks;
- int64 max_serializable_xacts;
int64 max_rw_conflicts;
- Size requestSize;
- bool found;
-
-#ifndef EXEC_BACKEND
- Assert(!IsUnderPostmaster);
-#endif
/*
- * Compute size of predicate lock target hashtable. Note these
- * calculations must agree with PredicateLockShmemSize!
+ * Compute the size of the predicate lock target hashtable. The hash
+ * tables and structs requested below are allocated and attached from the
+ * descriptors registered in this function; their contents are initialized
+ * later, in PredicateLockShmemInit().
*/
max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
/*
- * Allocate hash table for PREDICATELOCKTARGET structs. This stores
+ * Register hash table for PREDICATELOCKTARGET structs. This stores
* per-predicate-lock-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTARGETTAG);
- info.entrysize = sizeof(PREDICATELOCKTARGET);
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
+ ShmemRequestHash(&PredicateLockTargetHashDesc,
+ .name = "PREDICATELOCKTARGET hash",
+ .nelems = max_predicate_lock_targets,
- PredicateLockTargetHash = ShmemInitHash("PREDICATELOCKTARGET hash",
- max_predicate_lock_targets,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_PARTITION | HASH_FIXED_SIZE);
-
- /* Pre-calculate the hash and partition lock of the scratch entry */
- ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
- ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
+ .ptr = &PredicateLockTargetHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTARGETTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCKTARGET),
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Allocate hash table for PREDICATELOCK structs. This stores per
* xact-lock-of-a-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTAG);
- info.entrysize = sizeof(PREDICATELOCK);
- info.hash = predicatelock_hash;
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
/* Assume an average of 2 xacts per target */
max_predicate_locks = max_predicate_lock_targets * 2;
- PredicateLockHash = ShmemInitHash("PREDICATELOCK hash",
- max_predicate_locks,
- &info,
- HASH_ELEM | HASH_FUNCTION |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ ShmemRequestHash(&PredicateLockHashDesc,
+ .name = "PREDICATELOCK hash",
+
+ .nelems = max_predicate_locks,
+
+ .ptr = &PredicateLockHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCK),
+ .hash_info.hash = predicatelock_hash,
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Compute size for serializable transaction hashtable. Note these
@@ -1177,29 +1191,29 @@ PredicateLockShmemInit(void)
max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
/*
- * Allocate a list to hold information on transactions participating in
+ * Register a list to hold information on transactions participating in
* predicate locking.
*/
- requestSize = add_size(PredXactListDataSize,
- (mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT))));
- PredXact = ShmemInitStruct("PredXactList",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(&PredXactListShmemDesc,
+ .name = "PredXactList",
+ .size = add_size(PredXactListDataSize,
+ (mul_size((Size) max_serializable_xacts,
+ sizeof(SERIALIZABLEXACT)))),
+ .ptr = (void **) &PredXact,
+ );
/*
- * Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
+ * Register hash table for SERIALIZABLEXID structs. This stores per-xid
* information for serializable transactions which have accessed data.
*/
- info.keysize = sizeof(SERIALIZABLEXIDTAG);
- info.entrysize = sizeof(SERIALIZABLEXID);
-
- SerializableXidHash = ShmemInitHash("SERIALIZABLEXID hash",
- max_serializable_xacts,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_FIXED_SIZE);
+ ShmemRequestHash(&SerializableXidHashDesc,
+ .name = "SERIALIZABLEXID hash",
+ .nelems = max_serializable_xacts,
+ .ptr = &SerializableXidHash,
+ .hash_info.keysize = sizeof(SERIALIZABLEXIDTAG),
+ .hash_info.entrysize = sizeof(SERIALIZABLEXID),
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_FIXED_SIZE,
+ );
/*
* Allocate space for tracking rw-conflicts in lists attached to the
@@ -1214,58 +1228,53 @@ PredicateLockShmemInit(void)
*/
max_rw_conflicts = max_serializable_xacts * 5;
- requestSize = RWConflictPoolHeaderDataSize +
- mul_size((Size) max_rw_conflicts,
- RWConflictDataSize);
+ ShmemRequestStruct(&RWConflictPoolShmemDesc,
+ .name = "RWConflictPool",
+ .size = RWConflictPoolHeaderDataSize + mul_size((Size) max_rw_conflicts,
+ RWConflictDataSize),
+ .ptr = (void **) &RWConflictPool,
+ );
- RWConflictPool = ShmemInitStruct("RWConflictPool",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
-
- /*
- * Create or attach to the header for the list of finished serializable
- * transactions.
- */
- FinishedSerializableTransactions = (dlist_head *)
- ShmemInitStruct("FinishedSerializableTransactions",
- sizeof(dlist_head),
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(&FinishedSerializableShmemDesc,
+ .name = "FinishedSerializableTransactions",
+ .size = sizeof(dlist_head),
+ .ptr = (void **) &FinishedSerializableTransactions,
+ );
/*
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(&SerialSlruDesc,
+ .name = "serializable",
+ .Dir = "pg_serial",
+ .long_segment_names = false,
+
+ .nslots = serializable_buffers,
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SerialPagePrecedesLogically,
+ .errdetail_for_io_error = serial_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SERIAL_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SERIAL_SLRU,
+ );
#ifdef USE_ASSERT_CHECKING
SerialPagePrecedesLogicallyUnitTests();
#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(&SerialControlShmemDesc,
+ .name = "SerialControlData",
+ .size = sizeof(SerialControlData),
+ .ptr = (void **) &serialControl,
+ );
+}
- /*
- * If we just attached to existing shared memory (EXEC_BACKEND), we're all
- * done. Otherwise, during postmaster startup proceed to initialize the
- * shared memory.
- */
- if (IsUnderPostmaster)
- {
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
- return;
- }
+static void
+PredicateLockShmemInit(void *arg)
+{
+ int max_rw_conflicts;
+ bool found;
/*
* Reserve a dummy entry in the hash table; we use it to make sure there's
@@ -1277,7 +1286,6 @@ PredicateLockShmemInit(void)
HASH_ENTER, &found);
Assert(!found);
- /* Initialize PredXact list */
dlist_init(&PredXact->availableList);
dlist_init(&PredXact->activeList);
PredXact->SxactGlobalXmin = InvalidTransactionId;
@@ -1319,6 +1327,9 @@ PredicateLockShmemInit(void)
dlist_init(&RWConflictPool->availableList);
RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
RWConflictPoolHeaderDataSize);
+
+ max_rw_conflicts = max_serializable_xacts * 5;
+
/* Add all elements to available list, clean. */
for (int i = 0; i < max_rw_conflicts; i++)
{
@@ -1335,57 +1346,28 @@ PredicateLockShmemInit(void)
serialControl->headXid = InvalidTransactionId;
serialControl->tailXid = InvalidTransactionId;
LWLockRelease(SerialControlLock);
-}
-
-/*
- * Estimate shared-memory space used for predicate lock table
- */
-Size
-PredicateLockShmemSize(void)
-{
- Size size = 0;
- int64 max_predicate_lock_targets;
- int64 max_predicate_locks;
- int64 max_serializable_xacts;
- int64 max_rw_conflicts;
-
- /* predicate lock target hash table */
- max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
- size = add_size(size, hash_estimate_size(max_predicate_lock_targets,
- sizeof(PREDICATELOCKTARGET)));
-
- /* predicate lock hash table */
- max_predicate_locks = max_predicate_lock_targets * 2;
- size = add_size(size, hash_estimate_size(max_predicate_locks,
- sizeof(PREDICATELOCK)));
- /* transaction list */
- max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
- size = add_size(size, PredXactListDataSize);
- size = add_size(size, mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT)));
-
- /* transaction xid table */
- size = add_size(size, hash_estimate_size(max_serializable_xacts,
- sizeof(SERIALIZABLEXID)));
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- /* rw-conflict pool */
- max_rw_conflicts = max_serializable_xacts * 5;
- size = add_size(size, RWConflictPoolHeaderDataSize);
- size = add_size(size, mul_size((Size) max_rw_conflicts,
- RWConflictDataSize));
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
- /* Head for list of finished serializable transactions. */
- size = add_size(size, sizeof(dlist_head));
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
+}
- /* Shared memory structures for SLRU tracking of old committed xids. */
- size = add_size(size, sizeof(SerialControlData));
- size = add_size(size, SimpleLruShmemSize(serializable_buffers, 0));
+static void
+PredicateLockShmemAttach(void *arg)
+{
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- return size;
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
}
-
/*
* Compute the hash code associated with a PREDICATELOCKTAG.
*
diff --git a/src/backend/utils/activity/pgstat_slru.c b/src/backend/utils/activity/pgstat_slru.c
index 2190f388eae..f4dfe8697d7 100644
--- a/src/backend/utils/activity/pgstat_slru.c
+++ b/src/backend/utils/activity/pgstat_slru.c
@@ -119,6 +119,7 @@ pgstat_get_slru_index(const char *name)
{
int i;
+ Assert(name);
for (i = 0; i < SLRU_NUM_ELEMENTS; i++)
{
if (strcmp(slru_names[i], name) == 0)
diff --git a/src/include/access/clog.h b/src/include/access/clog.h
index a1cfed5f43c..7894998c763 100644
--- a/src/include/access/clog.h
+++ b/src/include/access/clog.h
@@ -40,8 +40,6 @@ extern void TransactionIdSetTreeStatus(TransactionId xid, int nsubxids,
TransactionId *subxids, XidStatus status, XLogRecPtr lsn);
extern XidStatus TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn);
-extern Size CLOGShmemSize(void);
-extern void CLOGShmemInit(void);
extern void BootStrapCLOG(void);
extern void StartupCLOG(void);
extern void TrimCLOG(void);
diff --git a/src/include/access/commit_ts.h b/src/include/access/commit_ts.h
index 49ee21cd5d2..825ccda90ed 100644
--- a/src/include/access/commit_ts.h
+++ b/src/include/access/commit_ts.h
@@ -27,8 +27,6 @@ extern bool TransactionIdGetCommitTsData(TransactionId xid,
extern TransactionId GetLatestCommitTsData(TimestampTz *ts,
ReplOriginId *nodeid);
-extern Size CommitTsShmemSize(void);
-extern void CommitTsShmemInit(void);
extern void BootStrapCommitTs(void);
extern void StartupCommitTs(void);
extern void CommitTsParameterChange(bool newvalue, bool oldvalue);
diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h
index 2ae8b571dcc..6be5299ab68 100644
--- a/src/include/access/multixact.h
+++ b/src/include/access/multixact.h
@@ -121,8 +121,6 @@ extern void AtEOXact_MultiXact(void);
extern void AtPrepare_MultiXact(void);
extern void PostPrepare_MultiXact(FullTransactionId fxid);
-extern Size MultiXactShmemSize(void);
-extern void MultiXactShmemInit(void);
extern void BootStrapMultiXact(void);
extern void StartupMultiXact(void);
extern void TrimMultiXact(void);
diff --git a/src/include/access/slru.h b/src/include/access/slru.h
index f966d0d9fe7..820c7986854 100644
--- a/src/include/access/slru.h
+++ b/src/include/access/slru.h
@@ -16,6 +16,7 @@
#include "access/transam.h"
#include "access/xlogdefs.h"
#include "storage/lwlock.h"
+#include "storage/shmem.h"
#include "storage/sync.h"
/*
@@ -106,23 +107,20 @@ typedef struct SlruSharedData
typedef SlruSharedData *SlruShared;
-/*
- * SlruCtlData is an unshared structure that points to the active information
- * in shared memory.
- */
-typedef struct SlruCtlData
+typedef struct SlruOpts
{
- SlruShared shared;
-
- /* Number of banks in this SLRU. */
- uint16 nbanks;
+ ShmemStructOpts base;
/*
- * If true, use long segment file names. Otherwise, use short file names.
- *
- * For details about the file name format, see SlruFileName().
+ * name of SLRU. (This is user-visible, pick with care!)
*/
- bool long_segment_names;
+ const char *name;
+
+ /* number of page slots to use. */
+ int nslots;
+
+ /* number of LSN groups per page (set to zero if not relevant). */
+ int nlsns;
/*
* Which sync handler function to use when handing sync requests over to
@@ -130,6 +128,19 @@ typedef struct SlruCtlData
*/
SyncRequestHandler sync_handler;
+ /*
+ * PGDATA-relative subdirectory that will contain the files.
+ */
+ const char *Dir;
+
+ /*
+ * If true, use long segment file names. Otherwise, use short file names.
+ *
+ * For details about the file name format, see SlruFileName().
+ */
+ bool long_segment_names;
+
+
/*
* Decide whether a page is "older" for truncation and as a hint for
* evicting pages in LRU order. Return true if every entry of the first
@@ -153,13 +164,28 @@ typedef struct SlruCtlData
int (*errdetail_for_io_error) (const void *opaque_data);
/*
- * Dir is set during SimpleLruInit and does not change thereafter. Since
- * it's always the same, it doesn't need to be in shared memory.
+ * Tranche IDs to use for the SLRU's per-buffer and per-bank LWLocks. If
+ * these are left as zeros, new tranches will be assigned dynamically.
*/
- char Dir[64];
-} SlruCtlData;
+ int buffer_tranche_id;
+ int bank_tranche_id;
+} SlruOpts;
+
+/*
+ * SlruDesc is an unshared structure that points to the active information
+ * in shared memory.
+ */
+typedef struct SlruDesc
+{
+ ShmemStructDesc base;
+
+ SlruOpts options;
-typedef SlruCtlData *SlruCtl;
+ SlruShared shared;
+
+ /* Number of banks in this SLRU. */
+ uint16 nbanks;
+} SlruDesc;
/*
* Get the SLRU bank lock for given SlruCtl and the pageno.
@@ -168,48 +194,52 @@ typedef SlruCtlData *SlruCtl;
* respective bank.
*/
static inline LWLock *
-SimpleLruGetBankLock(SlruCtl ctl, int64 pageno)
+SimpleLruGetBankLock(SlruDesc *ctl, int64 pageno)
{
int bankno;
+ Assert(ctl->nbanks != 0);
bankno = pageno % ctl->nbanks;
return &(ctl->shared->bank_locks[bankno].lock);
}
-extern Size SimpleLruShmemSize(int nslots, int nlsns);
+extern void SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts *options);
+
+#define SimpleLruRequest(desc, ...) \
+ SimpleLruRequestWithOpts(desc, &(SlruOpts){__VA_ARGS__})
+
extern int SimpleLruAutotuneBuffers(int divisor, int max);
-extern void SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id,
- int bank_tranche_id, SyncRequestHandler sync_handler,
- bool long_segment_names);
-extern int SimpleLruZeroPage(SlruCtl ctl, int64 pageno);
-extern void SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno);
-extern int SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+extern int SimpleLruZeroPage(SlruDesc *ctl, int64 pageno);
+extern void SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno);
+extern int SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data);
-extern int SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno,
+extern int SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-extern void SimpleLruWritePage(SlruCtl ctl, int slotno);
-extern void SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied);
+extern void SimpleLruWritePage(SlruDesc *ctl, int slotno);
+extern void SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied);
#ifdef USE_ASSERT_CHECKING
-extern void SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page);
+extern void SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page);
#else
#define SlruPagePrecedesUnitTests(ctl, per_page) do {} while (0)
#endif
-extern void SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage);
-extern bool SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno);
+extern void SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage);
+extern bool SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno);
-typedef bool (*SlruScanCallback) (SlruCtl ctl, char *filename, int64 segpage,
+typedef bool (*SlruScanCallback) (SlruDesc *ctl, char *filename, int64 segpage,
void *data);
-extern bool SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data);
-extern void SlruDeleteSegment(SlruCtl ctl, int64 segno);
+extern bool SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data);
+extern void SlruDeleteSegment(SlruDesc *ctl, int64 segno);
-extern int SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path);
+extern int SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path);
/* SlruScanDirectory public callbacks */
-extern bool SlruScanDirCbReportPresence(SlruCtl ctl, char *filename,
+extern bool SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-extern bool SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage,
+extern bool SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage,
void *data);
extern bool check_slru_buffers(const char *name, int *newval);
+extern void shmem_slru_init(ShmemStructDesc *base_desc, ShmemStructOpts *options);
+extern void shmem_slru_attach(ShmemStructDesc *base_desc, ShmemStructOpts *options);
+
#endif /* SLRU_H */
diff --git a/src/include/access/subtrans.h b/src/include/access/subtrans.h
index 11b7355dbdf..d986cd9e802 100644
--- a/src/include/access/subtrans.h
+++ b/src/include/access/subtrans.h
@@ -15,8 +15,6 @@ extern void SubTransSetParent(TransactionId xid, TransactionId parent);
extern TransactionId SubTransGetParent(TransactionId xid);
extern TransactionId SubTransGetTopmostTransaction(TransactionId xid);
-extern Size SUBTRANSShmemSize(void);
-extern void SUBTRANSShmemInit(void);
extern void BootStrapSUBTRANS(void);
extern void StartupSUBTRANS(TransactionId oldestActiveXID);
extern void CheckPointSUBTRANS(void);
diff --git a/src/include/commands/async.h b/src/include/commands/async.h
index 3baae7cb8dc..202e4aa5e74 100644
--- a/src/include/commands/async.h
+++ b/src/include/commands/async.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT bool Trace_notify;
extern PGDLLIMPORT int max_notify_queue_pages;
extern PGDLLIMPORT volatile sig_atomic_t notifyInterruptPending;
-extern Size AsyncShmemSize(void);
-extern void AsyncShmemInit(void);
-
extern void NotifyMyFrontEnd(const char *channel,
const char *payload,
int32 srcPid);
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index a5ac55b8f7e..443bffb58fd 100644
--- a/src/include/storage/predicate.h
+++ b/src/include/storage/predicate.h
@@ -41,11 +41,6 @@ typedef void *SerializableXactHandle;
/*
* function prototypes
*/
-
-/* housekeeping for shared memory predicate lock structures */
-extern void PredicateLockShmemInit(void);
-extern Size PredicateLockShmemSize(void);
-
extern void CheckPointPredicate(void);
/* predicate lock reporting */
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index f57672fbc58..42030145328 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -29,6 +29,7 @@ typedef enum
{
SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
SHMEM_KIND_HASH, /* a hash table */
+ SHMEM_KIND_SLRU, /* SLRU buffers and control structures */
} ShmemAreaKind;
/*
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index d62c29f1361..c199f18a27a 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -32,6 +32,13 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+
+/* predicate lock manager */
+PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
@@ -43,3 +50,6 @@ PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
/* interprocess signaling mechanisms */
PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+
+/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
diff --git a/src/test/modules/test_slru/test_slru.c b/src/test/modules/test_slru/test_slru.c
index e4bd2af0bf5..3c2a143b4d5 100644
--- a/src/test/modules/test_slru/test_slru.c
+++ b/src/test/modules/test_slru/test_slru.c
@@ -40,14 +40,22 @@ PG_FUNCTION_INFO_V1(test_slru_delete_all);
/* Number of SLRU page slots */
#define NUM_TEST_BUFFERS 16
-static SlruCtlData TestSlruCtlData;
-#define TestSlruCtl (&TestSlruCtlData)
+static void test_slru_shmem_request(void *arg);
+static bool test_slru_page_precedes_logically(int64 page1, int64 page2);
+static int test_slru_errdetail_for_io_error(const void *opaque_data);
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static const char *TestSlruDir = "pg_test_slru";
+
+static SlruDesc TestSlruDesc;
+
+static const ShmemCallbacks test_slru_shmem_callbacks = {
+ .request_fn = test_slru_shmem_request
+};
+
+#define TestSlruCtl (&TestSlruDesc)
static bool
-test_slru_scan_cb(SlruCtl ctl, char *filename, int64 segpage, void *data)
+test_slru_scan_cb(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
elog(NOTICE, "Calling test_slru_scan_cb()");
return SlruScanDirCbDeleteAll(ctl, filename, segpage, data);
@@ -190,20 +198,6 @@ test_slru_delete_all(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
-/*
- * Module load callbacks and initialization.
- */
-
-static void
-test_slru_shmem_request(void)
-{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- /* reserve shared memory for the test SLRU */
- RequestAddinShmemSpace(SimpleLruShmemSize(NUM_TEST_BUFFERS, 0));
-}
-
static bool
test_slru_page_precedes_logically(int64 page1, int64 page2)
{
@@ -218,48 +212,6 @@ test_slru_errdetail_for_io_error(const void *opaque_data)
return errdetail("Could not access test_slru entry %u.", xid);
}
-static void
-test_slru_shmem_startup(void)
-{
- /*
- * Short segments names are well tested elsewhere so in this test we are
- * focusing on long names.
- */
- const bool long_segment_names = true;
- const char slru_dir_name[] = "pg_test_slru";
- int test_tranche_id = -1;
- int test_buffer_tranche_id = -1;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /*
- * Create the SLRU directory if it does not exist yet, from the root of
- * the data directory.
- */
- (void) MakePGDirectory(slru_dir_name);
-
- /*
- * Initialize the SLRU facility. In EXEC_BACKEND builds, the
- * shmem_startup_hook is called in the postmaster and in each backend, but
- * we only need to generate the LWLock tranches once. Note that these
- * tranche ID variables are not used by SimpleLruInit() when
- * IsUnderPostmaster is true.
- */
- if (!IsUnderPostmaster)
- {
- test_tranche_id = LWLockNewTrancheId("test_slru_tranche");
- test_buffer_tranche_id = LWLockNewTrancheId("test_buffer_tranche");
- }
-
- TestSlruCtl->PagePrecedes = test_slru_page_precedes_logically;
- TestSlruCtl->errdetail_for_io_error = test_slru_errdetail_for_io_error;
- SimpleLruInit(TestSlruCtl, "TestSLRU",
- NUM_TEST_BUFFERS, 0, slru_dir_name,
- test_buffer_tranche_id, test_tranche_id, SYNC_HANDLER_NONE,
- long_segment_names);
-}
-
void
_PG_init(void)
{
@@ -269,9 +221,37 @@ _PG_init(void)
errdetail("\"%s\" must be loaded with \"shared_preload_libraries\".",
"test_slru")));
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_slru_shmem_request;
+ /*
+ * Create the SLRU directory if it does not exist yet, from the root of
+ * the data directory.
+ */
+ (void) MakePGDirectory(TestSlruDir);
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_slru_shmem_startup;
+ RegisterShmemCallbacks(&test_slru_shmem_callbacks);
+}
+
+static void
+test_slru_shmem_request(void *arg)
+{
+ SimpleLruRequest(&TestSlruDesc,
+ .name = "TestSLRU",
+ .Dir = TestSlruDir,
+
+ /*
+ * Short segments names are well tested elsewhere so in this test we are
+ * focusing on long names.
+ */
+ .long_segment_names = true,
+
+ .nslots = NUM_TEST_BUFFERS,
+ .nlsns = 0,
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = test_slru_page_precedes_logically,
+ .errdetail_for_io_error = test_slru_errdetail_for_io_error,
+
+ /* let slru.c assign these */
+ .buffer_tranche_id = 0,
+ .bank_tranche_id = 0,
+ );
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 5358229b73f..42049d7612e 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2901,10 +2901,10 @@ SlotInvalidationCauseMap
SlotNumber
SlotSyncCtxStruct
SlotSyncSkipReason
-SlruCtl
-SlruCtlData
+SlruDesc
SlruErrorCause
+SlruOpts
SlruPageStatus
SlruScanCallback
SlruSegState
SlruShared
--
2.47.3
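For readers skimming the patch above: SimpleLruRequest() is a variadic macro that wraps its arguments in a C99 compound literal with designated initializers and passes the address to SimpleLruRequestWithOpts(). Below is a minimal standalone sketch of that pattern. DemoOpts, DemoDesc, demo_request() and demo_register() are hypothetical stand-ins, not names from the patch, and copying the options into the descriptor is an assumption about how such a function would need to behave, since the compound literal only lives as long as the caller's enclosing block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for SlruOpts; only a few fields are modeled. */
typedef struct DemoOpts
{
	const char *name;
	const char *dir;
	int			nslots;
	int			nlsns;
	int			buffer_tranche_id;
} DemoOpts;

/* Hypothetical stand-in for SlruDesc: it keeps its own copy of the options. */
typedef struct DemoDesc
{
	DemoOpts	options;
} DemoDesc;

static void
demo_request_with_opts(DemoDesc *desc, const DemoOpts *options)
{
	/* Same style of sanity checks as in SimpleLruRequestWithOpts(). */
	assert(options->name != NULL);
	assert(options->nslots > 0);

	/*
	 * Assumption: the caller's compound literal goes away when its block
	 * ends, so the options are copied rather than referenced.
	 */
	desc->options = *options;
}

/* Same shape as the SimpleLruRequest() macro in slru.h. */
#define demo_request(desc, ...) \
	demo_request_with_opts(desc, &(DemoOpts){__VA_ARGS__})

/* Example caller; fields that are not named are zero-initialized. */
static void
demo_register(DemoDesc *desc)
{
	demo_request(desc,
				 .name = "demo",
				 .dir = "pg_demo",
				 .nslots = 16,
		);
}
```

Fields omitted from the initializer list are zero, which is what lets the test_slru module say "let slru.c assign these" by leaving the tranche IDs at 0.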
[text/x-patch] 0011-Convert-AIO-to-the-new-interface.patch (31.3K, 12-0011-Convert-AIO-to-the-new-interface.patch)
From 69e66d04061a3804181015c543e3ac19a77793b6 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 12:43:16 +0200
Subject: [PATCH 11/14] Convert AIO to the new interface
This replaces the "shmem_size" and "shmem_init" callbacks in the IO
methods table with the same ShmemCallbacks struct that we now use in
other subsystems.
---
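A rough sketch of the idea described above, for illustration only: an IO method table that embeds a three-callback struct in place of separate size/init function pointers. The field names request_fn, init_fn and attach_fn and the void *arg signature follow the patch. Everything prefixed Demo/demo_ is hypothetical, and the driver's ordering (request always, then init in the postmaster or attach in a child process) is an assumption made for this sketch, not a description of what shmem.c does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the three-callback struct used in the patch. */
typedef struct DemoShmemCallbacks
{
	void		(*request_fn) (void *arg);
	void		(*init_fn) (void *arg);
	void		(*attach_fn) (void *arg);
} DemoShmemCallbacks;

/* Hypothetical IO method table embedding the callbacks. */
typedef struct DemoIoMethodOps
{
	const char *name;
	DemoShmemCallbacks shmem_callbacks;
} DemoIoMethodOps;

/* Counters so the effect of each callback can be observed. */
static int	demo_requested;
static int	demo_initialized;
static int	demo_attached;

static void
demo_request_cb(void *arg)
{
	(void) arg;
	demo_requested++;
}

static void
demo_init_cb(void *arg)
{
	(void) arg;
	demo_initialized++;
}

static void
demo_attach_cb(void *arg)
{
	(void) arg;
	demo_attached++;
}

static const DemoIoMethodOps demo_method = {
	.name = "demo",
	.shmem_callbacks = {
		.request_fn = demo_request_cb,
		.init_fn = demo_init_cb,
		.attach_fn = demo_attach_cb,
	},
};

/*
 * Assumed driver: request in every process, then either initialize
 * (first process) or attach (later processes).  Unset callbacks are skipped.
 */
static void
demo_run_callbacks(const DemoShmemCallbacks *cb, bool is_under_postmaster)
{
	if (cb->request_fn)
		cb->request_fn(NULL);

	if (!is_under_postmaster)
	{
		if (cb->init_fn)
			cb->init_fn(NULL);
	}
	else if (cb->attach_fn)
		cb->attach_fn(NULL);
}
```

With this shape a method that needs no shared memory can leave all three pointers NULL instead of supplying stub size/init functions.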
src/backend/access/transam/clog.c | 20 ++--
src/backend/access/transam/commit_ts.c | 24 ++---
src/backend/access/transam/multixact.c | 42 ++++----
src/backend/access/transam/slru.c | 8 +-
src/backend/access/transam/subtrans.c | 20 ++--
src/backend/commands/async.c | 30 +++---
src/backend/storage/aio/aio_init.c | 121 ++++++++++++++--------
src/backend/storage/aio/method_io_uring.c | 42 +++++---
src/backend/storage/aio/method_worker.c | 86 ++++++++-------
src/backend/storage/ipc/ipci.c | 2 -
src/backend/storage/lmgr/predicate.c | 89 ++++++++--------
src/include/access/slru.h | 6 +-
src/include/storage/aio_internal.h | 16 +--
src/include/storage/aio_subsys.h | 4 -
src/include/storage/subsystemlist.h | 3 +
src/test/modules/test_slru/test_slru.c | 40 +++----
16 files changed, 296 insertions(+), 257 deletions(-)
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 87f7f5707de..95f160879e0 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -810,19 +810,19 @@ CLOGShmemRequest(void *arg)
}
Assert(transaction_buffers != 0);
SimpleLruRequest(&XactSlruDesc,
- .name = "transaction",
- .Dir = "pg_xact",
- .long_segment_names = false,
+ .name = "transaction",
+ .Dir = "pg_xact",
+ .long_segment_names = false,
- .nslots = CLOGShmemBuffers(),
- .nlsns = CLOG_LSNS_PER_PAGE,
+ .nslots = CLOGShmemBuffers(),
+ .nlsns = CLOG_LSNS_PER_PAGE,
- .sync_handler = SYNC_HANDLER_CLOG,
- .PagePrecedes = CLOGPagePrecedes,
- .errdetail_for_io_error = clog_errdetail_for_io_error,
+ .sync_handler = SYNC_HANDLER_CLOG,
+ .PagePrecedes = CLOGPagePrecedes,
+ .errdetail_for_io_error = clog_errdetail_for_io_error,
- .buffer_tranche_id = LWTRANCHE_XACT_BUFFER,
- .bank_tranche_id = LWTRANCHE_XACT_SLRU,
+ .buffer_tranche_id = LWTRANCHE_XACT_BUFFER,
+ .bank_tranche_id = LWTRANCHE_XACT_SLRU,
);
}
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 236d8fb4baa..675dac9e40f 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -551,24 +551,24 @@ CommitTsShmemRequest(void *arg)
}
Assert(commit_timestamp_buffers != 0);
SimpleLruRequest(&CommitTsSlruDesc,
- .name = "commit_timestamp",
- .Dir = "pg_commit_ts",
- .long_segment_names = false,
+ .name = "commit_timestamp",
+ .Dir = "pg_commit_ts",
+ .long_segment_names = false,
- .nslots = CommitTsShmemBuffers(),
+ .nslots = CommitTsShmemBuffers(),
- .PagePrecedes = CommitTsPagePrecedes,
- .errdetail_for_io_error = commit_ts_errdetail_for_io_error,
+ .PagePrecedes = CommitTsPagePrecedes,
+ .errdetail_for_io_error = commit_ts_errdetail_for_io_error,
- .sync_handler = SYNC_HANDLER_COMMIT_TS,
- .buffer_tranche_id = LWTRANCHE_COMMITTS_BUFFER,
- .bank_tranche_id = LWTRANCHE_COMMITTS_SLRU,
+ .sync_handler = SYNC_HANDLER_COMMIT_TS,
+ .buffer_tranche_id = LWTRANCHE_COMMITTS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_COMMITTS_SLRU,
);
ShmemRequestStruct(&CommitTsShmemDesc,
- .name = "CommitTs shared",
- .size = sizeof(CommitTimestampShared),
- .ptr = (void **) &commitTsShared,
+ .name = "CommitTs shared",
+ .size = sizeof(CommitTimestampShared),
+ .ptr = (void **) &commitTsShared,
);
}
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 940ac5a78d6..88e46d6868d 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -1779,39 +1779,39 @@ MultiXactShmemRequest(void *arg)
size = add_size(size,
mul_size(sizeof(MultiXactId), NumVisibleSlots));
ShmemRequestStruct(&MultiXactShmemDesc,
- .name = "Shared MultiXact State",
- .size = size,
- .ptr = (void **) &MultiXactState,
+ .name = "Shared MultiXact State",
+ .size = size,
+ .ptr = (void **) &MultiXactState,
);
SimpleLruRequest(&MultiXactOffsetSlruDesc,
- .name = "multixact_offset",
- .Dir = "pg_multixact/offsets",
- .long_segment_names = false,
+ .name = "multixact_offset",
+ .Dir = "pg_multixact/offsets",
+ .long_segment_names = false,
- .nslots = multixact_offset_buffers,
+ .nslots = multixact_offset_buffers,
- .sync_handler = SYNC_HANDLER_MULTIXACT_OFFSET,
- .PagePrecedes = MultiXactOffsetPagePrecedes,
- .errdetail_for_io_error = MultiXactOffsetIoErrorDetail,
+ .sync_handler = SYNC_HANDLER_MULTIXACT_OFFSET,
+ .PagePrecedes = MultiXactOffsetPagePrecedes,
+ .errdetail_for_io_error = MultiXactOffsetIoErrorDetail,
- .buffer_tranche_id = LWTRANCHE_MULTIXACTOFFSET_BUFFER,
- .bank_tranche_id = LWTRANCHE_MULTIXACTOFFSET_SLRU,
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTOFFSET_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTOFFSET_SLRU,
);
SimpleLruRequest(&MultiXactMemberSlruDesc,
- .name = "multixact_member",
- .Dir = "pg_multixact/members",
- .long_segment_names = true,
+ .name = "multixact_member",
+ .Dir = "pg_multixact/members",
+ .long_segment_names = true,
- .nslots = multixact_member_buffers,
+ .nslots = multixact_member_buffers,
- .sync_handler = SYNC_HANDLER_MULTIXACT_MEMBER,
- .PagePrecedes = MultiXactMemberPagePrecedes,
- .errdetail_for_io_error = MultiXactMemberIoErrorDetail,
+ .sync_handler = SYNC_HANDLER_MULTIXACT_MEMBER,
+ .PagePrecedes = MultiXactMemberPagePrecedes,
+ .errdetail_for_io_error = MultiXactMemberIoErrorDetail,
- .buffer_tranche_id = LWTRANCHE_MULTIXACTMEMBER_BUFFER,
- .bank_tranche_id = LWTRANCHE_MULTIXACTMEMBER_SLRU,
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTMEMBER_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTMEMBER_SLRU,
);
/*
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index 3fe60c5804b..6d9dda6b29b 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -242,9 +242,9 @@ SimpleLruAutotuneBuffers(int divisor, int max)
* Register a simple LRU cache in shared memory.
*/
void
-SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts *options)
+SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts * options)
{
- SlruOpts *options_copy;
+ SlruOpts *options_copy;
Assert(options->name != NULL);
Assert(options->nslots > 0);
@@ -265,7 +265,7 @@ SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts *options)
void
shmem_slru_init(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
{
- SlruOpts *options = (SlruOpts *) base_options;
+ SlruOpts *options = (SlruOpts *) base_options;
SlruDesc *desc = (SlruDesc *) base_desc;
char namebuf[NAMEDATALEN];
SlruShared shared;
@@ -356,7 +356,7 @@ shmem_slru_init(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
void
shmem_slru_attach(ShmemStructDesc *base_desc, ShmemStructOpts *base_options)
{
- SlruOpts *options = (SlruOpts *) base_options;
+ SlruOpts *options = (SlruOpts *) base_options;
SlruDesc *desc = (SlruDesc *) base_desc;
int nslots = options->nslots;
int nbanks = nslots / SLRU_BANK_SIZE;
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index ca273fb4680..5f68c3f0cca 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -244,19 +244,19 @@ SUBTRANSShmemRequest(void *arg)
Assert(subtransaction_buffers != 0);
SimpleLruRequest(&SubTransSlruDesc,
- .name = "subtransaction",
- .Dir = "pg_subtrans",
- .long_segment_names = false,
+ .name = "subtransaction",
+ .Dir = "pg_subtrans",
+ .long_segment_names = false,
- .nslots = SUBTRANSShmemBuffers(),
+ .nslots = SUBTRANSShmemBuffers(),
- .sync_handler = SYNC_HANDLER_NONE,
- .PagePrecedes = SubTransPagePrecedes,
- .errdetail_for_io_error = subtrans_errdetail_for_io_error,
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SubTransPagePrecedes,
+ .errdetail_for_io_error = subtrans_errdetail_for_io_error,
- .buffer_tranche_id = LWTRANCHE_SUBTRANS_BUFFER,
- .bank_tranche_id = LWTRANCHE_SUBTRANS_SLRU,
- );
+ .buffer_tranche_id = LWTRANCHE_SUBTRANS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SUBTRANS_SLRU,
+ );
}
static void
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index 9cd27695787..b8345295f0d 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -804,27 +804,27 @@ AsyncShmemRequest(void *arg)
size = add_size(size, offsetof(AsyncQueueControl, backend));
ShmemRequestStruct(&AsyncQueueControlShmemDesc,
- .name = "Async Queue Control",
- .size = size,
- .ptr = (void **) &asyncQueueControl,
- );
+ .name = "Async Queue Control",
+ .size = size,
+ .ptr = (void **) &asyncQueueControl,
+ );
SimpleLruRequest(&NotifySlruDesc,
- .name = "notify",
- .Dir = "pg_notify",
+ .name = "notify",
+ .Dir = "pg_notify",
- /* long segment names are used in order to avoid wraparound */
- .long_segment_names = true,
+ /* long segment names are used in order to avoid wraparound */
+ .long_segment_names = true,
- .nslots = notify_buffers,
+ .nslots = notify_buffers,
- .sync_handler = SYNC_HANDLER_NONE,
- .PagePrecedes = asyncQueuePagePrecedes,
- .errdetail_for_io_error = asyncQueueErrdetailForIoError,
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = asyncQueuePagePrecedes,
+ .errdetail_for_io_error = asyncQueueErrdetailForIoError,
- .buffer_tranche_id = LWTRANCHE_NOTIFY_BUFFER,
- .bank_tranche_id = LWTRANCHE_NOTIFY_SLRU,
- );
+ .buffer_tranche_id = LWTRANCHE_NOTIFY_BUFFER,
+ .bank_tranche_id = LWTRANCHE_NOTIFY_SLRU,
+ );
}
static void
diff --git a/src/backend/storage/aio/aio_init.c b/src/backend/storage/aio/aio_init.c
index d3c68d8b04c..9efe53912ec 100644
--- a/src/backend/storage/aio/aio_init.c
+++ b/src/backend/storage/aio/aio_init.c
@@ -23,16 +23,24 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
+static void AioShmemRequest(void *arg);
+static void AioShmemInit(void *arg);
+static void AioShmemAttach(void *arg);
-static Size
-AioCtlShmemSize(void)
-{
- /* pgaio_ctl itself */
- return sizeof(PgAioCtl);
-}
+const ShmemCallbacks AioShmemCallbacks = {
+ .request_fn = AioShmemRequest,
+ .init_fn = AioShmemInit,
+ .attach_fn = AioShmemAttach,
+};
+
+static PgAioBackend *AioBackendShmemPtr;
+static PgAioHandle *AioHandleShmemPtr;
+static struct iovec *AioHandleIOVShmemPtr;
+static uint64 *AioHandleDataShmemPtr;
static uint32
AioProcs(void)
@@ -109,12 +117,21 @@ AioChooseMaxConcurrency(void)
return Min(max_proportional_pins, 64);
}
-Size
-AioShmemSize(void)
+/*
+ * Register shared memory area for AIO subsystem.
+ */
+static void
+AioShmemRequest(void *arg)
{
- Size sz = 0;
+ static ShmemStructDesc AioCtlShmemDesc;
+ static ShmemStructDesc AioBackendShmemDesc;
+ static ShmemStructDesc AioHandleShmemDesc;
+ static ShmemStructDesc AioHandleIOVShmemDesc;
+ static ShmemStructDesc AioHandleDataShmemDesc;
/*
+ * Resolve io_max_concurrency if not already done
+ *
* We prefer to report this value's source as PGC_S_DYNAMIC_DEFAULT.
* However, if the DBA explicitly set io_max_concurrency = -1 in the
* config file, then PGC_S_DYNAMIC_DEFAULT will fail to override that and
@@ -132,48 +149,57 @@ AioShmemSize(void)
PGC_S_OVERRIDE);
}
- sz = add_size(sz, AioCtlShmemSize());
- sz = add_size(sz, AioBackendShmemSize());
- sz = add_size(sz, AioHandleShmemSize());
- sz = add_size(sz, AioHandleIOVShmemSize());
- sz = add_size(sz, AioHandleDataShmemSize());
-
- /* Reserve space for method specific resources. */
- if (pgaio_method_ops->shmem_size)
- sz = add_size(sz, pgaio_method_ops->shmem_size());
-
- return sz;
+ ShmemRequestStruct(&AioCtlShmemDesc,
+ .name = "AioCtl",
+ .size = sizeof(PgAioCtl),
+ .ptr = (void **) &pgaio_ctl,
+ );
+
+ ShmemRequestStruct(&AioBackendShmemDesc,
+ .name = "AioBackend",
+ .size = AioBackendShmemSize(),
+ .ptr = (void **) &AioBackendShmemPtr,
+ );
+
+ ShmemRequestStruct(&AioHandleShmemDesc,
+ .name = "AioHandle",
+ .size = AioHandleShmemSize(),
+ .ptr = (void **) &AioHandleShmemPtr,
+ );
+
+ ShmemRequestStruct(&AioHandleIOVShmemDesc,
+ .name = "AioHandleIOV",
+ .size = AioHandleIOVShmemSize(),
+ .ptr = (void **) &AioHandleIOVShmemPtr,
+ );
+
+ ShmemRequestStruct(&AioHandleDataShmemDesc,
+ .name = "AioHandleData",
+ .size = AioHandleDataShmemSize(),
+ .ptr = (void **) &AioHandleDataShmemPtr,
+ );
+
+ if (pgaio_method_ops->shmem_callbacks.request_fn)
+ pgaio_method_ops->shmem_callbacks.request_fn(pgaio_method_ops->shmem_callbacks.request_fn_arg);
}
-void
-AioShmemInit(void)
+/*
+ * Initialize AIO shared memory during postmaster startup.
+ */
+static void
+AioShmemInit(void *arg)
{
- bool found;
uint32 io_handle_off = 0;
uint32 iovec_off = 0;
uint32 per_backend_iovecs = io_max_concurrency * io_max_combine_limit;
- pgaio_ctl = (PgAioCtl *)
- ShmemInitStruct("AioCtl", AioCtlShmemSize(), &found);
-
- if (found)
- goto out;
-
- memset(pgaio_ctl, 0, AioCtlShmemSize());
-
pgaio_ctl->io_handle_count = AioProcs() * io_max_concurrency;
pgaio_ctl->iovec_count = AioProcs() * per_backend_iovecs;
- pgaio_ctl->backend_state = (PgAioBackend *)
- ShmemInitStruct("AioBackend", AioBackendShmemSize(), &found);
-
- pgaio_ctl->io_handles = (PgAioHandle *)
- ShmemInitStruct("AioHandle", AioHandleShmemSize(), &found);
-
- pgaio_ctl->iovecs = (struct iovec *)
- ShmemInitStruct("AioHandleIOV", AioHandleIOVShmemSize(), &found);
- pgaio_ctl->handle_data = (uint64 *)
- ShmemInitStruct("AioHandleData", AioHandleDataShmemSize(), &found);
+ pgaio_ctl->backend_state = AioBackendShmemPtr;
+ pgaio_ctl->io_handles = AioHandleShmemPtr;
+ pgaio_ctl->iovecs = AioHandleIOVShmemPtr;
+ pgaio_ctl->handle_data = AioHandleDataShmemPtr;
for (int procno = 0; procno < AioProcs(); procno++)
{
@@ -208,10 +234,15 @@ AioShmemInit(void)
}
}
-out:
- /* Initialize IO method specific resources. */
- if (pgaio_method_ops->shmem_init)
- pgaio_method_ops->shmem_init(!found);
+ if (pgaio_method_ops->shmem_callbacks.init_fn)
+ pgaio_method_ops->shmem_callbacks.init_fn(pgaio_method_ops->shmem_callbacks.init_fn_arg);
+}
+
+static void
+AioShmemAttach(void *arg)
+{
+ if (pgaio_method_ops->shmem_callbacks.attach_fn)
+ pgaio_method_ops->shmem_callbacks.attach_fn(pgaio_method_ops->shmem_callbacks.attach_fn_arg);
}
void
diff --git a/src/backend/storage/aio/method_io_uring.c b/src/backend/storage/aio/method_io_uring.c
index 39984df31b4..82fbbefc0b6 100644
--- a/src/backend/storage/aio/method_io_uring.c
+++ b/src/backend/storage/aio/method_io_uring.c
@@ -49,8 +49,8 @@
/* Entry points for IoMethodOps. */
-static size_t pgaio_uring_shmem_size(void);
-static void pgaio_uring_shmem_init(bool first_time);
+static void pgaio_uring_shmem_request(void *arg);
+static void pgaio_uring_shmem_init(void *arg);
static void pgaio_uring_init_backend(void);
static int pgaio_uring_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
static void pgaio_uring_wait_one(PgAioHandle *ioh, uint64 ref_generation);
@@ -59,7 +59,6 @@ static void pgaio_uring_check_one(PgAioHandle *ioh, uint64 ref_generation);
/* helper functions */
static void pgaio_uring_sq_from_io(PgAioHandle *ioh, struct io_uring_sqe *sqe);
-
const IoMethodOps pgaio_uring_ops = {
/*
* While io_uring mostly is OK with FDs getting closed while the IO is in
@@ -70,8 +69,8 @@ const IoMethodOps pgaio_uring_ops = {
*/
.wait_on_fd_before_close = true,
- .shmem_size = pgaio_uring_shmem_size,
- .shmem_init = pgaio_uring_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_uring_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_uring_shmem_init,
.init_backend = pgaio_uring_init_backend,
.submit = pgaio_uring_submit,
@@ -267,23 +266,34 @@ pgaio_uring_shmem_size(void)
{
size_t sz;
+ sz = pgaio_uring_context_shmem_size();
+ sz = add_size(sz, pgaio_uring_ring_shmem_size());
+
+ return sz;
+}
+
+static void
+pgaio_uring_shmem_request(void *arg)
+{
+ static ShmemStructDesc AioUringShmemDesc;
+
/*
* Kernel and liburing support for various features influences how much
* shmem we need, perform the necessary checks.
*/
pgaio_uring_check_capabilities();
- sz = pgaio_uring_context_shmem_size();
- sz = add_size(sz, pgaio_uring_ring_shmem_size());
-
- return sz;
+ ShmemRequestStruct(&AioUringShmemDesc,
+ .name = "AioUringContext",
+ .size = pgaio_uring_shmem_size(),
+ .ptr = (void **) &pgaio_uring_contexts,
+ );
}
static void
-pgaio_uring_shmem_init(bool first_time)
+pgaio_uring_shmem_init(void *arg)
{
int TotalProcs = pgaio_uring_procs();
- bool found;
char *shmem;
size_t ring_mem_remain = 0;
char *ring_mem_next = 0;
@@ -291,13 +301,11 @@ pgaio_uring_shmem_init(bool first_time)
/*
* We allocate memory for all PgAioUringContext instances and, if
* supported, the memory required for each of the io_uring instances, in
- * one ShmemInitStruct().
+ * one combined allocation.
+ *
+ * pgaio_uring_contexts is already set to the base of the allocation.
*/
- shmem = ShmemInitStruct("AioUringContext", pgaio_uring_shmem_size(), &found);
- if (found)
- return;
-
- pgaio_uring_contexts = (PgAioUringContext *) shmem;
+ shmem = (char *) pgaio_uring_contexts;
shmem += pgaio_uring_context_shmem_size();
/* if supported, handle memory alignment / sizing for io_uring memory */
diff --git a/src/backend/storage/aio/method_worker.c b/src/backend/storage/aio/method_worker.c
index efe38e9f113..7697ae34d66 100644
--- a/src/backend/storage/aio/method_worker.c
+++ b/src/backend/storage/aio/method_worker.c
@@ -41,6 +41,7 @@
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
#include "tcop/tcopprot.h"
#include "utils/injection_point.h"
#include "utils/memdebug.h"
@@ -73,16 +74,20 @@ typedef struct PgAioWorkerControl
} PgAioWorkerControl;
-static size_t pgaio_worker_shmem_size(void);
-static void pgaio_worker_shmem_init(bool first_time);
+static void pgaio_worker_shmem_request(void *arg);
+static void pgaio_worker_shmem_init(void *arg);
+static void pgaio_worker_shmem_attach(void *arg);
+
+static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static bool pgaio_worker_needs_synchronous_execution(PgAioHandle *ioh);
static int pgaio_worker_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
const IoMethodOps pgaio_worker_ops = {
- .shmem_size = pgaio_worker_shmem_size,
- .shmem_init = pgaio_worker_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_worker_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_worker_shmem_init,
+ .shmem_callbacks.attach_fn = pgaio_worker_shmem_attach,
.needs_synchronous_execution = pgaio_worker_needs_synchronous_execution,
.submit = pgaio_worker_submit,
@@ -95,7 +100,6 @@ int io_workers = 3;
static int io_worker_queue_size = 64;
static int MyIoWorkerId;
-static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static PgAioWorkerControl *io_worker_control;
@@ -116,50 +120,62 @@ pgaio_worker_control_shmem_size(void)
sizeof(PgAioWorkerSlot) * MAX_IO_WORKERS;
}
-static size_t
-pgaio_worker_shmem_size(void)
+/*
+ * Set secondary AIO worker pointer from the combined allocation.
+ */
+static void
+pgaio_worker_set_secondary_ptr(void)
{
- size_t sz;
int queue_size;
+ Size queue_sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = add_size(sz, pgaio_worker_control_shmem_size());
-
- return sz;
+ io_worker_control = (PgAioWorkerControl *)
+ ((char *) io_worker_submission_queue + MAXALIGN(queue_sz));
}
static void
-pgaio_worker_shmem_init(bool first_time)
+pgaio_worker_shmem_init(void *arg)
{
- bool found;
int queue_size;
- io_worker_submission_queue =
- ShmemInitStruct("AioWorkerSubmissionQueue",
- pgaio_worker_queue_shmem_size(&queue_size),
- &found);
- if (!found)
- {
- io_worker_submission_queue->size = queue_size;
- io_worker_submission_queue->head = 0;
- io_worker_submission_queue->tail = 0;
- }
+ pgaio_worker_queue_shmem_size(&queue_size);
+ io_worker_submission_queue->size = queue_size;
+ io_worker_submission_queue->head = 0;
+ io_worker_submission_queue->tail = 0;
- io_worker_control =
- ShmemInitStruct("AioWorkerControl",
- pgaio_worker_control_shmem_size(),
- &found);
- if (!found)
+ pgaio_worker_set_secondary_ptr();
+
+ io_worker_control->idle_worker_mask = 0;
+ for (int i = 0; i < MAX_IO_WORKERS; ++i)
{
- io_worker_control->idle_worker_mask = 0;
- for (int i = 0; i < MAX_IO_WORKERS; ++i)
- {
- io_worker_control->workers[i].latch = NULL;
- io_worker_control->workers[i].in_use = false;
- }
+ io_worker_control->workers[i].latch = NULL;
+ io_worker_control->workers[i].in_use = false;
}
}
+static void
+pgaio_worker_shmem_attach(void *arg)
+{
+ pgaio_worker_set_secondary_ptr();
+}
+
+static void
+pgaio_worker_shmem_request(void *arg)
+{
+ static ShmemStructDesc AioWorkerShmemDesc;
+ size_t size;
+ int queue_size;
+
+ size = MAXALIGN(pgaio_worker_queue_shmem_size(&queue_size)) +
+ pgaio_worker_control_shmem_size();
+
+ ShmemRequestStruct(&AioWorkerShmemDesc,
+ .name = "AioWorkerSubmissionQueue",
+ .size = size,
+ .ptr = (void **) &io_worker_submission_queue,
+ );
+}
+
static int
pgaio_worker_choose_idle(void)
{
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index faaf9c471f2..bd67f81cea3 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -122,7 +122,6 @@ CalculateShmemSize(void)
size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, InjectionPointShmemSize());
size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, AioShmemSize());
size = add_size(size, WaitLSNShmemSize());
size = add_size(size, LogicalDecodingCtlShmemSize());
size = add_size(size, DataChecksumsShmemSize());
@@ -301,7 +300,6 @@ CreateOrAttachShmemStructs(void)
StatsShmemInit();
WaitEventCustomShmemInit();
InjectionPointShmemInit();
- AioShmemInit();
WaitLSNShmemInit();
LogicalDecodingCtlShmemInit();
}
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 02dbbf30950..f1d38338c83 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1149,15 +1149,14 @@ PredicateLockShmemRequest(void *arg)
* per-predicate-lock-target information.
*/
ShmemRequestHash(&PredicateLockTargetHashDesc,
- .name = "PREDICATELOCKTARGET hash",
- .nelems = max_predicate_lock_targets,
-
- .ptr = &PredicateLockTargetHash,
- .hash_info.keysize = sizeof(PREDICATELOCKTARGETTAG),
- .hash_info.entrysize = sizeof(PREDICATELOCKTARGET),
- .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
- .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
- );
+ .name = "PREDICATELOCKTARGET hash",
+ .nelems = max_predicate_lock_targets,
+ .ptr = &PredicateLockTargetHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTARGETTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCKTARGET),
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Allocate hash table for PREDICATELOCK structs. This stores per
@@ -1168,16 +1167,14 @@ PredicateLockShmemRequest(void *arg)
max_predicate_locks = max_predicate_lock_targets * 2;
ShmemRequestHash(&PredicateLockHashDesc,
- .name = "PREDICATELOCK hash",
-
- .nelems = max_predicate_locks,
-
- .ptr = &PredicateLockHash,
- .hash_info.keysize = sizeof(PREDICATELOCKTAG),
- .hash_info.entrysize = sizeof(PREDICATELOCK),
- .hash_info.hash = predicatelock_hash,
- .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
- .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION | HASH_FIXED_SIZE,
+ .name = "PREDICATELOCK hash",
+ .nelems = max_predicate_locks,
+ .ptr = &PredicateLockHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCK),
+ .hash_info.hash = predicatelock_hash,
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION | HASH_FIXED_SIZE,
);
/*
@@ -1195,11 +1192,11 @@ PredicateLockShmemRequest(void *arg)
* predicate locking.
*/
ShmemRequestStruct(&PredXactListShmemDesc,
- .name = "PredXactList",
- .size = add_size(PredXactListDataSize,
- (mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT)))),
- .ptr = (void **) &PredXact,
+ .name = "PredXactList",
+ .size = add_size(PredXactListDataSize,
+ (mul_size((Size) max_serializable_xacts,
+ sizeof(SERIALIZABLEXACT)))),
+ .ptr = (void **) &PredXact,
);
/*
@@ -1229,45 +1226,45 @@ PredicateLockShmemRequest(void *arg)
max_rw_conflicts = max_serializable_xacts * 5;
ShmemRequestStruct(&RWConflictPoolShmemDesc,
- .name = "RWConflictPool",
- .size = RWConflictPoolHeaderDataSize + mul_size((Size) max_rw_conflicts,
- RWConflictDataSize),
- .ptr = (void **) &RWConflictPool,
- );
+ .name = "RWConflictPool",
+ .size = RWConflictPoolHeaderDataSize + mul_size((Size) max_rw_conflicts,
+ RWConflictDataSize),
+ .ptr = (void **) &RWConflictPool,
+ );
ShmemRequestStruct(&FinishedSerializableShmemDesc,
- .name = "FinishedSerializableTransactions",
- .size = sizeof(dlist_head),
- .ptr = (void **) &FinishedSerializableTransactions,
- );
+ .name = "FinishedSerializableTransactions",
+ .size = sizeof(dlist_head),
+ .ptr = (void **) &FinishedSerializableTransactions,
+ );
/*
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
SimpleLruRequest(&SerialSlruDesc,
- .name = "serializable",
- .Dir = "pg_serial",
- .long_segment_names = false,
+ .name = "serializable",
+ .Dir = "pg_serial",
+ .long_segment_names = false,
- .nslots = serializable_buffers,
+ .nslots = serializable_buffers,
- .sync_handler = SYNC_HANDLER_NONE,
- .PagePrecedes = SerialPagePrecedesLogically,
- .errdetail_for_io_error = serial_errdetail_for_io_error,
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SerialPagePrecedesLogically,
+ .errdetail_for_io_error = serial_errdetail_for_io_error,
- .buffer_tranche_id = LWTRANCHE_SERIAL_BUFFER,
- .bank_tranche_id = LWTRANCHE_SERIAL_SLRU,
+ .buffer_tranche_id = LWTRANCHE_SERIAL_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SERIAL_SLRU,
);
#ifdef USE_ASSERT_CHECKING
SerialPagePrecedesLogicallyUnitTests();
#endif
ShmemRequestStruct(&SerialControlShmemDesc,
- .name = "SerialControlData",
- .size = sizeof(SerialControlData),
- .ptr = (void **) &serialControl,
- );
+ .name = "SerialControlData",
+ .size = sizeof(SerialControlData),
+ .ptr = (void **) &serialControl,
+ );
}
static void
diff --git a/src/include/access/slru.h b/src/include/access/slru.h
index 820c7986854..1dbb0b62525 100644
--- a/src/include/access/slru.h
+++ b/src/include/access/slru.h
@@ -169,7 +169,7 @@ typedef struct SlruOpts
*/
int buffer_tranche_id;
int bank_tranche_id;
-} SlruOpts;
+} SlruOpts;
/*
* SlruDesc is an unshared structure that points to the active information
@@ -179,7 +179,7 @@ typedef struct SlruDesc
{
ShmemStructDesc base;
- SlruOpts options;
+ SlruOpts options;
SlruShared shared;
@@ -203,7 +203,7 @@ SimpleLruGetBankLock(SlruDesc *ctl, int64 pageno)
return &(ctl->shared->bank_locks[bankno].lock);
}
-extern void SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts *options);
+extern void SimpleLruRequestWithOpts(SlruDesc *desc, const SlruOpts * options);
#define SimpleLruRequest(desc, ...) \
SimpleLruRequestWithOpts(desc, &(SlruOpts){__VA_ARGS__})
diff --git a/src/include/storage/aio_internal.h b/src/include/storage/aio_internal.h
index 33e1e2dc048..9ca4087aa7f 100644
--- a/src/include/storage/aio_internal.h
+++ b/src/include/storage/aio_internal.h
@@ -20,6 +20,8 @@
#include "port/pg_iovec.h"
#include "storage/aio.h"
#include "storage/condition_variable.h"
+#include "storage/ipc.h"
+#include "storage/shmem.h"
/*
@@ -267,20 +269,8 @@ typedef struct IoMethodOps
*/
bool wait_on_fd_before_close;
-
/* global initialization */
-
- /*
- * Amount of additional shared memory to reserve for the io_method. Called
- * just like a normal ipci.c style *Size() function. Optional.
- */
- size_t (*shmem_size) (void);
-
- /*
- * Initialize shared memory. First time is true if AIO's shared memory was
- * just initialized, false otherwise. Optional.
- */
- void (*shmem_init) (bool first_time);
+ ShmemCallbacks shmem_callbacks;
/*
* Per-backend initialization. Optional.
diff --git a/src/include/storage/aio_subsys.h b/src/include/storage/aio_subsys.h
index 276cb3e31c4..dd54869351f 100644
--- a/src/include/storage/aio_subsys.h
+++ b/src/include/storage/aio_subsys.h
@@ -20,12 +20,8 @@
/* aio_init.c */
-extern Size AioShmemSize(void);
-extern void AioShmemInit(void);
-
extern void pgaio_init_backend(void);
-
/* aio.c */
extern void pgaio_error_cleanup(void);
extern void AtEOXact_Aio(bool is_commit);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index c199f18a27a..b438794d46d 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -53,3 +53,6 @@ PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
/* other modules that need some shared memory space */
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
+
+/* AIO subsystem. This delegates to the method-specific callbacks */
+PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
diff --git a/src/test/modules/test_slru/test_slru.c b/src/test/modules/test_slru/test_slru.c
index 3c2a143b4d5..6bd1bec72c5 100644
--- a/src/test/modules/test_slru/test_slru.c
+++ b/src/test/modules/test_slru/test_slru.c
@@ -234,24 +234,24 @@ static void
test_slru_shmem_request(void *arg)
{
SimpleLruRequest(&TestSlruDesc,
- .name = "TestSLRU",
- .Dir = TestSlruDir,
-
- /*
- * Short segments names are well tested elsewhere so in this test we are
- * focusing on long names.
- */
- .long_segment_names = true,
-
- .nslots = NUM_TEST_BUFFERS,
- .nlsns = 0,
-
- .sync_handler = SYNC_HANDLER_NONE,
- .PagePrecedes = test_slru_page_precedes_logically,
- .errdetail_for_io_error = test_slru_errdetail_for_io_error,
-
- /* let slru.c assign these */
- .buffer_tranche_id = 0,
- .bank_tranche_id = 0,
- );
+ .name = "TestSLRU",
+ .Dir = TestSlruDir,
+
+ /*
+ * Short segments names are well tested elsewhere so in this test we are
+ * focusing on long names.
+ */
+ .long_segment_names = true,
+
+ .nslots = NUM_TEST_BUFFERS,
+ .nlsns = 0,
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = test_slru_page_precedes_logically,
+ .errdetail_for_io_error = test_slru_errdetail_for_io_error,
+
+ /* let slru.c assign these */
+ .buffer_tranche_id = 0,
+ .bank_tranche_id = 0,
+ );
}
--
2.47.3
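
A note on the method_worker.c hunk above: the worker method now makes one request, and the control struct is located by offsetting from the queue pointer by the MAXALIGN'd queue size. The sketch below shows only that carving step. All Demo* names, the struct layouts and the 8-byte alignment are assumptions made for illustration; they are not the real PgAioWorker* definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for MAXALIGN: round up to a multiple of 8 bytes. */
#define DEMO_MAXALIGN(len) (((size_t) (len) + 7) & ~(size_t) 7)

/* Hypothetical primary structure, placed at the start of the allocation. */
typedef struct DemoQueue
{
	uint32_t	size;
	uint32_t	head;
	uint32_t	tail;
} DemoQueue;

/* Hypothetical secondary structure, placed right after the padded queue. */
typedef struct DemoControl
{
	uint64_t	idle_worker_mask;
} DemoControl;

/* Size to request for the combined allocation. */
static size_t
demo_combined_size(void)
{
	return DEMO_MAXALIGN(sizeof(DemoQueue)) + sizeof(DemoControl);
}

/*
 * Derive the secondary pointer from the primary one.  Both the initializing
 * process and any attaching process can call this, since it depends only on
 * the base pointer and on sizes that are the same in every process.
 */
static DemoControl *
demo_control_from_queue(DemoQueue *queue)
{
	assert(queue != NULL);
	return (DemoControl *) ((char *) queue + DEMO_MAXALIGN(sizeof(DemoQueue)));
}
```

The size computation and the pointer derivation have to use the same rounding, which is why both go through DEMO_MAXALIGN here.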
[text/x-patch] 0012-Add-option-for-aligning-shmem-allocations.patch (3.9K, 13-0012-Add-option-for-aligning-shmem-allocations.patch)
From f13ccfd73aa930ec6351f6a8018d7c9be48f481f Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 23:44:15 +0200
Subject: [PATCH 12/14] Add option for aligning shmem allocations
The buffer blocks (in the next commit) are IO-aligned. This might come in
handy in other places too, so make it an explicit feature of
ShmemRequestStruct.
---
src/backend/storage/ipc/shmem.c | 22 +++++++++++++---------
src/include/storage/shmem.h | 6 ++++++
2 files changed, 19 insertions(+), 9 deletions(-)
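
For readers following along, here is a minimal standalone sketch of the allocation arithmetic this patch introduces: clamp the requested alignment to at least a cache line, round the free offset up to it, and report the padding as part of the allocated size. The names demo_align_up, DemoArena and demo_arena_alloc, and the value of DEMO_CACHE_LINE, are assumptions made for the example; locking and the real ShmemAllocator state are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-in for PG_CACHE_LINE_SIZE. */
#define DEMO_CACHE_LINE		128
#define DEMO_ALLOC_FAILED	((size_t) -1)

/* Round 'off' up to the next multiple of 'align' (must be a power of two). */
static size_t
demo_align_up(size_t off, size_t align)
{
	return (off + align - 1) & ~(align - 1);
}

/* Hypothetical bump allocator state: next free offset and segment size. */
typedef struct DemoArena
{
	size_t		free_offset;
	size_t		total_size;
} DemoArena;

/*
 * Allocate 'size' bytes whose start is aligned to 'alignment'.  Returns the
 * offset of the allocation, or DEMO_ALLOC_FAILED if it does not fit.  On
 * success, *allocated is the space consumed, including alignment padding.
 */
static size_t
demo_arena_alloc(DemoArena *arena, size_t size, size_t alignment,
				 size_t *allocated)
{
	size_t		raw_start;
	size_t		new_start;
	size_t		new_free;

	/* An alignment of 0 (not set) falls back to the cache line size. */
	if (alignment < DEMO_CACHE_LINE)
		alignment = DEMO_CACHE_LINE;
	assert((alignment & (alignment - 1)) == 0);

	raw_start = arena->free_offset;
	new_start = demo_align_up(raw_start, alignment);
	new_free = new_start + size;

	if (new_free > arena->total_size)
		return DEMO_ALLOC_FAILED;

	arena->free_offset = new_free;
	*allocated = new_free - raw_start;
	return new_start;
}
```

In this sketch the padding before an allocation is always less than the alignment, so reserving one extra alignment per request is enough to cover it.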
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index edcfecf24ba..c318a80becc 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -239,7 +239,7 @@ typedef struct ShmemAllocatorData
#define ShmemIndexLock (&ShmemAllocator->index_lock)
-static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void *ShmemAllocRaw(Size size, Size alignment, Size *allocated_size);
/* shared memory global variables */
@@ -399,6 +399,7 @@ ShmemGetRequestedSize(void)
foreach_ptr(ShmemRequest, request, pending_shmem_requests)
{
size = add_size(size, request->options->size);
+ size = add_size(size, request->options->alignment);
}
return size;
@@ -588,7 +589,7 @@ AttachOrInitShmemIndexEntry(ShmemRequest *request,
size_t allocated_size;
void *structPtr;
- structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ structPtr = ShmemAllocRaw(request->options->size, request->options->alignment, &allocated_size);
if (structPtr == NULL)
{
/* out of memory; remove the failed ShmemIndex entry */
@@ -768,7 +769,7 @@ ShmemAlloc(Size size)
void *newSpace;
Size allocated_size;
- newSpace = ShmemAllocRaw(size, &allocated_size);
+ newSpace = ShmemAllocRaw(size, 0, &allocated_size);
if (!newSpace)
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
@@ -787,7 +788,7 @@ ShmemAllocNoError(Size size)
{
Size allocated_size;
- return ShmemAllocRaw(size, &allocated_size);
+ return ShmemAllocRaw(size, 0, &allocated_size);
}
/*
@@ -797,8 +798,9 @@ ShmemAllocNoError(Size size)
* be equal to the number requested plus any padding we choose to add.
*/
static void *
-ShmemAllocRaw(Size size, Size *allocated_size)
+ShmemAllocRaw(Size size, Size alignment, Size *allocated_size)
{
+ Size rawStart;
Size newStart;
Size newFree;
void *newSpace;
@@ -814,14 +816,15 @@ ShmemAllocRaw(Size size, Size *allocated_size)
* structures out to a power-of-two size - but without this, even that
* won't be sufficient.
*/
- size = CACHELINEALIGN(size);
- *allocated_size = size;
+ if (alignment < PG_CACHE_LINE_SIZE)
+ alignment = PG_CACHE_LINE_SIZE;
Assert(ShmemSegHdr != NULL);
SpinLockAcquire(&ShmemAllocator->shmem_lock);
- newStart = ShmemAllocator->free_offset;
+ rawStart = ShmemAllocator->free_offset;
+ newStart = TYPEALIGN(alignment, rawStart);
newFree = newStart + size;
if (newFree <= ShmemSegHdr->totalsize)
@@ -835,8 +838,9 @@ ShmemAllocRaw(Size size, Size *allocated_size)
SpinLockRelease(&ShmemAllocator->shmem_lock);
/* note this assert is okay with newSpace == NULL */
- Assert(newSpace == (void *) CACHELINEALIGN(newSpace));
+ Assert(newSpace == (void *) TYPEALIGN(alignment, newSpace));
+ *allocated_size = newFree - rawStart;
return newSpace;
}
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 42030145328..7e3c9c7f277 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -63,6 +63,12 @@ typedef struct ShmemStructOpts
ssize_t size;
+ /*
+ * Alignment of the starting address. If not set, defaults to cacheline
+ * boundary. Must be a power of two.
+ */
+ size_t alignment;
+
/*
* When the shmem area is initialized or attached to, pointer to it is
* stored in *ptr. It usually points to a global variable, used to access
--
2.47.3
[text/x-patch] 0013-Convert-buffer-manager-to-the-new-API.patch (16.0K, 14-0013-Convert-buffer-manager-to-the-new-API.patch)
download | inline diff:
From 99610b50f699bbf8e953c793499cf55db36a2439 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:44:02 +0300
Subject: [PATCH 13/14] Convert buffer manager to the new API
---
src/backend/storage/buffer/buf_init.c | 158 +++++++++++--------------
src/backend/storage/buffer/buf_table.c | 56 +++++----
src/backend/storage/buffer/freelist.c | 94 +++++----------
src/backend/storage/ipc/ipci.c | 3 -
src/include/storage/buf_internals.h | 5 -
src/include/storage/bufmgr.h | 4 -
src/include/storage/subsystemlist.h | 3 +
7 files changed, 137 insertions(+), 186 deletions(-)
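
As a reading aid for the conversion below, here is a rough standalone sketch of the request/init/attach split: shared state is initialized once, while per-process setup (like the writeback context in this patch) has to happen in both the init and the attach callback. DemoCallbacks and demo_startup are hypothetical; in particular, the order and conditions under which the real code invokes the callbacks are an assumption here, not something taken from shmem.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the three-callback shape used in the patch. */
typedef struct DemoCallbacks
{
	void		(*request_fn) (void *arg);
	void		(*init_fn) (void *arg);
	void		(*attach_fn) (void *arg);
} DemoCallbacks;

/* Counters standing in for real side effects. */
static int	demo_requests;
static int	demo_shared_inits;
static int	demo_local_inits;

/* Per-process setup, needed whether we created or attached. */
static void
demo_local_setup(void)
{
	demo_local_inits++;
}

static void
demo_request(void *arg)
{
	(void) arg;
	demo_requests++;			/* register sizes and pointers */
}

static void
demo_init(void *arg)
{
	(void) arg;
	demo_shared_inits++;		/* fill in the shared structures once */
	demo_local_setup();
}

static void
demo_attach(void *arg)
{
	(void) arg;
	demo_local_setup();			/* shared state already initialized */
}

static const DemoCallbacks demo_callbacks = {
	.request_fn = demo_request,
	.init_fn = demo_init,
	.attach_fn = demo_attach,
};

/* Assumed driver: the creating process runs init, any other runs attach. */
static void
demo_startup(const DemoCallbacks *cb, bool creating)
{
	assert(cb != NULL);
	if (cb->request_fn)
		cb->request_fn(NULL);
	if (creating)
	{
		if (cb->init_fn)
			cb->init_fn(NULL);
	}
	else if (cb->attach_fn)
		cb->attach_fn(NULL);
}
```

Calling demo_startup(&demo_callbacks, true) once and demo_startup(&demo_callbacks, false) once leaves one shared initialization and two local ones.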
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index c0c223b2e32..71e7c29116c 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -18,6 +18,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proclist.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
BufferDescPadded *BufferDescriptors;
char *BufferBlocks;
@@ -25,6 +27,15 @@ ConditionVariableMinimallyPadded *BufferIOCVArray;
WritebackContext BackendWritebackContext;
CkptSortItem *CkptBufferIds;
+static void BufferManagerShmemRequest(void *arg);
+static void BufferManagerShmemInit(void *arg);
+static void BufferManagerShmemAttach(void *arg);
+
+const ShmemCallbacks BufferManagerShmemCallbacks = {
+ .request_fn = BufferManagerShmemRequest,
+ .init_fn = BufferManagerShmemInit,
+ .attach_fn = BufferManagerShmemAttach,
+};
/*
* Data Structures:
@@ -60,37 +71,39 @@ CkptSortItem *CkptBufferIds;
/*
- * Initialize shared buffer pool
- *
- * This is called once during shared-memory initialization (either in the
- * postmaster, or in a standalone backend).
+ * Register shared memory area for the buffer pool.
*/
-void
-BufferManagerShmemInit(void)
+static void
+BufferManagerShmemRequest(void *arg)
{
- bool foundBufs,
- foundDescs,
- foundIOCV,
- foundBufCkpt;
-
+ static ShmemStructDesc BufferDescriptorsShmemDesc;
+ static ShmemStructDesc BufferBlocksShmemDesc;
+ static ShmemStructDesc BufferIOCVArrayShmemDesc;
+ static ShmemStructDesc CkptBufferIdsShmemDesc;
+
+ ShmemRequestStruct(&BufferDescriptorsShmemDesc,
+ .name = "Buffer Descriptors",
+ .size = NBuffers * sizeof(BufferDescPadded),
/* Align descriptors to a cacheline boundary. */
- BufferDescriptors = (BufferDescPadded *)
- ShmemInitStruct("Buffer Descriptors",
- NBuffers * sizeof(BufferDescPadded),
- &foundDescs);
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferDescriptors,
+ );
+ ShmemRequestStruct(&BufferBlocksShmemDesc,
+ .name = "Buffer Blocks",
+ .size = NBuffers * (Size) BLCKSZ,
/* Align buffer pool on IO page size boundary. */
- BufferBlocks = (char *)
- TYPEALIGN(PG_IO_ALIGN_SIZE,
- ShmemInitStruct("Buffer Blocks",
- NBuffers * (Size) BLCKSZ + PG_IO_ALIGN_SIZE,
- &foundBufs));
-
- /* Align condition variables to cacheline boundary. */
- BufferIOCVArray = (ConditionVariableMinimallyPadded *)
- ShmemInitStruct("Buffer IO Condition Variables",
- NBuffers * sizeof(ConditionVariableMinimallyPadded),
- &foundIOCV);
+ .alignment = PG_IO_ALIGN_SIZE,
+ .ptr = (void **) &BufferBlocks,
+ );
+
+ ShmemRequestStruct(&BufferIOCVArrayShmemDesc,
+ .name = "Buffer IO Condition Variables",
+ .size = NBuffers * sizeof(ConditionVariableMinimallyPadded),
+ /* Align condition variables to a cacheline boundary. */
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferIOCVArray,
+ );
/*
* The array used to sort to-be-checkpointed buffer ids is located in
@@ -99,80 +112,51 @@ BufferManagerShmemInit(void)
* the checkpointer is restarted, memory allocation failures would be
* painful.
*/
- CkptBufferIds = (CkptSortItem *)
- ShmemInitStruct("Checkpoint BufferIds",
- NBuffers * sizeof(CkptSortItem), &foundBufCkpt);
+ ShmemRequestStruct(&CkptBufferIdsShmemDesc,
+ .name = "Checkpoint BufferIds",
+ .size = NBuffers * sizeof(CkptSortItem),
+ .ptr = (void **) &CkptBufferIds,
+ );
+}
- if (foundDescs || foundBufs || foundIOCV || foundBufCkpt)
- {
- /* should find all of these, or none of them */
- Assert(foundDescs && foundBufs && foundIOCV && foundBufCkpt);
- /* note: this path is only taken in EXEC_BACKEND case */
- }
- else
+/*
+ * Initialize shared buffer pool
+ *
+ * This is called once during shared-memory initialization (either in the
+ * postmaster, or in a standalone backend).
+ */
+static void
+BufferManagerShmemInit(void *arg)
+{
+ /*
+ * Initialize all the buffer headers.
+ */
+ for (int i = 0; i < NBuffers; i++)
{
- int i;
+ BufferDesc *buf = GetBufferDescriptor(i);
- /*
- * Initialize all the buffer headers.
- */
- for (i = 0; i < NBuffers; i++)
- {
- BufferDesc *buf = GetBufferDescriptor(i);
+ ClearBufferTag(&buf->tag);
- ClearBufferTag(&buf->tag);
+ pg_atomic_init_u64(&buf->state, 0);
+ buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
- pg_atomic_init_u64(&buf->state, 0);
- buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
+ buf->buf_id = i;
- buf->buf_id = i;
+ pgaio_wref_clear(&buf->io_wref);
- pgaio_wref_clear(&buf->io_wref);
-
- proclist_init(&buf->lock_waiters);
- ConditionVariableInit(BufferDescriptorGetIOCV(buf));
- }
+ proclist_init(&buf->lock_waiters);
+ ConditionVariableInit(BufferDescriptorGetIOCV(buf));
}
- /* Init other shared buffer-management stuff */
- StrategyInitialize(!foundDescs);
-
/* Initialize per-backend file flush context */
WritebackContextInit(&BackendWritebackContext,
&backend_flush_after);
}
-/*
- * BufferManagerShmemSize
- *
- * compute the size of shared memory for the buffer pool including
- * data pages, buffer descriptors, hash tables, etc.
- */
-Size
-BufferManagerShmemSize(void)
+static void
+BufferManagerShmemAttach(void *arg)
{
- Size size = 0;
-
- /* size of buffer descriptors */
- size = add_size(size, mul_size(NBuffers, sizeof(BufferDescPadded)));
- /* to allow aligning buffer descriptors */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of data pages, plus alignment padding */
- size = add_size(size, PG_IO_ALIGN_SIZE);
- size = add_size(size, mul_size(NBuffers, BLCKSZ));
-
- /* size of stuff controlled by freelist.c */
- size = add_size(size, StrategyShmemSize());
-
- /* size of I/O condition variables */
- size = add_size(size, mul_size(NBuffers,
- sizeof(ConditionVariableMinimallyPadded)));
- /* to allow aligning the above */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of checkpoint sort array in bufmgr.c */
- size = add_size(size, mul_size(NBuffers, sizeof(CkptSortItem)));
-
- return size;
+ /* Initialize per-backend file flush context */
+ WritebackContextInit(&BackendWritebackContext,
+ &backend_flush_after);
}
diff --git a/src/backend/storage/buffer/buf_table.c b/src/backend/storage/buffer/buf_table.c
index d04ef74b850..aaf66351220 100644
--- a/src/backend/storage/buffer/buf_table.c
+++ b/src/backend/storage/buffer/buf_table.c
@@ -22,6 +22,7 @@
#include "postgres.h"
#include "storage/buf_internals.h"
+#include "storage/subsystems.h"
/* entry for buffer lookup hashtable */
typedef struct
@@ -32,37 +33,44 @@ typedef struct
static HTAB *SharedBufHash;
+static void BufTableShmemRequest(void *arg);
-/*
- * Estimate space needed for mapping hashtable
- * size is the desired hash table size (possibly more than NBuffers)
- */
-Size
-BufTableShmemSize(int size)
-{
- return hash_estimate_size(size, sizeof(BufferLookupEnt));
-}
+const ShmemCallbacks BufTableShmemCallbacks = {
+ .request_fn = BufTableShmemRequest,
+ /* no special initialization needed, the hash table will start empty */
+};
/*
- * Initialize shmem hash table for mapping buffers
+ * Register shmem hash table for mapping buffers.
* size is the desired hash table size (possibly more than NBuffers)
*/
void
-InitBufTable(int size)
+BufTableShmemRequest(void *arg)
{
- HASHCTL info;
-
- /* assume no locking is needed yet */
-
- /* BufferTag maps to Buffer */
- info.keysize = sizeof(BufferTag);
- info.entrysize = sizeof(BufferLookupEnt);
- info.num_partitions = NUM_BUFFER_PARTITIONS;
-
- SharedBufHash = ShmemInitHash("Shared Buffer Lookup Table",
- size,
- &info,
- HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE);
+ static ShmemHashDesc SharedBufHashDesc;
+ int size;
+
+ /*
+ * Request the shared buffer lookup hashtable.
+ *
+ * Since we can't tolerate running out of lookup table entries, we must be
+ * sure to specify an adequate table size here. The maximum steady-state
+ * usage is of course NBuffers entries, but BufferAlloc() tries to insert
+ * a new entry before deleting the old. In principle this could be
+ * happening in each partition concurrently, so we could need as many as
+ * NBuffers + NUM_BUFFER_PARTITIONS entries.
+ */
+ size = NBuffers + NUM_BUFFER_PARTITIONS;
+
+ ShmemRequestHash(&SharedBufHashDesc,
+ .name = "Shared Buffer Lookup Table",
+ .nelems = size,
+ .ptr = &SharedBufHash,
+ .hash_info.keysize = sizeof(BufferTag),
+ .hash_info.entrysize = sizeof(BufferLookupEnt),
+ .hash_info.num_partitions = NUM_BUFFER_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
}
/*
diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index b7687836188..313a5e29f57 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -20,6 +20,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#define INT_ACCESS_ONCE(var) ((int)(*((volatile int *)&(var))))
@@ -56,6 +58,14 @@ typedef struct
/* Pointers to shared state */
static BufferStrategyControl *StrategyControl = NULL;
+static void StrategyCtlShmemRequest(void *arg);
+static void StrategyCtlShmemInit(void *arg);
+
+const ShmemCallbacks StrategyCtlShmemCallbacks = {
+ .request_fn = StrategyCtlShmemRequest,
+ .init_fn = StrategyCtlShmemInit,
+};
+
/*
* Private (non-shared) state for managing a ring of shared buffers to re-use.
* This is currently the only kind of BufferAccessStrategy object, but someday
@@ -369,80 +379,38 @@ StrategyNotifyBgWriter(int bgwprocno)
/*
- * StrategyShmemSize
- *
- * estimate the size of shared memory used by the freelist-related structures.
- *
- * Note: for somewhat historical reasons, the buffer lookup hashtable size
- * is also determined here.
+ * StrategyCtlShmemRequest -- request shared memory for the buffer
+ * cache replacement strategy.
*/
-Size
-StrategyShmemSize(void)
+static void
+StrategyCtlShmemRequest(void *arg)
{
- Size size = 0;
-
- /* size of lookup hash table ... see comment in StrategyInitialize */
- size = add_size(size, BufTableShmemSize(NBuffers + NUM_BUFFER_PARTITIONS));
+ static ShmemStructDesc StrategyCtlShmemDesc;
- /* size of the shared replacement strategy control block */
- size = add_size(size, MAXALIGN(sizeof(BufferStrategyControl)));
-
- return size;
+ ShmemRequestStruct(&StrategyCtlShmemDesc,
+ .name = "Buffer Strategy Status",
+ .size = sizeof(BufferStrategyControl),
+ .ptr = (void **) &StrategyControl
+ );
}
/*
- * StrategyInitialize -- initialize the buffer cache replacement
- * strategy.
- *
- * Assumes: All of the buffers are already built into a linked list.
- * Only called by postmaster and only during initialization.
+ * StrategyCtlShmemInit -- initialize the buffer cache replacement strategy.
*/
-void
-StrategyInitialize(bool init)
+static void
+StrategyCtlShmemInit(void *arg)
{
- bool found;
-
- /*
- * Initialize the shared buffer lookup hashtable.
- *
- * Since we can't tolerate running out of lookup table entries, we must be
- * sure to specify an adequate table size here. The maximum steady-state
- * usage is of course NBuffers entries, but BufferAlloc() tries to insert
- * a new entry before deleting the old. In principle this could be
- * happening in each partition concurrently, so we could need as many as
- * NBuffers + NUM_BUFFER_PARTITIONS entries.
- */
- InitBufTable(NBuffers + NUM_BUFFER_PARTITIONS);
-
- /*
- * Get or create the shared strategy control block
- */
- StrategyControl = (BufferStrategyControl *)
- ShmemInitStruct("Buffer Strategy Status",
- sizeof(BufferStrategyControl),
- &found);
-
- if (!found)
- {
- /*
- * Only done once, usually in postmaster
- */
- Assert(init);
+ SpinLockInit(&StrategyControl->buffer_strategy_lock);
- SpinLockInit(&StrategyControl->buffer_strategy_lock);
+ /* Initialize the clock-sweep pointer */
+ pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
- /* Initialize the clock-sweep pointer */
- pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
+ /* Clear statistics */
+ StrategyControl->completePasses = 0;
+ pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
- /* Clear statistics */
- StrategyControl->completePasses = 0;
- pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
-
- /* No pending notification */
- StrategyControl->bgwprocno = -1;
- }
- else
- Assert(!init);
+ /* No pending notification */
+ StrategyControl->bgwprocno = -1;
}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index bd67f81cea3..ecc5388ad47 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -39,7 +39,6 @@
#include "replication/walreceiver.h"
#include "replication/walsender.h"
#include "storage/aio_subsys.h"
-#include "storage/bufmgr.h"
#include "storage/dsm.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
@@ -99,7 +98,6 @@ CalculateShmemSize(void)
size = add_size(size, ShmemGetRequestedSize());
/* legacy subsystems */
- size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
size = add_size(size, XLOGShmemSize());
@@ -263,7 +261,6 @@ CreateOrAttachShmemStructs(void)
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
- BufferManagerShmemInit();
/*
* Set up lock manager
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index ad1b7b2216a..89615a254a3 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -587,12 +587,7 @@ extern bool StrategyRejectBuffer(BufferAccessStrategy strategy,
extern int StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc);
extern void StrategyNotifyBgWriter(int bgwprocno);
-extern Size StrategyShmemSize(void);
-extern void StrategyInitialize(bool init);
-
/* buf_table.c */
-extern Size BufTableShmemSize(int size);
-extern void InitBufTable(int size);
extern uint32 BufTableHashCode(BufferTag *tagPtr);
extern int BufTableLookup(BufferTag *tagPtr, uint32 hashcode);
extern int BufTableInsert(BufferTag *tagPtr, uint32 hashcode, int buf_id);
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index aa61a39d9e6..6837b35fc6d 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -371,10 +371,6 @@ extern void MarkDirtyAllUnpinnedBuffers(int32 *buffers_dirtied,
int32 *buffers_already_dirty,
int32 *buffers_skipped);
-/* in buf_init.c */
-extern void BufferManagerShmemInit(void);
-extern Size BufferManagerShmemSize(void);
-
/* in localbuf.c */
extern void AtProcExit_LocalBuffers(void);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index b438794d46d..d8e11756a61 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -36,6 +36,9 @@ PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BufferManagerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StrategyCtlShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BufTableShmemCallbacks)
/* predicate lock manager */
PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
--
2.47.3
[text/x-patch] 0014-Convert-all-remaining-subsystems-to-use-the-new-API.patch (113.4K, 15-0014-Convert-all-remaining-subsystems-to-use-the-new-API.patch)
From 0fcca436d6941a250816b9c777aa7c5f9d3240dd Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 19:05:26 +0200
Subject: [PATCH 14/14] Convert all remaining subsystems to use the new API
---
src/backend/access/common/syncscan.c | 79 ++++----
src/backend/access/nbtree/nbtutils.c | 56 +++---
src/backend/access/transam/twophase.c | 77 +++----
src/backend/access/transam/xlog.c | 86 ++++----
src/backend/access/transam/xlogprefetcher.c | 54 ++---
src/backend/access/transam/xlogrecovery.c | 36 ++--
src/backend/access/transam/xlogwait.c | 52 ++---
src/backend/postmaster/autovacuum.c | 81 ++++----
src/backend/postmaster/bgworker.c | 107 +++++-----
src/backend/postmaster/checkpointer.c | 58 +++---
src/backend/postmaster/datachecksum_state.c | 42 ++--
src/backend/postmaster/pgarch.c | 46 +++--
src/backend/postmaster/walsummarizer.c | 63 +++---
src/backend/replication/logical/launcher.c | 58 +++---
src/backend/replication/logical/logicalctl.c | 30 +--
src/backend/replication/logical/origin.c | 61 +++---
src/backend/replication/logical/slotsync.c | 44 ++--
src/backend/replication/slot.c | 66 +++---
src/backend/replication/walreceiverfuncs.c | 52 ++---
src/backend/replication/walsender.c | 61 +++---
src/backend/storage/ipc/ipci.c | 124 +-----------
src/backend/storage/lmgr/lock.c | 115 +++++------
src/backend/utils/activity/backend_status.c | 190 ++++++++----------
src/backend/utils/activity/pgstat_shmem.c | 161 ++++++++-------
src/backend/utils/activity/wait_event.c | 90 ++++-----
src/backend/utils/misc/injection_point.c | 60 +++---
src/include/access/nbtree.h | 2 -
src/include/access/syncscan.h | 2 -
src/include/access/twophase.h | 3 -
src/include/access/xlog.h | 2 -
src/include/access/xlogprefetcher.h | 3 -
src/include/access/xlogrecovery.h | 3 -
src/include/access/xlogwait.h | 2 -
src/include/pgstat.h | 4 -
src/include/postmaster/autovacuum.h | 4 -
src/include/postmaster/bgworker_internals.h | 2 -
src/include/postmaster/bgwriter.h | 3 -
src/include/postmaster/datachecksum_state.h | 4 -
src/include/postmaster/pgarch.h | 2 -
src/include/postmaster/walsummarizer.h | 2 -
src/include/replication/logicalctl.h | 2 -
src/include/replication/logicallauncher.h | 3 -
src/include/replication/origin.h | 4 -
src/include/replication/slot.h | 4 -
src/include/replication/slotsync.h | 2 -
src/include/replication/walreceiver.h | 2 -
src/include/replication/walsender.h | 2 -
src/include/storage/lock.h | 2 -
src/include/storage/subsystemlist.h | 27 +++
src/include/utils/backend_status.h | 8 -
src/include/utils/injection_point.h | 3 -
src/include/utils/wait_event.h | 2 -
.../injection_points/injection_points.c | 60 ++----
src/test/modules/test_aio/test_aio.c | 108 +++++-----
54 files changed, 1020 insertions(+), 1196 deletions(-)
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index 6fcfcb0e560..25522284faa 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -50,6 +50,7 @@
#include "miscadmin.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/rel.h"
@@ -111,6 +112,14 @@ typedef struct ss_scan_locations_t
#define SizeOfScanLocations(N) \
(offsetof(ss_scan_locations_t, items) + (N) * sizeof(ss_lru_item_t))
+static void SyncScanShmemRequest(void *arg);
+static void SyncScanShmemInit(void *arg);
+
+const ShmemCallbacks SyncScanShmemCallbacks = {
+ .request_fn = SyncScanShmemRequest,
+ .init_fn = SyncScanShmemInit,
+};
+
/* Pointer to struct in shared memory */
static ss_scan_locations_t *scan_locations;
@@ -120,58 +129,50 @@ static BlockNumber ss_search(RelFileLocator relfilelocator,
/*
- * SyncScanShmemSize --- report amount of shared memory space needed
+ * SyncScanShmemRequest --- register this module's shared memory
*/
-Size
-SyncScanShmemSize(void)
+static void
+SyncScanShmemRequest(void *arg)
{
- return SizeOfScanLocations(SYNC_SCAN_NELEM);
+ static ShmemStructDesc SyncScanShmemDesc;
+
+ ShmemRequestStruct(&SyncScanShmemDesc,
+ .name = "Sync Scan Locations List",
+ .size = SizeOfScanLocations(SYNC_SCAN_NELEM),
+ .ptr = (void **) &scan_locations,
+ );
}
/*
* SyncScanShmemInit --- initialize this module's shared memory
*/
-void
-SyncScanShmemInit(void)
+static void
+SyncScanShmemInit(void *arg)
{
int i;
- bool found;
- scan_locations = (ss_scan_locations_t *)
- ShmemInitStruct("Sync Scan Locations List",
- SizeOfScanLocations(SYNC_SCAN_NELEM),
- &found);
+ scan_locations->head = &scan_locations->items[0];
+ scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
- if (!IsUnderPostmaster)
+ for (i = 0; i < SYNC_SCAN_NELEM; i++)
{
- /* Initialize shared memory area */
- Assert(!found);
-
- scan_locations->head = &scan_locations->items[0];
- scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
-
- for (i = 0; i < SYNC_SCAN_NELEM; i++)
- {
- ss_lru_item_t *item = &scan_locations->items[i];
-
- /*
- * Initialize all slots with invalid values. As scans are started,
- * these invalid entries will fall off the LRU list and get
- * replaced with real entries.
- */
- item->location.relfilelocator.spcOid = InvalidOid;
- item->location.relfilelocator.dbOid = InvalidOid;
- item->location.relfilelocator.relNumber = InvalidRelFileNumber;
- item->location.location = InvalidBlockNumber;
-
- item->prev = (i > 0) ?
- (&scan_locations->items[i - 1]) : NULL;
- item->next = (i < SYNC_SCAN_NELEM - 1) ?
- (&scan_locations->items[i + 1]) : NULL;
- }
+ ss_lru_item_t *item = &scan_locations->items[i];
+
+ /*
+ * Initialize all slots with invalid values. As scans are started,
+ * these invalid entries will fall off the LRU list and get replaced
+ * with real entries.
+ */
+ item->location.relfilelocator.spcOid = InvalidOid;
+ item->location.relfilelocator.dbOid = InvalidOid;
+ item->location.relfilelocator.relNumber = InvalidRelFileNumber;
+ item->location.location = InvalidBlockNumber;
+
+ item->prev = (i > 0) ?
+ (&scan_locations->items[i - 1]) : NULL;
+ item->next = (i < SYNC_SCAN_NELEM - 1) ?
+ (&scan_locations->items[i + 1]) : NULL;
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c
index 732bc750c9e..ecd4fb4df6f 100644
--- a/src/backend/access/nbtree/nbtutils.c
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -25,6 +25,7 @@
#include "lib/qunique.h"
#include "miscadmin.h"
#include "storage/lwlock.h"
+#include "storage/subsystems.h"
#include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -417,6 +418,13 @@ typedef struct BTVacInfo
static BTVacInfo *btvacinfo;
+static void BTreeShmemRequest(void *arg);
+static void BTreeShmemInit(void *arg);
+
+const ShmemCallbacks BTreeShmemCallbacks = {
+ .request_fn = BTreeShmemRequest,
+ .init_fn = BTreeShmemInit,
+};
/*
* _bt_vacuum_cycleid --- get the active vacuum cycle ID for an index,
@@ -553,47 +561,39 @@ _bt_end_vacuum_callback(int code, Datum arg)
}
/*
- * BTreeShmemSize --- report amount of shared memory space needed
+ * BTreeShmemRequest --- register this module's shared memory
*/
-Size
-BTreeShmemSize(void)
+static void
+BTreeShmemRequest(void *arg)
{
+ static ShmemStructDesc BTreeShmemDesc;
Size size;
size = offsetof(BTVacInfo, vacuums);
size = add_size(size, mul_size(MaxBackends, sizeof(BTOneVacInfo)));
- return size;
+
+ ShmemRequestStruct(&BTreeShmemDesc,
+ .name = "BTree Vacuum State",
+ .size = size,
+ .ptr = (void **) &btvacinfo,
+ );
}
/*
* BTreeShmemInit --- initialize this module's shared memory
*/
-void
-BTreeShmemInit(void)
+static void
+BTreeShmemInit(void *arg)
{
- bool found;
-
- btvacinfo = (BTVacInfo *) ShmemInitStruct("BTree Vacuum State",
- BTreeShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- /* Initialize shared memory area */
- Assert(!found);
-
- /*
- * It doesn't really matter what the cycle counter starts at, but
- * having it always start the same doesn't seem good. Seed with
- * low-order bits of time() instead.
- */
- btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
+ /*
+ * It doesn't really matter what the cycle counter starts at, but having
+ * it always start the same doesn't seem good. Seed with low-order bits
+ * of time() instead.
+ */
+ btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
- btvacinfo->num_vacuums = 0;
- btvacinfo->max_vacuums = MaxBackends;
- }
- else
- Assert(found);
+ btvacinfo->num_vacuums = 0;
+ btvacinfo->max_vacuums = MaxBackends;
}
bytea *
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index ab1cbd67bac..f07cdae0325 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -102,6 +102,7 @@
#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
#include "utils/memutils.h"
@@ -187,8 +188,16 @@ typedef struct TwoPhaseStateData
GlobalTransaction prepXacts[FLEXIBLE_ARRAY_MEMBER];
} TwoPhaseStateData;
+static void TwoPhaseShmemRequest(void *arg);
+static void TwoPhaseShmemInit(void *arg);
+
static TwoPhaseStateData *TwoPhaseState;
+const ShmemCallbacks TwoPhaseShmemCallbacks = {
+ .request_fn = TwoPhaseShmemRequest,
+ .init_fn = TwoPhaseShmemInit,
+};
+
/*
* Global transaction entry currently locked by us, if any. Note that any
* access to the entry pointed to by this variable must be protected by
@@ -234,11 +243,12 @@ static void RemoveTwoPhaseFile(FullTransactionId fxid, bool giveWarning);
static void RecreateTwoPhaseFile(FullTransactionId fxid, void *content, int len);
/*
- * Initialization of shared memory
+ * Register shared memory for two-phase state.
*/
-Size
-TwoPhaseShmemSize(void)
+static void
+TwoPhaseShmemRequest(void *arg)
{
+ static ShmemStructDesc TwoPhaseShmemDesc;
Size size;
/* Need the fixed struct, the array of pointers, and the GTD structs */
@@ -248,46 +258,41 @@ TwoPhaseShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_prepared_xacts,
sizeof(GlobalTransactionData)));
-
- return size;
+ ShmemRequestStruct(&TwoPhaseShmemDesc,
+ .name = "Prepared Transaction Table",
+ .size = size,
+ .ptr = (void **) &TwoPhaseState,
+ );
}
-void
-TwoPhaseShmemInit(void)
+/*
+ * Initialize shared memory for two-phase state.
+ */
+static void
+TwoPhaseShmemInit(void *arg)
{
- bool found;
-
- TwoPhaseState = ShmemInitStruct("Prepared Transaction Table",
- TwoPhaseShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- GlobalTransaction gxacts;
- int i;
+ GlobalTransaction gxacts;
+ int i;
- Assert(!found);
- TwoPhaseState->freeGXacts = NULL;
- TwoPhaseState->numPrepXacts = 0;
+ TwoPhaseState->freeGXacts = NULL;
+ TwoPhaseState->numPrepXacts = 0;
- /*
- * Initialize the linked list of free GlobalTransactionData structs
- */
- gxacts = (GlobalTransaction)
- ((char *) TwoPhaseState +
- MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
- sizeof(GlobalTransaction) * max_prepared_xacts));
- for (i = 0; i < max_prepared_xacts; i++)
- {
- /* insert into linked list */
- gxacts[i].next = TwoPhaseState->freeGXacts;
- TwoPhaseState->freeGXacts = &gxacts[i];
+ /*
+ * Initialize the linked list of free GlobalTransactionData structs
+ */
+ gxacts = (GlobalTransaction)
+ ((char *) TwoPhaseState +
+ MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
+ sizeof(GlobalTransaction) * max_prepared_xacts));
+ for (i = 0; i < max_prepared_xacts; i++)
+ {
+ /* insert into linked list */
+ gxacts[i].next = TwoPhaseState->freeGXacts;
+ TwoPhaseState->freeGXacts = &gxacts[i];
- /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
- gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
- }
+ /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
+ gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9e8999bbb61..4918ee93451 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -96,6 +96,7 @@
#include "storage/procsignal.h"
#include "storage/reinit.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/guc_tables.h"
@@ -571,6 +572,16 @@ typedef enum
WALINSERT_SPECIAL_CHECKPOINT
} WalInsertClass;
+static void XLOGShmemRequest(void *arg);
+static void XLOGShmemInit(void *arg);
+static void XLOGShmemAttach(void *arg);
+
+const ShmemCallbacks XLOGShmemCallbacks = {
+ .request_fn = XLOGShmemRequest,
+ .init_fn = XLOGShmemInit,
+ .attach_fn = XLOGShmemAttach,
+};
+
static XLogCtlData *XLogCtl = NULL;
/* a private copy of XLogCtl->Insert.WALInsertLocks, for convenience */
@@ -579,6 +590,7 @@ static WALInsertLockPadded *WALInsertLocks = NULL;
/*
* We maintain an image of pg_control in shared memory.
*/
+static ControlFileData *LocalControlFile = NULL;
static ControlFileData *ControlFile = NULL;
/*
@@ -5257,7 +5269,8 @@ void
LocalProcessControlFile(bool reset)
{
Assert(reset || ControlFile == NULL);
- ControlFile = palloc_object(ControlFileData);
+ LocalControlFile = palloc_object(ControlFileData);
+ ControlFile = LocalControlFile;
ReadControlFile();
SetLocalDataChecksumState(ControlFile->data_checksum_version);
}
@@ -5274,11 +5287,13 @@ GetActiveWalLevelOnStandby(void)
}
/*
- * Initialization of shared memory for XLOG
+ * Register shared memory for XLOG.
*/
-Size
-XLOGShmemSize(void)
+static void
+XLOGShmemRequest(void *arg)
{
+ static ShmemStructDesc XLogCtlShmemDesc;
+ static ShmemStructDesc ControlFileShmemDesc;
Size size;
/*
@@ -5317,23 +5332,26 @@ XLOGShmemSize(void)
/* and the buffers themselves */
size = add_size(size, mul_size(XLOG_BLCKSZ, XLOGbuffers));
- /*
- * Note: we don't count ControlFileData, it comes out of the "slop factor"
- * added by CreateSharedMemoryAndSemaphores. This lets us use this
- * routine again below to compute the actual allocation size.
- */
-
- return size;
+ ShmemRequestStruct(&XLogCtlShmemDesc,
+ .name = "XLOG Ctl",
+ .size = size,
+ .ptr = (void **) &XLogCtl,
+ );
+ ShmemRequestStruct(&ControlFileShmemDesc,
+ .name = "Control File",
+ .size = sizeof(ControlFileData),
+ .ptr = (void **) &ControlFile,
+ );
}
-void
-XLOGShmemInit(void)
+/*
+ * XLOGShmemInit - initialize the XLogCtl shared memory area.
+ */
+static void
+XLOGShmemInit(void *arg)
{
- bool foundCFile,
- foundXLog;
char *allocptr;
int i;
- ControlFileData *localControlFile;
#ifdef WAL_DEBUG
@@ -5351,36 +5369,17 @@ XLOGShmemInit(void)
}
#endif
-
- XLogCtl = (XLogCtlData *)
- ShmemInitStruct("XLOG Ctl", XLOGShmemSize(), &foundXLog);
-
- localControlFile = ControlFile;
- ControlFile = (ControlFileData *)
- ShmemInitStruct("Control File", sizeof(ControlFileData), &foundCFile);
-
- if (foundCFile || foundXLog)
- {
- /* both should be present or neither */
- Assert(foundCFile && foundXLog);
-
- /* Initialize local copy of WALInsertLocks */
- WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
-
- if (localControlFile)
- pfree(localControlFile);
- return;
- }
memset(XLogCtl, 0, sizeof(XLogCtlData));
/*
* Already have read control file locally, unless in bootstrap mode. Move
* contents into shared memory.
*/
- if (localControlFile)
+ if (LocalControlFile)
{
- memcpy(ControlFile, localControlFile, sizeof(ControlFileData));
- pfree(localControlFile);
+ memcpy(ControlFile, LocalControlFile, sizeof(ControlFileData));
+ pfree(LocalControlFile);
+ LocalControlFile = NULL;
}
/*
@@ -5442,6 +5441,15 @@ XLOGShmemInit(void)
pg_atomic_init_u64(&XLogCtl->unloggedLSN, InvalidXLogRecPtr);
}
+/*
+ * XLOGShmemAttach - set up WALInsertLocks pointer after attaching.
+ */
+static void
+XLOGShmemAttach(void *arg)
+{
+ WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
+}
+
/*
* This func must be called ONCE on system install. It creates pg_control
* and the initial XLOG segment.
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index c235eca7c51..5c2d3cdcdc9 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -39,6 +39,7 @@
#include "storage/fd.h"
#include "storage/shmem.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/hsearch.h"
@@ -200,6 +201,14 @@ static LsnReadQueueNextStatus XLogPrefetcherNextBlock(uintptr_t pgsr_private,
static XLogPrefetchStats *SharedStats;
+static void XLogPrefetchShmemRequest(void *arg);
+static void XLogPrefetchShmemInit(void *arg);
+
+const ShmemCallbacks XLogPrefetchShmemCallbacks = {
+ .request_fn = XLogPrefetchShmemRequest,
+ .init_fn = XLogPrefetchShmemInit,
+};
+
static inline LsnReadQueue *
lrq_alloc(uint32 max_distance,
uint32 max_inflight,
@@ -292,10 +301,28 @@ lrq_complete_lsn(LsnReadQueue *lrq, XLogRecPtr lsn)
lrq_prefetch(lrq);
}
-size_t
-XLogPrefetchShmemSize(void)
+static void
+XLogPrefetchShmemRequest(void *arg)
+{
+ static ShmemStructDesc XLogPrefetchShmemDesc;
+
+ ShmemRequestStruct(&XLogPrefetchShmemDesc,
+ .name = "XLogPrefetchStats",
+ .size = sizeof(XLogPrefetchStats),
+ .ptr = (void **) &SharedStats,
+ );
+}
+
+static void
+XLogPrefetchShmemInit(void *arg)
{
- return sizeof(XLogPrefetchStats);
+ pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
+ pg_atomic_init_u64(&SharedStats->prefetch, 0);
+ pg_atomic_init_u64(&SharedStats->hit, 0);
+ pg_atomic_init_u64(&SharedStats->skip_init, 0);
+ pg_atomic_init_u64(&SharedStats->skip_new, 0);
+ pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
+ pg_atomic_init_u64(&SharedStats->skip_rep, 0);
}
/*
@@ -313,27 +340,6 @@ XLogPrefetchResetStats(void)
pg_atomic_write_u64(&SharedStats->skip_rep, 0);
}
-void
-XLogPrefetchShmemInit(void)
-{
- bool found;
-
- SharedStats = (XLogPrefetchStats *)
- ShmemInitStruct("XLogPrefetchStats",
- sizeof(XLogPrefetchStats),
- &found);
-
- if (!found)
- {
- pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
- pg_atomic_init_u64(&SharedStats->prefetch, 0);
- pg_atomic_init_u64(&SharedStats->hit, 0);
- pg_atomic_init_u64(&SharedStats->skip_init, 0);
- pg_atomic_init_u64(&SharedStats->skip_new, 0);
- pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
- pg_atomic_init_u64(&SharedStats->skip_rep, 0);
- }
-}
/*
* Called when any GUC is changed that affects prefetching.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index fd1c36d061d..e3d04e1f0df 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -58,6 +58,7 @@
#include "storage/pmsignal.h"
#include "storage/procarray.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/datetime.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
@@ -307,6 +308,14 @@ static char *primary_image_masked = NULL;
XLogRecoveryCtlData *XLogRecoveryCtl = NULL;
+static void XLogRecoveryShmemRequest(void *arg);
+static void XLogRecoveryShmemInit(void *arg);
+
+const ShmemCallbacks XLogRecoveryShmemCallbacks = {
+ .request_fn = XLogRecoveryShmemRequest,
+ .init_fn = XLogRecoveryShmemInit,
+};
+
/*
* abortedRecPtr is the start pointer of a broken record at end of WAL when
* recovery completes; missingContrecPtr is the location of the first
@@ -385,28 +394,23 @@ static void SetCurrentChunkStartTime(TimestampTz xtime);
static void SetLatestXTime(TimestampTz xtime);
/*
- * Initialization of shared memory for WAL recovery
+ * Register shared memory for WAL recovery
*/
-Size
-XLogRecoveryShmemSize(void)
+static void
+XLogRecoveryShmemRequest(void *arg)
{
- Size size;
-
- /* XLogRecoveryCtl */
- size = sizeof(XLogRecoveryCtlData);
+ static ShmemStructDesc XLogRecoveryShmemDesc;
- return size;
+ ShmemRequestStruct(&XLogRecoveryShmemDesc,
+ .name = "XLOG Recovery Ctl",
+ .size = sizeof(XLogRecoveryCtlData),
+ .ptr = (void **) &XLogRecoveryCtl,
+ );
}
-void
-XLogRecoveryShmemInit(void)
+static void
+XLogRecoveryShmemInit(void *arg)
{
- bool found;
-
- XLogRecoveryCtl = (XLogRecoveryCtlData *)
- ShmemInitStruct("XLOG Recovery Ctl", XLogRecoveryShmemSize(), &found);
- if (found)
- return;
memset(XLogRecoveryCtl, 0, sizeof(XLogRecoveryCtlData));
SpinLockInit(&XLogRecoveryCtl->info_lck);
diff --git a/src/backend/access/transam/xlogwait.c b/src/backend/access/transam/xlogwait.c
index bf4630677b4..3af9f10133c 100644
--- a/src/backend/access/transam/xlogwait.c
+++ b/src/backend/access/transam/xlogwait.c
@@ -57,6 +57,7 @@
#include "storage/latch.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/pg_lsn.h"
#include "utils/snapmgr.h"
@@ -68,6 +69,14 @@ static int waitlsn_cmp(const pairingheap_node *a, const pairingheap_node *b,
struct WaitLSNState *waitLSNState = NULL;
+static void WaitLSNShmemRequest(void *arg);
+static void WaitLSNShmemInit(void *arg);
+
+const ShmemCallbacks WaitLSNShmemCallbacks = {
+ .request_fn = WaitLSNShmemRequest,
+ .init_fn = WaitLSNShmemInit,
+};
+
/*
* Wait event for each WaitLSNType, used with WaitLatch() to report
* the wait in pg_stat_activity.
@@ -109,41 +118,36 @@ GetCurrentLSNForWaitType(WaitLSNType lsnType)
pg_unreachable();
}
-/* Report the amount of shared memory space needed for WaitLSNState. */
-Size
-WaitLSNShmemSize(void)
+/* Register the shared memory space needed for WaitLSNState. */
+static void
+WaitLSNShmemRequest(void *arg)
{
+ static ShmemStructDesc WaitLSNShmemDesc;
Size size;
size = offsetof(WaitLSNState, procInfos);
size = add_size(size, mul_size(MaxBackends + NUM_AUXILIARY_PROCS, sizeof(WaitLSNProcInfo)));
- return size;
+ ShmemRequestStruct(&WaitLSNShmemDesc,
+ .name = "WaitLSNState",
+ .size = size,
+ .ptr = (void **) &waitLSNState,
+ );
}
/* Initialize the WaitLSNState in the shared memory. */
-void
-WaitLSNShmemInit(void)
+static void
+WaitLSNShmemInit(void *arg)
{
- bool found;
-
- waitLSNState = (WaitLSNState *) ShmemInitStruct("WaitLSNState",
- WaitLSNShmemSize(),
- &found);
- if (!found)
+ /* Initialize heaps and tracking */
+ for (int i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
{
- int i;
-
- /* Initialize heaps and tracking */
- for (i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
- {
- pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
- pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
- }
-
- /* Initialize process info array */
- memset(&waitLSNState->procInfos, 0,
- (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
+ pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
+ pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
}
+
+ /* Initialize process info array */
+ memset(&waitLSNState->procInfos, 0,
+ (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 8400e6722cc..911ea07c2a6 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -98,6 +98,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/fmgroids.h"
#include "utils/fmgrprotos.h"
@@ -309,6 +310,14 @@ typedef struct
static AutoVacuumShmemStruct *AutoVacuumShmem;
+static void AutoVacuumShmemRequest(void *arg);
+static void AutoVacuumShmemInit(void *arg);
+
+const ShmemCallbacks AutoVacuumShmemCallbacks = {
+ .request_fn = AutoVacuumShmemRequest,
+ .init_fn = AutoVacuumShmemInit,
+};
+
/*
* the database list (of avl_dbase elements) in the launcher, and the context
* that contains it
@@ -3545,12 +3554,13 @@ autovac_init(void)
}
/*
- * AutoVacuumShmemSize
- * Compute space needed for autovacuum-related shared memory
+ * AutoVacuumShmemRequest
+ * Register shared memory space needed for autovacuum
*/
-Size
-AutoVacuumShmemSize(void)
+static void
+AutoVacuumShmemRequest(void *arg)
{
+ static ShmemStructDesc AutoVacuumShmemDesc;
Size size;
/*
@@ -3560,53 +3570,42 @@ AutoVacuumShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
- return size;
+
+ ShmemRequestStruct(&AutoVacuumShmemDesc,
+ .name = "AutoVacuum Data",
+ .size = size,
+ .ptr = (void **) &AutoVacuumShmem,
+ );
}
/*
* AutoVacuumShmemInit
- * Allocate and initialize autovacuum-related shared memory
+ * Initialize autovacuum-related shared memory
*/
-void
-AutoVacuumShmemInit(void)
+static void
+AutoVacuumShmemInit(void *arg)
{
- bool found;
-
- AutoVacuumShmem = (AutoVacuumShmemStruct *)
- ShmemInitStruct("AutoVacuum Data",
- AutoVacuumShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- WorkerInfo worker;
- int i;
+ WorkerInfo worker;
- Assert(!found);
-
- AutoVacuumShmem->av_launcherpid = 0;
- dclist_init(&AutoVacuumShmem->av_freeWorkers);
- dlist_init(&AutoVacuumShmem->av_runningWorkers);
- AutoVacuumShmem->av_startingWorker = NULL;
- memset(AutoVacuumShmem->av_workItems, 0,
- sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
-
- worker = (WorkerInfo) ((char *) AutoVacuumShmem +
- MAXALIGN(sizeof(AutoVacuumShmemStruct)));
-
- /* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_worker_slots; i++)
- {
- dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
- pg_atomic_init_flag(&worker[i].wi_dobalance);
- }
+ AutoVacuumShmem->av_launcherpid = 0;
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
+ dlist_init(&AutoVacuumShmem->av_runningWorkers);
+ AutoVacuumShmem->av_startingWorker = NULL;
+ memset(AutoVacuumShmem->av_workItems, 0,
+ sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
- pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+ worker = (WorkerInfo) ((char *) AutoVacuumShmem +
+ MAXALIGN(sizeof(AutoVacuumShmemStruct)));
+ /* initialize the WorkerInfo free list */
+ for (int i = 0; i < autovacuum_worker_slots; i++)
+ {
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
+ pg_atomic_init_flag(&worker[i].wi_dobalance);
}
- else
- Assert(found);
+
+ pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
}
/*
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 536aff7ca05..1aa26967381 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -30,6 +30,7 @@
#include "storage/procarray.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/ascii.h"
#include "utils/memutils.h"
@@ -110,6 +111,14 @@ struct BackgroundWorkerHandle
static BackgroundWorkerArray *BackgroundWorkerData;
+static void BackgroundWorkerShmemRequest(void *arg);
+static void BackgroundWorkerShmemInit(void *arg);
+
+const ShmemCallbacks BackgroundWorkerShmemCallbacks = {
+ .request_fn = BackgroundWorkerShmemRequest,
+ .init_fn = BackgroundWorkerShmemInit,
+};
+
/*
* List of internal background worker entry points. We need this for
* reasons explained in LookupBackgroundWorkerFunction(), below.
@@ -160,77 +169,71 @@ static bgworker_main_type LookupBackgroundWorkerFunction(const char *libraryname
/*
- * Calculate shared memory needed.
+ * Register shared memory needed for background workers.
*/
-Size
-BackgroundWorkerShmemSize(void)
+static void
+BackgroundWorkerShmemRequest(void *arg)
{
+ static ShmemStructDesc BackgroundWorkerShmemDesc;
Size size;
/* Array of workers is variably sized. */
size = offsetof(BackgroundWorkerArray, slot);
size = add_size(size, mul_size(max_worker_processes,
sizeof(BackgroundWorkerSlot)));
-
- return size;
+ ShmemRequestStruct(&BackgroundWorkerShmemDesc,
+ .name = "Background Worker Data",
+ .size = size,
+ .ptr = (void **) &BackgroundWorkerData,
+ );
}
/*
- * Initialize shared memory.
+ * Initialize shared memory for background workers.
*/
-void
-BackgroundWorkerShmemInit(void)
+static void
+BackgroundWorkerShmemInit(void *arg)
{
- bool found;
-
- BackgroundWorkerData = ShmemInitStruct("Background Worker Data",
- BackgroundWorkerShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- dlist_iter iter;
- int slotno = 0;
+ dlist_iter iter;
+ int slotno = 0;
- BackgroundWorkerData->total_slots = max_worker_processes;
- BackgroundWorkerData->parallel_register_count = 0;
- BackgroundWorkerData->parallel_terminate_count = 0;
+ BackgroundWorkerData->total_slots = max_worker_processes;
+ BackgroundWorkerData->parallel_register_count = 0;
+ BackgroundWorkerData->parallel_terminate_count = 0;
- /*
- * Copy contents of worker list into shared memory. Record the shared
- * memory slot assigned to each worker. This ensures a 1-to-1
- * correspondence between the postmaster's private list and the array
- * in shared memory.
- */
- dlist_foreach(iter, &BackgroundWorkerList)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- RegisteredBgWorker *rw;
+ /*
+ * Copy contents of worker list into shared memory. Record the shared
+ * memory slot assigned to each worker. This ensures a 1-to-1
+ * correspondence between the postmaster's private list and the array in
+ * shared memory.
+ */
+ dlist_foreach(iter, &BackgroundWorkerList)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ RegisteredBgWorker *rw;
- rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
- Assert(slotno < max_worker_processes);
- slot->in_use = true;
- slot->terminate = false;
- slot->pid = InvalidPid;
- slot->generation = 0;
- rw->rw_shmem_slot = slotno;
- rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
- memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
- ++slotno;
- }
+ rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
+ Assert(slotno < max_worker_processes);
+ slot->in_use = true;
+ slot->terminate = false;
+ slot->pid = InvalidPid;
+ slot->generation = 0;
+ rw->rw_shmem_slot = slotno;
+ rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
+ memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
+ ++slotno;
+ }
- /*
- * Mark any remaining slots as not in use.
- */
- while (slotno < max_worker_processes)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ /*
+ * Mark any remaining slots as not in use.
+ */
+ while (slotno < max_worker_processes)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- slot->in_use = false;
- ++slotno;
- }
+ slot->in_use = false;
+ ++slotno;
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 3c982c6ffac..da62fcef421 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -63,6 +63,7 @@
#include "storage/shmem.h"
#include "storage/smgr.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/guc.h"
#include "utils/memutils.h"
@@ -143,6 +144,14 @@ typedef struct
static CheckpointerShmemStruct *CheckpointerShmem;
+static void CheckpointerShmemRequest(void *arg);
+static void CheckpointerShmemInit(void *arg);
+
+const ShmemCallbacks CheckpointerShmemCallbacks = {
+ .request_fn = CheckpointerShmemRequest,
+ .init_fn = CheckpointerShmemInit,
+};
+
/* interval for calling AbsorbSyncRequests in CheckpointWriteDelay */
#define WRITES_PER_ABSORB 1000
@@ -950,12 +959,13 @@ ReqShutdownXLOG(SIGNAL_ARGS)
*/
/*
- * CheckpointerShmemSize
- * Compute space needed for checkpointer-related shared memory
+ * CheckpointerShmemRequest
+ * Register shared memory space needed for checkpointer
*/
-Size
-CheckpointerShmemSize(void)
+static void
+CheckpointerShmemRequest(void *arg)
{
+ static ShmemStructDesc CheckpointerShmemDesc;
Size size;
/*
@@ -967,39 +977,25 @@ CheckpointerShmemSize(void)
size = add_size(size, mul_size(Min(NBuffers,
MAX_CHECKPOINT_REQUESTS),
sizeof(CheckpointerRequest)));
-
- return size;
+ ShmemRequestStruct(&CheckpointerShmemDesc,
+ .name = "Checkpointer Data",
+ .size = size,
+ .ptr = (void **) &CheckpointerShmem,
+ );
}
/*
* CheckpointerShmemInit
- * Allocate and initialize checkpointer-related shared memory
+ * Initialize checkpointer-related shared memory
*/
-void
-CheckpointerShmemInit(void)
+static void
+CheckpointerShmemInit(void *arg)
{
- Size size = CheckpointerShmemSize();
- bool found;
-
- CheckpointerShmem = (CheckpointerShmemStruct *)
- ShmemInitStruct("Checkpointer Data",
- size,
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. Note that we zero the whole
- * requests array; this is so that CompactCheckpointerRequestQueue can
- * assume that any pad bytes in the request structs are zeroes.
- */
- MemSet(CheckpointerShmem, 0, size);
- SpinLockInit(&CheckpointerShmem->ckpt_lck);
- CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
- CheckpointerShmem->head = CheckpointerShmem->tail = 0;
- ConditionVariableInit(&CheckpointerShmem->start_cv);
- ConditionVariableInit(&CheckpointerShmem->done_cv);
- }
+ SpinLockInit(&CheckpointerShmem->ckpt_lck);
+ CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
+ CheckpointerShmem->head = CheckpointerShmem->tail = 0;
+ ConditionVariableInit(&CheckpointerShmem->start_cv);
+ ConditionVariableInit(&CheckpointerShmem->done_cv);
}
/*
diff --git a/src/backend/postmaster/datachecksum_state.c b/src/backend/postmaster/datachecksum_state.c
index 76004bcedc6..96f04a8bfbc 100644
--- a/src/backend/postmaster/datachecksum_state.c
+++ b/src/backend/postmaster/datachecksum_state.c
@@ -211,6 +211,7 @@
#include "storage/lwlock.h"
#include "storage/procarray.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -346,6 +347,7 @@ static volatile sig_atomic_t launcher_running = false;
static DataChecksumsWorkerOperation operation;
/* Prototypes */
+static void DataChecksumsShmemRequest(void *arg);
static bool DatabaseExists(Oid dboid);
static List *BuildDatabaseList(void);
static List *BuildRelationList(bool temp_relations, bool include_shared);
@@ -356,6 +358,10 @@ static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferA
static void launcher_cancel_handler(SIGNAL_ARGS);
static void WaitForAllTransactionsToFinish(void);
+const ShmemCallbacks DataChecksumsShmemCallbacks = {
+ .request_fn = DataChecksumsShmemRequest,
+};
+
/*****************************************************************************
* Functionality for manipulating the data checksum state in the cluster
*/
@@ -1236,35 +1242,19 @@ ProcessAllDatabases(void)
}
/*
- * DataChecksumStateSize
- * Compute required space for datachecksumsworker-related shared memory
+ * DataChecksumsShmemRequest
+ * Request datachecksumsworker-related shared memory
*/
-Size
-DataChecksumsShmemSize(void)
-{
- Size size;
-
- size = sizeof(DataChecksumsStateStruct);
- size = MAXALIGN(size);
-
- return size;
-}
-
-/*
- * DataChecksumStateInit
- * Allocate and initialize datachecksumsworker-related shared memory
- */
-void
-DataChecksumsShmemInit(void)
+static void
+DataChecksumsShmemRequest(void *arg)
{
- bool found;
+ static ShmemStructDesc DataChecksumsStateStructShmemDesc;
- DataChecksumState = (DataChecksumsStateStruct *)
- ShmemInitStruct("DataChecksumsWorker Data",
- DataChecksumsShmemSize(),
- &found);
- if (!found)
- MemSet(DataChecksumState, 0, DataChecksumsShmemSize());
+ ShmemRequestStruct(&DataChecksumsStateStructShmemDesc,
+ .name = "DataChecksumsWorker Data",
+ .size = sizeof(DataChecksumsStateStruct),
+ .ptr = (void **) &DataChecksumState,
+ );
}
/*
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index fa4bdfe9ab9..888a53e96d0 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -48,6 +48,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
@@ -154,33 +155,34 @@ static int ready_file_comparator(Datum a, Datum b, void *arg);
static void LoadArchiveLibrary(void);
static void pgarch_call_module_shutdown_cb(int code, Datum arg);
-/* Report shared memory space needed by PgArchShmemInit */
-Size
-PgArchShmemSize(void)
-{
- Size size = 0;
-
- size = add_size(size, sizeof(PgArchData));
+static void PgArchShmemRequest(void *arg);
+static void PgArchShmemInit(void *arg);
- return size;
-}
+const ShmemCallbacks PgArchShmemCallbacks = {
+ .request_fn = PgArchShmemRequest,
+ .init_fn = PgArchShmemInit,
+};
-/* Allocate and initialize archiver-related shared memory */
-void
-PgArchShmemInit(void)
+/* Register shared memory space needed by the archiver */
+static void
+PgArchShmemRequest(void *arg)
{
- bool found;
+ static ShmemStructDesc PgArchShmemDesc;
- PgArch = (PgArchData *)
- ShmemInitStruct("Archiver Data", PgArchShmemSize(), &found);
+ ShmemRequestStruct(&PgArchShmemDesc,
+ .name = "Archiver Data",
+ .size = sizeof(PgArchData),
+ .ptr = (void **) &PgArch,
+ );
+}
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(PgArch, 0, PgArchShmemSize());
- PgArch->pgprocno = INVALID_PROC_NUMBER;
- pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
- }
+/* Initialize archiver-related shared memory */
+static void
+PgArchShmemInit(void *arg)
+{
+ MemSet(PgArch, 0, sizeof(PgArchData));
+ PgArch->pgprocno = INVALID_PROC_NUMBER;
+ pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
}
/*
diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c
index a37b3018abf..9f87bd0f243 100644
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -47,6 +47,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -109,6 +110,14 @@ typedef struct
/* Pointer to shared memory state. */
static WalSummarizerData *WalSummarizerCtl;
+static void WalSummarizerShmemRequest(void *arg);
+static void WalSummarizerShmemInit(void *arg);
+
+const ShmemCallbacks WalSummarizerShmemCallbacks = {
+ .request_fn = WalSummarizerShmemRequest,
+ .init_fn = WalSummarizerShmemInit,
+};
+
/*
* When we reach end of WAL and need to read more, we sleep for a number of
* milliseconds that is an integer multiple of MS_PER_SLEEP_QUANTUM. This is
@@ -168,43 +177,37 @@ static void summarizer_wait_for_wal(void);
static void MaybeRemoveOldWalSummaries(void);
/*
- * Amount of shared memory required for this module.
+ * Register shared memory space needed by this module.
*/
-Size
-WalSummarizerShmemSize(void)
+static void
+WalSummarizerShmemRequest(void *arg)
{
- return sizeof(WalSummarizerData);
+ static ShmemStructDesc WalSummarizerShmemDesc;
+
+ ShmemRequestStruct(&WalSummarizerShmemDesc,
+ .name = "Wal Summarizer Ctl",
+ .size = sizeof(WalSummarizerData),
+ .ptr = (void **) &WalSummarizerCtl,
+ );
}
/*
- * Create or attach to shared memory segment for this module.
+ * Initialize shared memory for this module.
*/
-void
-WalSummarizerShmemInit(void)
+static void
+WalSummarizerShmemInit(void *arg)
{
- bool found;
-
- WalSummarizerCtl = (WalSummarizerData *)
- ShmemInitStruct("Wal Summarizer Ctl", WalSummarizerShmemSize(),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize.
- *
- * We're just filling in dummy values here -- the real initialization
- * will happen when GetOldestUnsummarizedLSN() is called for the first
- * time.
- */
- WalSummarizerCtl->initialized = false;
- WalSummarizerCtl->summarized_tli = 0;
- WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
- WalSummarizerCtl->lsn_is_exact = false;
- WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
- WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
- ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
- }
+ /*
+ * We're just filling in dummy values here -- the real initialization will
+ * happen when GetOldestUnsummarizedLSN() is called for the first time.
+ */
+ WalSummarizerCtl->initialized = false;
+ WalSummarizerCtl->summarized_tli = 0;
+ WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
+ WalSummarizerCtl->lsn_is_exact = false;
+ WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
+ WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
+ ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
}
/*
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 09964198550..f1773279991 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -38,6 +38,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -71,6 +72,14 @@ typedef struct LogicalRepCtxStruct
static LogicalRepCtxStruct *LogicalRepCtx;
+static void ApplyLauncherShmemRequest(void *arg);
+static void ApplyLauncherShmemInit(void *arg);
+
+const ShmemCallbacks ApplyLauncherShmemCallbacks = {
+ .request_fn = ApplyLauncherShmemRequest,
+ .init_fn = ApplyLauncherShmemInit,
+};
+
/* an entry in the last-start-times shared hash table */
typedef struct LauncherLastStartTimesEntry
{
@@ -972,12 +981,13 @@ logicalrep_pa_worker_count(Oid subid)
}
/*
- * ApplyLauncherShmemSize
- * Compute space needed for replication launcher shared memory
+ * ApplyLauncherShmemRequest
+ * Register shared memory space needed for replication launcher
*/
-Size
-ApplyLauncherShmemSize(void)
+static void
+ApplyLauncherShmemRequest(void *arg)
{
+ static ShmemStructDesc ApplyLauncherShmemDesc;
Size size;
/*
@@ -987,7 +997,11 @@ ApplyLauncherShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_logical_replication_workers,
sizeof(LogicalRepWorker)));
- return size;
+ ShmemRequestStruct(&ApplyLauncherShmemDesc,
+ .name = "Logical Replication Launcher Data",
+ .size = size,
+ .ptr = (void **) &LogicalRepCtx,
+ );
}
/*
@@ -1028,35 +1042,23 @@ ApplyLauncherRegister(void)
/*
* ApplyLauncherShmemInit
- * Allocate and initialize replication launcher shared memory
+ * Initialize replication launcher shared memory
*/
-void
-ApplyLauncherShmemInit(void)
+static void
+ApplyLauncherShmemInit(void *arg)
{
- bool found;
+ int slot;
- LogicalRepCtx = (LogicalRepCtxStruct *)
- ShmemInitStruct("Logical Replication Launcher Data",
- ApplyLauncherShmemSize(),
- &found);
+ LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
+ LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
- if (!found)
+ /* Initialize memory and spin locks for each worker slot. */
+ for (slot = 0; slot < max_logical_replication_workers; slot++)
{
- int slot;
-
- memset(LogicalRepCtx, 0, ApplyLauncherShmemSize());
-
- LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
- LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
+ LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
- /* Initialize memory and spin locks for each worker slot. */
- for (slot = 0; slot < max_logical_replication_workers; slot++)
- {
- LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
-
- memset(worker, 0, sizeof(LogicalRepWorker));
- SpinLockInit(&worker->relmutex);
- }
+ memset(worker, 0, sizeof(LogicalRepWorker));
+ SpinLockInit(&worker->relmutex);
}
}
diff --git a/src/backend/replication/logical/logicalctl.c b/src/backend/replication/logical/logicalctl.c
index 4e292951201..98617b561df 100644
--- a/src/backend/replication/logical/logicalctl.c
+++ b/src/backend/replication/logical/logicalctl.c
@@ -72,6 +72,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
/*
@@ -98,6 +99,12 @@ typedef struct LogicalDecodingCtlData
static LogicalDecodingCtlData *LogicalDecodingCtl = NULL;
+static void LogicalDecodingCtlShmemRequest(void *arg);
+
+const ShmemCallbacks LogicalDecodingCtlShmemCallbacks = {
+ .request_fn = LogicalDecodingCtlShmemRequest,
+};
+
/*
* A process-local cache of LogicalDecodingCtl->xlog_logical_info. This is
* initialized at process startup, and updated when processing the process
@@ -120,23 +127,16 @@ static void update_xlog_logical_info(void);
static void abort_logical_decoding_activation(int code, Datum arg);
static void write_logical_decoding_status_update_record(bool status);
-Size
-LogicalDecodingCtlShmemSize(void)
-{
- return sizeof(LogicalDecodingCtlData);
-}
-
-void
-LogicalDecodingCtlShmemInit(void)
+static void
+LogicalDecodingCtlShmemRequest(void *arg)
{
- bool found;
-
- LogicalDecodingCtl = ShmemInitStruct("Logical decoding control",
- LogicalDecodingCtlShmemSize(),
- &found);
+ static ShmemStructDesc LogicalDecodingCtlShmemDesc;
- if (!found)
- MemSet(LogicalDecodingCtl, 0, LogicalDecodingCtlShmemSize());
+ ShmemRequestStruct(&LogicalDecodingCtlShmemDesc,
+ .name = "Logical decoding control",
+ .size = sizeof(LogicalDecodingCtlData),
+ .ptr = (void **) &LogicalDecodingCtl,
+ );
}
/*
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 661d68ad653..daa984330f1 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -88,6 +88,7 @@
#include "storage/fd.h"
#include "storage/ipc.h"
#include "storage/lmgr.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/guc.h"
@@ -176,6 +177,16 @@ ReplOriginXactState replorigin_xact_state = {
*/
static ReplicationState *replication_states;
+static void ReplicationOriginShmemRequest(void *arg);
+static void ReplicationOriginShmemInit(void *arg);
+static void ReplicationOriginShmemAttach(void *arg);
+
+const ShmemCallbacks ReplicationOriginShmemCallbacks = {
+ .request_fn = ReplicationOriginShmemRequest,
+ .init_fn = ReplicationOriginShmemInit,
+ .attach_fn = ReplicationOriginShmemAttach,
+};
+
/*
* Actual shared memory block (replication_states[] is now part of this).
*/
@@ -539,50 +550,50 @@ replorigin_by_oid(ReplOriginId roident, bool missing_ok, char **roname)
* ---------------------------------------------------------------------------
*/
-Size
-ReplicationOriginShmemSize(void)
+static void
+ReplicationOriginShmemRequest(void *arg)
{
+ static ShmemStructDesc ReplicationOriginShmemDesc;
Size size = 0;
if (max_active_replication_origins == 0)
- return size;
+ return;
size = add_size(size, offsetof(ReplicationStateCtl, states));
-
size = add_size(size,
mul_size(max_active_replication_origins, sizeof(ReplicationState)));
- return size;
+ ShmemRequestStruct(&ReplicationOriginShmemDesc,
+ .name = "ReplicationOriginState",
+ .size = size,
+ .ptr = (void **) &replication_states_ctl,
+ );
}
-void
-ReplicationOriginShmemInit(void)
+static void
+ReplicationOriginShmemInit(void *arg)
{
- bool found;
-
if (max_active_replication_origins == 0)
return;
- replication_states_ctl = (ReplicationStateCtl *)
- ShmemInitStruct("ReplicationOriginState",
- ReplicationOriginShmemSize(),
- &found);
replication_states = replication_states_ctl->states;
- if (!found)
- {
- int i;
+ replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
- MemSet(replication_states_ctl, 0, ReplicationOriginShmemSize());
+ for (int i = 0; i < max_active_replication_origins; i++)
+ {
+ LWLockInitialize(&replication_states[i].lock,
+ replication_states_ctl->tranche_id);
+ ConditionVariableInit(&replication_states[i].origin_cv);
+ }
+}
- replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
+static void
+ReplicationOriginShmemAttach(void *arg)
+{
+ if (max_active_replication_origins == 0)
+ return;
- for (i = 0; i < max_active_replication_origins; i++)
- {
- LWLockInitialize(&replication_states[i].lock,
- replication_states_ctl->tranche_id);
- ConditionVariableInit(&replication_states[i].origin_cv);
- }
- }
+ replication_states = replication_states_ctl->states;
}
/* ---------------------------------------------------------------------------
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index e75db69e3f6..ec8ab1b3d67 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -73,6 +73,7 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -118,6 +119,14 @@ typedef struct SlotSyncCtxStruct
static SlotSyncCtxStruct *SlotSyncCtx = NULL;
+static void SlotSyncShmemRequest(void *arg);
+static void SlotSyncShmemInit(void *arg);
+
+const ShmemCallbacks SlotSyncShmemCallbacks = {
+ .request_fn = SlotSyncShmemRequest,
+ .init_fn = SlotSyncShmemInit,
+};
+
/* GUC variable */
bool sync_replication_slots = false;
@@ -1828,32 +1837,29 @@ IsSyncingReplicationSlots(void)
}
/*
- * Amount of shared memory required for slot synchronization.
+ * Register shared memory space needed for slot synchronization.
*/
-Size
-SlotSyncShmemSize(void)
+static void
+SlotSyncShmemRequest(void *arg)
{
- return sizeof(SlotSyncCtxStruct);
+ static ShmemStructDesc SlotSyncShmemDesc;
+
+ ShmemRequestStruct(&SlotSyncShmemDesc,
+ .name = "Slot Sync Data",
+ .size = sizeof(SlotSyncCtxStruct),
+ .ptr = (void **) &SlotSyncCtx,
+ );
}
/*
- * Allocate and initialize the shared memory of slot synchronization.
+ * Initialize shared memory for slot synchronization.
*/
-void
-SlotSyncShmemInit(void)
+static void
+SlotSyncShmemInit(void *arg)
{
- Size size = SlotSyncShmemSize();
- bool found;
-
- SlotSyncCtx = (SlotSyncCtxStruct *)
- ShmemInitStruct("Slot Sync Data", size, &found);
-
- if (!found)
- {
- memset(SlotSyncCtx, 0, size);
- SlotSyncCtx->pid = InvalidPid;
- SpinLockInit(&SlotSyncCtx->mutex);
- }
+ memset(SlotSyncCtx, 0, sizeof(SlotSyncCtxStruct));
+ SlotSyncCtx->pid = InvalidPid;
+ SpinLockInit(&SlotSyncCtx->mutex);
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index a9092fc2382..ec44c99e04e 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
@@ -145,6 +146,14 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
+static void ReplicationSlotsShmemRequest(void *arg);
+static void ReplicationSlotsShmemInit(void *arg);
+
+const ShmemCallbacks ReplicationSlotsShmemCallbacks = {
+ .request_fn = ReplicationSlotsShmemRequest,
+ .init_fn = ReplicationSlotsShmemInit,
+};
+
/* My backend's replication slot in the shared memory array */
ReplicationSlot *MyReplicationSlot = NULL;
@@ -183,56 +192,43 @@ static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
/*
- * Report shared-memory space needed by ReplicationSlotsShmemInit.
+ * Register shared memory space needed for replication slots.
*/
-Size
-ReplicationSlotsShmemSize(void)
+static void
+ReplicationSlotsShmemRequest(void *arg)
{
- Size size = 0;
+ static ShmemStructDesc ReplicationSlotsShmemDesc;
+ Size size;
if (max_replication_slots == 0)
- return size;
+ return;
size = offsetof(ReplicationSlotCtlData, replication_slots);
size = add_size(size,
mul_size(max_replication_slots, sizeof(ReplicationSlot)));
-
- return size;
+ ShmemRequestStruct(&ReplicationSlotsShmemDesc,
+ .name = "ReplicationSlot Ctl",
+ .size = size,
+ .ptr = (void **) &ReplicationSlotCtl,
+ );
}
/*
- * Allocate and initialize shared memory for replication slots.
+ * Initialize shared memory for replication slots.
*/
-void
-ReplicationSlotsShmemInit(void)
+static void
+ReplicationSlotsShmemInit(void *arg)
{
- bool found;
-
- if (max_replication_slots == 0)
- return;
-
- ReplicationSlotCtl = (ReplicationSlotCtlData *)
- ShmemInitStruct("ReplicationSlot Ctl", ReplicationSlotsShmemSize(),
- &found);
-
- if (!found)
+ for (int i = 0; i < max_replication_slots; i++)
{
- int i;
+ ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
- /* First time through, so initialize */
- MemSet(ReplicationSlotCtl, 0, ReplicationSlotsShmemSize());
-
- for (i = 0; i < max_replication_slots; i++)
- {
- ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
-
- /* everything else is zeroed by the memset above */
- slot->active_proc = INVALID_PROC_NUMBER;
- SpinLockInit(&slot->mutex);
- LWLockInitialize(&slot->io_in_progress_lock,
- LWTRANCHE_REPLICATION_SLOT_IO);
- ConditionVariableInit(&slot->active_cv);
- }
+ /* everything else is assumed to be already zeroed */
+ slot->active_proc = INVALID_PROC_NUMBER;
+ SpinLockInit(&slot->mutex);
+ LWLockInitialize(&slot->io_in_progress_lock,
+ LWTRANCHE_REPLICATION_SLOT_IO);
+ ConditionVariableInit(&slot->active_cv);
}
}
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 45b9d4f09f2..6d506fc3f43 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -29,47 +29,49 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
WalRcvData *WalRcv = NULL;
+static void WalRcvShmemRequest(void *arg);
+static void WalRcvShmemInit(void *arg);
+
+const ShmemCallbacks WalRcvShmemCallbacks = {
+ .request_fn = WalRcvShmemRequest,
+ .init_fn = WalRcvShmemInit,
+};
+
/*
* How long to wait for walreceiver to start up after requesting
* postmaster to launch it. In seconds.
*/
#define WALRCV_STARTUP_TIMEOUT 10
-/* Report shared memory space needed by WalRcvShmemInit */
-Size
-WalRcvShmemSize(void)
+/* Register shared memory space needed by walreceiver */
+static void
+WalRcvShmemRequest(void *arg)
{
- Size size = 0;
+ static ShmemStructDesc WalRcvShmemDesc;
- size = add_size(size, sizeof(WalRcvData));
-
- return size;
+ ShmemRequestStruct(&WalRcvShmemDesc,
+ .name = "Wal Receiver Ctl",
+ .size = sizeof(WalRcvData),
+ .ptr = (void **) &WalRcv,
+ );
}
-/* Allocate and initialize walreceiver-related shared memory */
-void
-WalRcvShmemInit(void)
+/* Initialize walreceiver-related shared memory */
+static void
+WalRcvShmemInit(void *arg)
{
- bool found;
-
- WalRcv = (WalRcvData *)
- ShmemInitStruct("Wal Receiver Ctl", WalRcvShmemSize(), &found);
-
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(WalRcv, 0, WalRcvShmemSize());
- WalRcv->walRcvState = WALRCV_STOPPED;
- ConditionVariableInit(&WalRcv->walRcvStoppedCV);
- SpinLockInit(&WalRcv->mutex);
- pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
- WalRcv->procno = INVALID_PROC_NUMBER;
- }
+ MemSet(WalRcv, 0, sizeof(WalRcvData));
+ WalRcv->walRcvState = WALRCV_STOPPED;
+ ConditionVariableInit(&WalRcv->walRcvStoppedCV);
+ SpinLockInit(&WalRcv->mutex);
+ pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
+ WalRcv->procno = INVALID_PROC_NUMBER;
}
/* Is walreceiver running (or starting up)? */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2bb3f34dc6d..b67daaa0a6c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -86,6 +86,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/dest.h"
#include "tcop/tcopprot.h"
#include "utils/acl.h"
@@ -117,6 +118,14 @@
/* Array of WalSnds in shared memory */
WalSndCtlData *WalSndCtl = NULL;
+static void WalSndShmemRequest(void *arg);
+static void WalSndShmemInit(void *arg);
+
+const ShmemCallbacks WalSndShmemCallbacks = {
+ .request_fn = WalSndShmemRequest,
+ .init_fn = WalSndShmemInit,
+};
+
/* My slot in the shared memory array */
WalSnd *MyWalSnd = NULL;
@@ -3765,47 +3774,39 @@ WalSndSignals(void)
pqsignal(SIGCHLD, SIG_DFL);
}
-/* Report shared-memory space needed by WalSndShmemInit */
-Size
-WalSndShmemSize(void)
+/* Register shared-memory space needed by walsender */
+static void
+WalSndShmemRequest(void *arg)
{
- Size size = 0;
+ static ShmemStructDesc WalSndShmemDesc;
+ Size size;
size = offsetof(WalSndCtlData, walsnds);
size = add_size(size, mul_size(max_wal_senders, sizeof(WalSnd)));
-
- return size;
+ ShmemRequestStruct(&WalSndShmemDesc,
+ .name = "Wal Sender Ctl",
+ .size = size,
+ .ptr = (void **) &WalSndCtl,
+ );
}
-/* Allocate and initialize walsender-related shared memory */
-void
-WalSndShmemInit(void)
+/* Initialize walsender-related shared memory */
+static void
+WalSndShmemInit(void *arg)
{
- bool found;
- int i;
+ for (int i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
+ dlist_init(&(WalSndCtl->SyncRepQueue[i]));
- WalSndCtl = (WalSndCtlData *)
- ShmemInitStruct("Wal Sender Ctl", WalSndShmemSize(), &found);
-
- if (!found)
+ for (int i = 0; i < max_wal_senders; i++)
{
- /* First time through, so initialize */
- MemSet(WalSndCtl, 0, WalSndShmemSize());
-
- for (i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
- dlist_init(&(WalSndCtl->SyncRepQueue[i]));
-
- for (i = 0; i < max_wal_senders; i++)
- {
- WalSnd *walsnd = &WalSndCtl->walsnds[i];
-
- SpinLockInit(&walsnd->mutex);
- }
+ WalSnd *walsnd = &WalSndCtl->walsnds[i];
- ConditionVariableInit(&WalSndCtl->wal_flush_cv);
- ConditionVariableInit(&WalSndCtl->wal_replay_cv);
- ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
+ SpinLockInit(&walsnd->mutex);
}
+
+ ConditionVariableInit(&WalSndCtl->wal_flush_cv);
+ ConditionVariableInit(&WalSndCtl->wal_replay_cv);
+ ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index ecc5388ad47..a7fde605f1b 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -14,41 +14,16 @@
*/
#include "postgres.h"
-#include "access/clog.h"
-#include "access/commit_ts.h"
-#include "access/multixact.h"
-#include "access/nbtree.h"
-#include "access/subtrans.h"
-#include "access/syncscan.h"
-#include "access/twophase.h"
-#include "access/xlogprefetcher.h"
-#include "access/xlogrecovery.h"
-#include "access/xlogwait.h"
-#include "commands/async.h"
#include "miscadmin.h"
#include "pgstat.h"
-#include "postmaster/autovacuum.h"
-#include "postmaster/bgworker_internals.h"
-#include "postmaster/bgwriter.h"
-#include "postmaster/datachecksum_state.h"
-#include "postmaster/walsummarizer.h"
-#include "replication/logicallauncher.h"
-#include "replication/origin.h"
-#include "replication/slot.h"
-#include "replication/slotsync.h"
-#include "replication/walreceiver.h"
-#include "replication/walsender.h"
-#include "storage/aio_subsys.h"
#include "storage/dsm.h"
#include "storage/ipc.h"
+#include "storage/lock.h"
#include "storage/pg_shmem.h"
#include "storage/pmsignal.h"
-#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
-#include "utils/injection_point.h"
-#include "utils/wait_event.h"
/* GUCs */
int shared_memory_type = DEFAULT_SHARED_MEMORY_TYPE;
@@ -57,8 +32,6 @@ shmem_startup_hook_type shmem_startup_hook = NULL;
static Size total_addin_request = 0;
-static void CreateOrAttachShmemStructs(void);
-
/*
* RequestAddinShmemSpace
* Request that extra shmem space be allocated for use by
@@ -97,33 +70,6 @@ CalculateShmemSize(void)
size = 100000;
size = add_size(size, ShmemGetRequestedSize());
- /* legacy subsystems */
- size = add_size(size, LockManagerShmemSize());
- size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, XLOGShmemSize());
- size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, TwoPhaseShmemSize());
- size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, CheckpointerShmemSize());
- size = add_size(size, AutoVacuumShmemSize());
- size = add_size(size, ReplicationSlotsShmemSize());
- size = add_size(size, ReplicationOriginShmemSize());
- size = add_size(size, WalSndShmemSize());
- size = add_size(size, WalRcvShmemSize());
- size = add_size(size, WalSummarizerShmemSize());
- size = add_size(size, PgArchShmemSize());
- size = add_size(size, ApplyLauncherShmemSize());
- size = add_size(size, BTreeShmemSize());
- size = add_size(size, SyncScanShmemSize());
- size = add_size(size, StatsShmemSize());
- size = add_size(size, WaitEventCustomShmemSize());
- size = add_size(size, InjectionPointShmemSize());
- size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, WaitLSNShmemSize());
- size = add_size(size, LogicalDecodingCtlShmemSize());
- size = add_size(size, DataChecksumsShmemSize());
-
/* include additional requested shmem from preload libraries */
size = add_size(size, total_addin_request);
@@ -157,7 +103,6 @@ AttachSharedMemoryStructs(void)
/* Establish pointers to all shared memory areas in this backend */
ShmemAttachRequested();
- CreateOrAttachShmemStructs();
/*
* Now give loadable modules a chance to set up their shmem allocations
@@ -204,9 +149,6 @@ CreateSharedMemoryAndSemaphores(void)
/* Initialize all shmem areas */
ShmemInitRequested();
- /* Initialize legacy subsystems */
- CreateOrAttachShmemStructs();
-
/* Initialize dynamic shared memory facilities. */
dsm_postmaster_startup(shim);
@@ -237,70 +179,6 @@ RegisterBuiltinShmemCallbacks(void)
#undef PG_SHMEM_SUBSYSTEM
}
-/*
- * Initialize various subsystems, setting up their data structures in
- * shared memory.
- *
- * This is called by the postmaster or by a standalone backend.
- * It is also called by a backend forked from the postmaster in the
- * EXEC_BACKEND case. In the latter case, the shared memory segment
- * already exists and has been physically attached to, but we have to
- * initialize pointers in local memory that reference the shared structures,
- * because we didn't inherit the correct pointer values from the postmaster
- * as we do in the fork() scenario. The easiest way to do that is to run
- * through the same code as before. (Note that the called routines mostly
- * check IsUnderPostmaster, rather than EXEC_BACKEND, to detect this case.
- * This is a bit code-wasteful and could be cleaned up.)
- */
-static void
-CreateOrAttachShmemStructs(void)
-{
- /*
- * Set up xlog, clog, and buffers
- */
- XLOGShmemInit();
- XLogPrefetchShmemInit();
- XLogRecoveryShmemInit();
-
- /*
- * Set up lock manager
- */
- LockManagerShmemInit();
-
- /*
- * Set up process table
- */
- BackendStatusShmemInit();
- TwoPhaseShmemInit();
- BackgroundWorkerShmemInit();
-
- /*
- * Set up interprocess signaling mechanisms
- */
- CheckpointerShmemInit();
- AutoVacuumShmemInit();
- ReplicationSlotsShmemInit();
- ReplicationOriginShmemInit();
- WalSndShmemInit();
- WalRcvShmemInit();
- WalSummarizerShmemInit();
- PgArchShmemInit();
- ApplyLauncherShmemInit();
- SlotSyncShmemInit();
- DataChecksumsShmemInit();
-
- /*
- * Set up other modules that need some shared memory space
- */
- BTreeShmemInit();
- SyncScanShmemInit();
- StatsShmemInit();
- WaitEventCustomShmemInit();
- InjectionPointShmemInit();
- WaitLSNShmemInit();
- LogicalDecodingCtlShmemInit();
-}
-
/*
* InitializeShmemGUCs
*
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 9dae407981f..3f1724ec6bd 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -43,8 +43,10 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/resowner.h"
@@ -312,6 +314,14 @@ typedef struct
static volatile FastPathStrongRelationLockData *FastPathStrongRelationLocks;
+static void LockManagerShmemRequest(void *arg);
+static void LockManagerShmemInit(void *arg);
+
+const ShmemCallbacks LockManagerShmemCallbacks = {
+ .request_fn = LockManagerShmemRequest,
+ .init_fn = LockManagerShmemInit,
+};
+
/*
* Pointers to hash tables containing lock state
@@ -409,6 +419,7 @@ PROCLOCK_PRINT(const char *where, const PROCLOCK *proclockP)
static uint32 proclock_hash(const void *key, Size keysize);
+
static void RemoveLocalLock(LOCALLOCK *locallock);
static PROCLOCK *SetupLockInTable(LockMethod lockMethodTable, PGPROC *proc,
const LOCKTAG *locktag, uint32 hashcode, LOCKMODE lockmode);
@@ -432,21 +443,18 @@ static void GetSingleProcBlockerStatusData(PGPROC *blocked_proc,
/*
- * Initialize the lock manager's shmem data structures.
+ * Register the lock manager's shmem data structures.
*
- * This is called from CreateSharedMemoryAndSemaphores(), which see for more
- * comments. In the normal postmaster case, the shared hash tables are
- * created here, and backends inherit pointers to them via fork(). In the
- * EXEC_BACKEND case, each backend re-executes this code to obtain pointers to
- * the already existing shared hash tables. In either case, each backend must
- * also call InitLockManagerAccess() to create the locallock hash table.
+ * In addition to this, each backend must also call InitLockManagerAccess() to
+ * create the locallock hash table.
*/
-void
-LockManagerShmemInit(void)
+static void
+LockManagerShmemRequest(void *arg)
{
- HASHCTL info;
+ static ShmemHashDesc LockHashDesc;
+ static ShmemHashDesc ProcLockHashDesc;
+ static ShmemStructDesc FastPathShmemDesc;
int64 max_table_size;
- bool found;
/*
* Compute sizes for lock hashtables. Note that these calculations must
@@ -455,45 +463,51 @@ LockManagerShmemInit(void)
max_table_size = NLOCKENTS();
/*
- * Allocate hash table for LOCK structs. This stores per-locked-object
+ * Hash table for LOCK structs. This stores per-locked-object
* information.
*/
- info.keysize = sizeof(LOCKTAG);
- info.entrysize = sizeof(LOCK);
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodLockHash = ShmemInitHash("LOCK hash",
- max_table_size,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ ShmemRequestHash(&LockHashDesc,
+ .name = "LOCK hash",
+ .nelems = max_table_size,
+ .ptr = &LockMethodLockHash,
+ .hash_info.keysize = sizeof(LOCKTAG),
+ .hash_info.entrysize = sizeof(LOCK),
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION,
+ );
/* Assume an average of 2 holders per lock */
max_table_size *= 2;
- /*
- * Allocate hash table for PROCLOCK structs. This stores
- * per-lock-per-holder information.
- */
- info.keysize = sizeof(PROCLOCKTAG);
- info.entrysize = sizeof(PROCLOCK);
- info.hash = proclock_hash;
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodProcLockHash = ShmemInitHash("PROCLOCK hash",
- max_table_size,
- &info,
- HASH_ELEM | HASH_FUNCTION |
- HASH_FIXED_SIZE | HASH_PARTITION);
+ ShmemRequestHash(&ProcLockHashDesc,
+ .name = "PROCLOCK hash",
+ .nelems = max_table_size,
+ .ptr = &LockMethodProcLockHash,
+ .hash_info.keysize = sizeof(PROCLOCKTAG),
+ .hash_info.entrysize = sizeof(PROCLOCK),
+ .hash_info.hash = proclock_hash,
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION,
+ );
+
+ ShmemRequestStruct(&FastPathShmemDesc,
+ .name = "Fast Path Strong Relation Lock Data",
+ .size = sizeof(FastPathStrongRelationLockData),
+ .ptr = (void **) (void *) &FastPathStrongRelationLocks,
+ );
/*
- * Allocate fast-path structures.
+ * FIXME: we used to do this in the size calculation:
+ *
+ * // Since NLOCKENTS is only an estimate, add 10% safety margin.
+ * size = add_size(size, size / 10);
*/
- FastPathStrongRelationLocks =
- ShmemInitStruct("Fast Path Strong Relation Lock Data",
- sizeof(FastPathStrongRelationLockData), &found);
- if (!found)
- SpinLockInit(&FastPathStrongRelationLocks->mutex);
+}
+
+static void
+LockManagerShmemInit(void *arg)
+{
+ SpinLockInit(&FastPathStrongRelationLocks->mutex);
}
/*
@@ -3758,25 +3772,6 @@ PostPrepare_Locks(FullTransactionId fxid)
}
-/*
- * Estimate shared-memory space used for lock tables
- */
-Size
-LockManagerShmemSize(void)
-{
- Size size = 0;
- long max_table_size;
-
- /* lock hash table */
- max_table_size = NLOCKENTS();
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(LOCK)));
-
- /* proclock hash table */
- max_table_size *= 2;
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(PROCLOCK)));
-
- return size;
-}
/*
* GetLockStatusData - Return a summary of the lock manager's internal
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index cd087129469..9637c622d6e 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -18,7 +18,9 @@
#include "pgstat.h"
#include "storage/ipc.h"
#include "storage/proc.h" /* for MyProc */
+#include "storage/shmem.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/ascii.h"
#include "utils/guc.h" /* for application_name */
#include "utils/memutils.h"
@@ -73,133 +75,114 @@ static void pgstat_beshutdown_hook(int code, Datum arg);
static void pgstat_read_current_status(void);
static void pgstat_setup_backend_status_context(void);
+static void BackendStatusShmemRequest(void *arg);
+static void BackendStatusShmemInit(void *arg);
+static void BackendStatusShmemAttach(void *arg);
+
+const ShmemCallbacks BackendStatusShmemCallbacks = {
+ .request_fn = BackendStatusShmemRequest,
+ .init_fn = BackendStatusShmemInit,
+ .attach_fn = BackendStatusShmemAttach,
+};
/*
- * Report shared-memory space needed by BackendStatusShmemInit.
+ * Register shared memory needs for backend status reporting.
*/
-Size
-BackendStatusShmemSize(void)
+static void
+BackendStatusShmemRequest(void *arg)
{
- Size size;
-
- /* BackendStatusArray: */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- /* BackendAppnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendClientHostnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendActivityBuffer: */
- size = add_size(size,
- mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
+ static ShmemStructDesc BackendStatusArrayShmemDesc;
+ static ShmemStructDesc BackendAppnameBufferShmemDesc;
+ static ShmemStructDesc BackendClientHostnameBufferShmemDesc;
+ static ShmemStructDesc BackendActivityBufferSizeShmemDesc;
#ifdef USE_SSL
- /* BackendSslStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots));
+ static ShmemStructDesc BackendSslStatusBufferShmemDesc;
#endif
#ifdef ENABLE_GSS
- /* BackendGssStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots));
+ static ShmemStructDesc BackendGssStatusBufferShmemDesc;
+#endif
+
+ ShmemRequestStruct(&BackendStatusArrayShmemDesc,
+ .name = "Backend Status Array",
+ .size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendStatusArray,
+ );
+
+ ShmemRequestStruct(&BackendAppnameBufferShmemDesc,
+ .name = "Backend Application Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendAppnameBuffer,
+ );
+
+ ShmemRequestStruct(&BackendClientHostnameBufferShmemDesc,
+ .name = "Backend Client Host Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendClientHostnameBuffer,
+ );
+
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+ ShmemRequestStruct(&BackendActivityBufferSizeShmemDesc,
+ .name = "Backend Activity Buffer",
+ .size = BackendActivityBufferSize,
+ .ptr = (void **) &BackendActivityBuffer,
+ );
+
+#ifdef USE_SSL
+ ShmemRequestStruct(&BackendSslStatusBufferShmemDesc,
+ .name = "Backend SSL Status Buffer",
+ .size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendSslStatusBuffer,
+ );
+#endif
+
+#ifdef ENABLE_GSS
+ ShmemRequestStruct(&BackendGssStatusBufferShmemDesc,
+ .name = "Backend GSS Status Buffer",
+ .size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendGssStatusBuffer,
+ );
#endif
- return size;
}
/*
* Initialize the shared status array and several string buffers
* during postmaster startup.
*/
-void
-BackendStatusShmemInit(void)
+static void
+BackendStatusShmemInit(void *arg)
{
- Size size;
- bool found;
int i;
char *buffer;
- /* Create or attach to the shared array */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- BackendStatusArray = (PgBackendStatus *)
- ShmemInitStruct("Backend Status Array", size, &found);
-
- if (!found)
+ /* Initialize st_appname pointers. */
+ buffer = BackendAppnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- /*
- * We're the first - initialize.
- */
- MemSet(BackendStatusArray, 0, size);
+ BackendStatusArray[i].st_appname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared appname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendAppnameBuffer = (char *)
- ShmemInitStruct("Backend Application Name Buffer", size, &found);
-
- if (!found)
+ /* Initialize st_clienthostname pointers. */
+ buffer = BackendClientHostnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendAppnameBuffer, 0, size);
-
- /* Initialize st_appname pointers. */
- buffer = BackendAppnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_appname = buffer;
- buffer += NAMEDATALEN;
- }
+ BackendStatusArray[i].st_clienthostname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared client hostname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendClientHostnameBuffer = (char *)
- ShmemInitStruct("Backend Client Host Name Buffer", size, &found);
-
- if (!found)
+ /* Initialize st_activity pointers. */
+ buffer = BackendActivityBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendClientHostnameBuffer, 0, size);
-
- /* Initialize st_clienthostname pointers. */
- buffer = BackendClientHostnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_clienthostname = buffer;
- buffer += NAMEDATALEN;
- }
- }
-
- /* Create or attach to the shared activity buffer */
- BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
- NumBackendStatSlots);
- BackendActivityBuffer = (char *)
- ShmemInitStruct("Backend Activity Buffer",
- BackendActivityBufferSize,
- &found);
-
- if (!found)
- {
- MemSet(BackendActivityBuffer, 0, BackendActivityBufferSize);
-
- /* Initialize st_activity pointers. */
- buffer = BackendActivityBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_activity_raw = buffer;
- buffer += pgstat_track_activity_query_size;
- }
+ BackendStatusArray[i].st_activity_raw = buffer;
+ buffer += pgstat_track_activity_query_size;
}
#ifdef USE_SSL
- /* Create or attach to the shared SSL status buffer */
- size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots);
- BackendSslStatusBuffer = (PgBackendSSLStatus *)
- ShmemInitStruct("Backend SSL Status Buffer", size, &found);
-
- if (!found)
{
PgBackendSSLStatus *ptr;
- MemSet(BackendSslStatusBuffer, 0, size);
-
/* Initialize st_sslstatus pointers. */
ptr = BackendSslStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -211,17 +194,9 @@ BackendStatusShmemInit(void)
#endif
#ifdef ENABLE_GSS
- /* Create or attach to the shared GSSAPI status buffer */
- size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots);
- BackendGssStatusBuffer = (PgBackendGSSStatus *)
- ShmemInitStruct("Backend GSS Status Buffer", size, &found);
-
- if (!found)
{
PgBackendGSSStatus *ptr;
- MemSet(BackendGssStatusBuffer, 0, size);
-
/* Initialize st_gssstatus pointers. */
ptr = BackendGssStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -233,6 +208,13 @@ BackendStatusShmemInit(void)
#endif
}
+static void
+BackendStatusShmemAttach(void *arg)
+{
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+}
+
/*
* Initialize pgstats backend activity state, and set up our on-proc-exit
* hook. Called from InitPostgres and AuxiliaryProcessMain. MyProcNumber must
diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c
index 33fbdca9609..602ec4978c6 100644
--- a/src/backend/utils/activity/pgstat_shmem.c
+++ b/src/backend/utils/activity/pgstat_shmem.c
@@ -14,6 +14,7 @@
#include "pgstat.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
@@ -57,6 +58,13 @@ static void pgstat_release_matching_entry_refs(bool discard_pending, ReleaseMatc
static void pgstat_setup_memcxt(void);
+static void StatsShmemRequest(void *arg);
+static void StatsShmemInit(void *arg);
+
+const ShmemCallbacks StatsShmemCallbacks = {
+ .request_fn = StatsShmemRequest,
+ .init_fn = StatsShmemInit,
+};
/* parameter for the shared hash */
static const dshash_parameters dsh_params = {
@@ -123,7 +131,7 @@ pgstat_dsa_init_size(void)
/*
* Compute shared memory space needed for cumulative statistics
*/
-Size
+static Size
StatsShmemSize(void)
{
Size sz;
@@ -150,101 +158,100 @@ StatsShmemSize(void)
}
/*
- * Initialize cumulative statistics system during startup
+ * Register shared memory area for cumulative statistics
*/
-void
-StatsShmemInit(void)
+static void
+StatsShmemRequest(void *arg)
{
- bool found;
- Size sz;
-
- sz = StatsShmemSize();
- pgStatLocal.shmem = (PgStat_ShmemControl *)
- ShmemInitStruct("Shared Memory Stats", sz, &found);
+ static ShmemStructDesc StatsShmemDesc;
- if (!IsUnderPostmaster)
- {
- dsa_area *dsa;
- dshash_table *dsh;
- PgStat_ShmemControl *ctl = pgStatLocal.shmem;
- char *p = (char *) ctl;
+ ShmemRequestStruct(&StatsShmemDesc,
+ .name = "Shared Memory Stats",
+ .size = StatsShmemSize(),
+ .ptr = (void **) &pgStatLocal.shmem,
+ );
+}
- Assert(!found);
+/*
+ * Initialize cumulative statistics system during startup
+ */
+static void
+StatsShmemInit(void *arg)
+{
+ dsa_area *dsa;
+ dshash_table *dsh;
+ PgStat_ShmemControl *ctl = pgStatLocal.shmem;
+ char *p = (char *) ctl;
- /* the allocation of pgStatLocal.shmem itself */
- p += MAXALIGN(sizeof(PgStat_ShmemControl));
+ /* the allocation of pgStatLocal.shmem itself */
+ p += MAXALIGN(sizeof(PgStat_ShmemControl));
- /*
- * Create a small dsa allocation in plain shared memory. This is
- * required because postmaster cannot use dsm segments. It also
- * provides a small efficiency win.
- */
- ctl->raw_dsa_area = p;
- dsa = dsa_create_in_place(ctl->raw_dsa_area,
- pgstat_dsa_init_size(),
- LWTRANCHE_PGSTATS_DSA, NULL);
- dsa_pin(dsa);
+ /*
+ * Create a small dsa allocation in plain shared memory. This is required
+ * because postmaster cannot use dsm segments. It also provides a small
+ * efficiency win.
+ */
+ ctl->raw_dsa_area = p;
+ dsa = dsa_create_in_place(ctl->raw_dsa_area,
+ pgstat_dsa_init_size(),
+ LWTRANCHE_PGSTATS_DSA, NULL);
+ dsa_pin(dsa);
- /*
- * To ensure dshash is created in "plain" shared memory, temporarily
- * limit size of dsa to the initial size of the dsa.
- */
- dsa_set_size_limit(dsa, pgstat_dsa_init_size());
+ /*
+ * To ensure dshash is created in "plain" shared memory, temporarily limit
+ * size of dsa to the initial size of the dsa.
+ */
+ dsa_set_size_limit(dsa, pgstat_dsa_init_size());
- /*
- * With the limit in place, create the dshash table. XXX: It'd be nice
- * if there were dshash_create_in_place().
- */
- dsh = dshash_create(dsa, &dsh_params, NULL);
- ctl->hash_handle = dshash_get_hash_table_handle(dsh);
+ /*
+ * With the limit in place, create the dshash table. XXX: It'd be nice if
+ * there were dshash_create_in_place().
+ */
+ dsh = dshash_create(dsa, &dsh_params, NULL);
+ ctl->hash_handle = dshash_get_hash_table_handle(dsh);
- /* lift limit set above */
- dsa_set_size_limit(dsa, -1);
+ /* lift limit set above */
+ dsa_set_size_limit(dsa, -1);
- /*
- * Postmaster will never access these again, thus free the local
- * dsa/dshash references.
- */
- dshash_detach(dsh);
- dsa_detach(dsa);
+ /*
+ * Postmaster will never access these again, thus free the local
+ * dsa/dshash references.
+ */
+ dshash_detach(dsh);
+ dsa_detach(dsa);
- pg_atomic_init_u64(&ctl->gc_request_count, 1);
+ pg_atomic_init_u64(&ctl->gc_request_count, 1);
- /* Do the per-kind initialization */
- for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
- {
- const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
- char *ptr;
+ /* Do the per-kind initialization */
+ for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ {
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+ char *ptr;
- if (!kind_info)
- continue;
+ if (!kind_info)
+ continue;
- /* initialize entry count tracking */
- if (kind_info->track_entry_count)
- pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
+ /* initialize entry count tracking */
+ if (kind_info->track_entry_count)
+ pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
- /* initialize fixed-numbered stats */
- if (kind_info->fixed_amount)
+ /* initialize fixed-numbered stats */
+ if (kind_info->fixed_amount)
+ {
+ if (pgstat_is_kind_builtin(kind))
+ ptr = ((char *) ctl) + kind_info->shared_ctl_off;
+ else
{
- if (pgstat_is_kind_builtin(kind))
- ptr = ((char *) ctl) + kind_info->shared_ctl_off;
- else
- {
- int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
-
- Assert(kind_info->shared_size != 0);
- ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
- ptr = ctl->custom_data[idx];
- }
-
- kind_info->init_shmem_cb(ptr);
+ int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
+
+ Assert(kind_info->shared_size != 0);
+ ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
+ ptr = ctl->custom_data[idx];
}
+
+ kind_info->init_shmem_cb(ptr);
}
}
- else
- {
- Assert(found);
- }
}
void
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index 2b76967776c..eb264634217 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -25,6 +25,7 @@
#include "storage/lmgr.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "storage/spin.h"
#include "utils/wait_event.h"
@@ -95,59 +96,54 @@ static WaitEventCustomCounterData *WaitEventCustomCounter;
static uint32 WaitEventCustomNew(uint32 classId, const char *wait_event_name);
static const char *GetWaitEventCustomIdentifier(uint32 wait_event_info);
+static void WaitEventCustomShmemRequest(void *arg);
+static void WaitEventCustomShmemInit(void *arg);
+
+const ShmemCallbacks WaitEventCustomShmemCallbacks = {
+ .request_fn = WaitEventCustomShmemRequest,
+ .init_fn = WaitEventCustomShmemInit,
+};
+
/*
- * Return the space for dynamic shared hash tables and dynamic allocation counter.
+ * Register shmem space for dynamic shared hash and dynamic allocation counter.
*/
-Size
-WaitEventCustomShmemSize(void)
+static void
+WaitEventCustomShmemRequest(void *arg)
{
- Size sz;
-
- sz = MAXALIGN(sizeof(WaitEventCustomCounterData));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_SIZE,
- sizeof(WaitEventCustomEntryByInfo)));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_SIZE,
- sizeof(WaitEventCustomEntryByName)));
- return sz;
+ static ShmemStructDesc WaitEventCustomCounterShmemDesc;
+ static ShmemHashDesc WaitEventCustomHashByInfoDesc;
+ static ShmemHashDesc WaitEventCustomHashByNameDesc;
+
+ ShmemRequestStruct(&WaitEventCustomCounterShmemDesc,
+ .name = "WaitEventCustomCounterData",
+ .size = sizeof(WaitEventCustomCounterData),
+ .ptr = (void **) &WaitEventCustomCounter,
+ );
+ ShmemRequestHash(&WaitEventCustomHashByInfoDesc,
+ .name = "WaitEventCustom hash by wait event information",
+ .ptr = &WaitEventCustomHashByInfo,
+ .nelems = WAIT_EVENT_CUSTOM_HASH_SIZE,
+ .hash_info.keysize = sizeof(uint32),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByInfo),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ );
+ ShmemRequestHash(&WaitEventCustomHashByNameDesc,
+ .name = "WaitEventCustom hash by name",
+ .ptr = &WaitEventCustomHashByName,
+ .nelems = WAIT_EVENT_CUSTOM_HASH_SIZE,
+ /* key is a NUL-terminated string */
+ .hash_info.keysize = sizeof(char[NAMEDATALEN]),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByName),
+ .hash_flags = HASH_ELEM | HASH_STRINGS,
+ );
}
-/*
- * Allocate shmem space for dynamic shared hash and dynamic allocation counter.
- */
-void
-WaitEventCustomShmemInit(void)
+static void
+WaitEventCustomShmemInit(void *arg)
{
- bool found;
- HASHCTL info;
-
- WaitEventCustomCounter = (WaitEventCustomCounterData *)
- ShmemInitStruct("WaitEventCustomCounterData",
- sizeof(WaitEventCustomCounterData), &found);
-
- if (!found)
- {
- /* initialize the allocation counter and its spinlock. */
- WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
- SpinLockInit(&WaitEventCustomCounter->mutex);
- }
-
- /* initialize or attach the hash tables to store custom wait events */
- info.keysize = sizeof(uint32);
- info.entrysize = sizeof(WaitEventCustomEntryByInfo);
- WaitEventCustomHashByInfo =
- ShmemInitHash("WaitEventCustom hash by wait event information",
- WAIT_EVENT_CUSTOM_HASH_SIZE,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- /* key is a NULL-terminated string */
- info.keysize = sizeof(char[NAMEDATALEN]);
- info.entrysize = sizeof(WaitEventCustomEntryByName);
- WaitEventCustomHashByName =
- ShmemInitHash("WaitEventCustom hash by name",
- WAIT_EVENT_CUSTOM_HASH_SIZE,
- &info,
- HASH_ELEM | HASH_STRINGS);
+ /* initialize the allocation counter and its spinlock. */
+ WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
+ SpinLockInit(&WaitEventCustomCounter->mutex);
}
/*
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index c06b0e9b800..9981d6e212f 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -17,6 +17,7 @@
*/
#include "postgres.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
#ifdef USE_INJECTION_POINTS
@@ -109,6 +110,11 @@ typedef struct InjectionPointCacheEntry
static HTAB *InjectionPointCache = NULL;
+#ifdef USE_INJECTION_POINTS
+static void InjectionPointShmemRequest(void *arg);
+static void InjectionPointShmemInit(void *arg);
+#endif
+
/*
* injection_point_cache_add
*
@@ -226,46 +232,38 @@ injection_point_cache_get(const char *name)
}
#endif /* USE_INJECTION_POINTS */
-/*
- * Return the space for dynamic shared hash table.
- */
-Size
-InjectionPointShmemSize(void)
-{
+const ShmemCallbacks InjectionPointShmemCallbacks = {
#ifdef USE_INJECTION_POINTS
- Size sz = 0;
-
- sz = add_size(sz, sizeof(InjectionPointsCtl));
- return sz;
-#else
- return 0;
+ .request_fn = InjectionPointShmemRequest,
+ .init_fn = InjectionPointShmemInit,
#endif
-}
+};
/*
- * Allocate shmem space for dynamic shared hash.
+ * Reserve space for the dynamic shared hash table
*/
-void
-InjectionPointShmemInit(void)
-{
#ifdef USE_INJECTION_POINTS
- bool found;
+static void
+InjectionPointShmemRequest(void *arg)
+{
+ static ShmemStructDesc InjectionPointShmemDesc;
- ActiveInjectionPoints = ShmemInitStruct("InjectionPoint hash",
- sizeof(InjectionPointsCtl),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
- for (int i = 0; i < MAX_INJECTION_POINTS; i++)
- pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
- }
- else
- Assert(found);
-#endif
+ ShmemRequestStruct(&InjectionPointShmemDesc,
+ .name = "InjectionPoint hash",
+ .size = sizeof(InjectionPointsCtl),
+ .ptr = (void **) &ActiveInjectionPoints,
+ );
}
+static void
+InjectionPointShmemInit(void *arg)
+{
+ pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
+ for (int i = 0; i < MAX_INJECTION_POINTS; i++)
+ pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
+}
+#endif
+
/*
* Attach a new injection point.
*/
diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index da7503c57b6..3097e9bb1af 100644
--- a/src/include/access/nbtree.h
+++ b/src/include/access/nbtree.h
@@ -1300,8 +1300,6 @@ extern BTCycleId _bt_vacuum_cycleid(Relation rel);
extern BTCycleId _bt_start_vacuum(Relation rel);
extern void _bt_end_vacuum(Relation rel);
extern void _bt_end_vacuum_callback(int code, Datum arg);
-extern Size BTreeShmemSize(void);
-extern void BTreeShmemInit(void);
extern bytea *btoptions(Datum reloptions, bool validate);
extern bool btproperty(Oid index_oid, int attno,
IndexAMProperty prop, const char *propname,
diff --git a/src/include/access/syncscan.h b/src/include/access/syncscan.h
index 24cf33294e5..32f8332aaee 100644
--- a/src/include/access/syncscan.h
+++ b/src/include/access/syncscan.h
@@ -24,7 +24,5 @@ extern PGDLLIMPORT bool trace_syncscan;
extern void ss_report_location(Relation rel, BlockNumber location);
extern BlockNumber ss_get_location(Relation rel, BlockNumber relnblocks);
-extern void SyncScanShmemInit(void);
-extern Size SyncScanShmemSize(void);
#endif
diff --git a/src/include/access/twophase.h b/src/include/access/twophase.h
index 761d56a5f3d..1d2ff42c9b7 100644
--- a/src/include/access/twophase.h
+++ b/src/include/access/twophase.h
@@ -33,9 +33,6 @@ typedef struct GlobalTransactionData *GlobalTransaction;
/* GUC variable */
extern PGDLLIMPORT int max_prepared_xacts;
-extern Size TwoPhaseShmemSize(void);
-extern void TwoPhaseShmemInit(void);
-
extern void AtAbort_Twophase(void);
extern void PostPrepare_Twophase(void);
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 4af38e74ce4..437b4f32349 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -259,8 +259,6 @@ extern void InitLocalDataChecksumState(void);
extern void SetLocalDataChecksumState(uint32 data_checksum_version);
extern bool GetDefaultCharSignedness(void);
extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
-extern Size XLOGShmemSize(void);
-extern void XLOGShmemInit(void);
extern void BootStrapXLOG(uint32 data_checksum_version);
extern void InitializeWalConsistencyChecking(void);
extern void LocalProcessControlFile(bool reset);
diff --git a/src/include/access/xlogprefetcher.h b/src/include/access/xlogprefetcher.h
index 7ec40c4b78b..56a81676d92 100644
--- a/src/include/access/xlogprefetcher.h
+++ b/src/include/access/xlogprefetcher.h
@@ -34,9 +34,6 @@ typedef struct XLogPrefetcher XLogPrefetcher;
extern void XLogPrefetchReconfigure(void);
-extern size_t XLogPrefetchShmemSize(void);
-extern void XLogPrefetchShmemInit(void);
-
extern void XLogPrefetchResetStats(void);
extern XLogPrefetcher *XLogPrefetcherAllocate(XLogReaderState *reader);
diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h
index 2842106b285..ba7750dca0b 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -153,9 +153,6 @@ extern PGDLLIMPORT bool reachedConsistency;
/* Are we currently in standby mode? */
extern PGDLLIMPORT bool StandbyMode;
-extern Size XLogRecoveryShmemSize(void);
-extern void XLogRecoveryShmemInit(void);
-
extern void InitWalRecovery(ControlFileData *ControlFile,
bool *wasShutdown_ptr, bool *haveBackupLabel_ptr,
bool *haveTblspcMap_ptr);
diff --git a/src/include/access/xlogwait.h b/src/include/access/xlogwait.h
index d12531d32b8..07157f220ea 100644
--- a/src/include/access/xlogwait.h
+++ b/src/include/access/xlogwait.h
@@ -100,8 +100,6 @@ typedef struct WaitLSNState
extern PGDLLIMPORT WaitLSNState *waitLSNState;
-extern Size WaitLSNShmemSize(void);
-extern void WaitLSNShmemInit(void);
extern XLogRecPtr GetCurrentLSNForWaitType(WaitLSNType lsnType);
extern void WaitLSNWakeup(WaitLSNType lsnType, XLogRecPtr currentLSN);
extern void WaitLSNCleanup(void);
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 8e3549c3752..2786a7c5ffb 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -541,10 +541,6 @@ typedef struct PgStat_BackendPending
* Functions in pgstat.c
*/
-/* functions called from postmaster */
-extern Size StatsShmemSize(void);
-extern void StatsShmemInit(void);
-
/* Functions called during server startup / shutdown */
extern void pgstat_restore_stats(void);
extern void pgstat_discard_stats(void);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index b21d111d4d5..8954f6b28ee 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,8 +66,4 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
-/* shared memory stuff */
-extern Size AutoVacuumShmemSize(void);
-extern void AutoVacuumShmemInit(void);
-
#endif /* AUTOVACUUM_H */
diff --git a/src/include/postmaster/bgworker_internals.h b/src/include/postmaster/bgworker_internals.h
index b789caf4034..b6261bc01df 100644
--- a/src/include/postmaster/bgworker_internals.h
+++ b/src/include/postmaster/bgworker_internals.h
@@ -41,8 +41,6 @@ typedef struct RegisteredBgWorker
extern PGDLLIMPORT dlist_head BackgroundWorkerList;
-extern Size BackgroundWorkerShmemSize(void);
-extern void BackgroundWorkerShmemInit(void);
extern void BackgroundWorkerStateChange(bool allow_new_workers);
extern void ForgetBackgroundWorker(RegisteredBgWorker *rw);
extern void ReportBackgroundWorkerPID(RegisteredBgWorker *rw);
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 47470cba893..36eea0b1ab0 100644
--- a/src/include/postmaster/bgwriter.h
+++ b/src/include/postmaster/bgwriter.h
@@ -39,9 +39,6 @@ extern bool ForwardSyncRequest(const FileTag *ftag, SyncRequestType type);
extern void AbsorbSyncRequests(void);
-extern Size CheckpointerShmemSize(void);
-extern void CheckpointerShmemInit(void);
-
extern bool FirstCallSinceLastCheckpoint(void);
#endif /* _BGWRITER_H */
diff --git a/src/include/postmaster/datachecksum_state.h b/src/include/postmaster/datachecksum_state.h
index 343494edcc8..05625539604 100644
--- a/src/include/postmaster/datachecksum_state.h
+++ b/src/include/postmaster/datachecksum_state.h
@@ -17,10 +17,6 @@
#include "storage/procsignal.h"
-/* Shared memory */
-extern Size DataChecksumsShmemSize(void);
-extern void DataChecksumsShmemInit(void);
-
/* Possible operations the Datachecksumsworker can perform */
typedef enum DataChecksumsWorkerOperation
{
diff --git a/src/include/postmaster/pgarch.h b/src/include/postmaster/pgarch.h
index faa7609cd81..9772bb573a1 100644
--- a/src/include/postmaster/pgarch.h
+++ b/src/include/postmaster/pgarch.h
@@ -26,8 +26,6 @@
#define MAX_XFN_CHARS 40
#define VALID_XFN_CHARS "0123456789ABCDEF.history.backup.partial"
-extern Size PgArchShmemSize(void);
-extern void PgArchShmemInit(void);
extern bool PgArchCanRestart(void);
pg_noreturn extern void PgArchiverMain(const void *startup_data, size_t startup_data_len);
extern void PgArchWakeup(void);
diff --git a/src/include/postmaster/walsummarizer.h b/src/include/postmaster/walsummarizer.h
index a4c055066b4..b9a755fadbc 100644
--- a/src/include/postmaster/walsummarizer.h
+++ b/src/include/postmaster/walsummarizer.h
@@ -19,8 +19,6 @@
extern PGDLLIMPORT bool summarize_wal;
extern PGDLLIMPORT int wal_summary_keep_time;
-extern Size WalSummarizerShmemSize(void);
-extern void WalSummarizerShmemInit(void);
pg_noreturn extern void WalSummarizerMain(const void *startup_data, size_t startup_data_len);
extern void GetWalSummarizerState(TimeLineID *summarized_tli,
diff --git a/src/include/replication/logicalctl.h b/src/include/replication/logicalctl.h
index 495554c532c..0bc1302f130 100644
--- a/src/include/replication/logicalctl.h
+++ b/src/include/replication/logicalctl.h
@@ -14,8 +14,6 @@
#ifndef LOGICALCTL_H
#define LOGICALCTL_H
-extern Size LogicalDecodingCtlShmemSize(void);
-extern void LogicalDecodingCtlShmemInit(void);
extern void StartupLogicalDecodingStatus(bool last_status);
extern void InitializeProcessXLogLogicalInfo(void);
extern bool ProcessBarrierUpdateXLogLogicalInfo(void);
diff --git a/src/include/replication/logicallauncher.h b/src/include/replication/logicallauncher.h
index 504b710536a..5f0c1b9c682 100644
--- a/src/include/replication/logicallauncher.h
+++ b/src/include/replication/logicallauncher.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT int max_parallel_apply_workers_per_subscription;
extern void ApplyLauncherRegister(void);
extern void ApplyLauncherMain(Datum main_arg);
-extern Size ApplyLauncherShmemSize(void);
-extern void ApplyLauncherShmemInit(void);
-
extern void ApplyLauncherForgetWorkerStartTime(Oid subid);
extern void ApplyLauncherWakeupAtCommit(void);
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index eb46b41b4b7..a69faf6eaaf 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -84,8 +84,4 @@ extern void replorigin_redo(XLogReaderState *record);
extern void replorigin_desc(StringInfo buf, XLogReaderState *record);
extern const char *replorigin_identify(uint8 info);
-/* shared memory allocation */
-extern Size ReplicationOriginShmemSize(void);
-extern void ReplicationOriginShmemInit(void);
-
#endif /* PG_ORIGIN_H */
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 4b4709f6e2c..1a3557de607 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -327,10 +327,6 @@ extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
extern PGDLLIMPORT int idle_replication_slot_timeout_secs;
-/* shmem initialization functions */
-extern Size ReplicationSlotsShmemSize(void);
-extern void ReplicationSlotsShmemInit(void);
-
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
diff --git a/src/include/replication/slotsync.h b/src/include/replication/slotsync.h
index e546d0d050d..d2121cd3ed7 100644
--- a/src/include/replication/slotsync.h
+++ b/src/include/replication/slotsync.h
@@ -31,8 +31,6 @@ pg_noreturn extern void ReplSlotSyncWorkerMain(const void *startup_data, size_t
extern void ShutDownSlotSync(void);
extern bool SlotSyncWorkerCanRestart(void);
extern bool IsSyncingReplicationSlots(void);
-extern Size SlotSyncShmemSize(void);
-extern void SlotSyncShmemInit(void);
extern void SyncReplicationSlots(WalReceiverConn *wrconn);
#endif /* SLOTSYNC_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 85d24c87298..47c07574d4d 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -491,8 +491,6 @@ pg_noreturn extern void WalReceiverMain(const void *startup_data, size_t startup
extern void WalRcvRequestApplyReply(void);
/* prototypes for functions in walreceiverfuncs.c */
-extern Size WalRcvShmemSize(void);
-extern void WalRcvShmemInit(void);
extern void ShutdownWalRcv(void);
extern bool WalRcvStreaming(void);
extern bool WalRcvRunning(void);
diff --git a/src/include/replication/walsender.h b/src/include/replication/walsender.h
index a4df3b8e0ae..8952c848d19 100644
--- a/src/include/replication/walsender.h
+++ b/src/include/replication/walsender.h
@@ -41,8 +41,6 @@ extern void WalSndErrorCleanup(void);
extern void PhysicalWakeupLogicalWalSnd(void);
extern XLogRecPtr GetStandbyFlushRecPtr(TimeLineID *tli);
extern void WalSndSignals(void);
-extern Size WalSndShmemSize(void);
-extern void WalSndShmemInit(void);
extern void WalSndWakeup(bool physical, bool logical);
extern void WalSndInitStopping(void);
extern void WalSndWaitStopping(void);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index fa68e6ecece..ee3cb1dc203 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -375,8 +375,6 @@ typedef enum
/*
* function prototypes
*/
-extern void LockManagerShmemInit(void);
-extern Size LockManagerShmemSize(void);
extern void InitLockManagerAccess(void);
extern LockMethod GetLocksMethodTable(const LOCK *lock);
extern LockMethod GetLockTagsMethodTable(const LOCKTAG *locktag);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index d8e11756a61..5e092552c72 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -32,6 +32,9 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogPrefetchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogRecoveryShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
@@ -40,12 +43,18 @@ PG_SHMEM_SUBSYSTEM(BufferManagerShmemCallbacks)
PG_SHMEM_SUBSYSTEM(StrategyCtlShmemCallbacks)
PG_SHMEM_SUBSYSTEM(BufTableShmemCallbacks)
+/* lock manager */
+PG_SHMEM_SUBSYSTEM(LockManagerShmemCallbacks)
+
/* predicate lock manager */
PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackendStatusShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(TwoPhaseShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackgroundWorkerShmemCallbacks)
/* shared-inval messaging */
PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
@@ -53,9 +62,27 @@ PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
/* interprocess signaling mechanisms */
PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CheckpointerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(AutoVacuumShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationSlotsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationOriginShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSndShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalRcvShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSummarizerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(PgArchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ApplyLauncherShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SlotSyncShmemCallbacks)
/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(BTreeShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SyncScanShmemCallbacks)
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StatsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WaitLSNShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(LogicalDecodingCtlShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(DataChecksumsShmemCallbacks)
/* AIO subsystem. This delegates to the method-specific callbacks */
PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index ddd06304e97..a334e096e4a 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -298,14 +298,6 @@ extern PGDLLIMPORT int pgstat_track_activity_query_size;
extern PGDLLIMPORT PgBackendStatus *MyBEEntry;
-/* ----------
- * Functions called from postmaster
- * ----------
- */
-extern Size BackendStatusShmemSize(void);
-extern void BackendStatusShmemInit(void);
-
-
/* ----------
* Functions called from backends
* ----------
diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index 27a2526524f..fabd1455c3c 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -46,9 +46,6 @@ typedef void (*InjectionPointCallback) (const char *name,
const void *private_data,
void *arg);
-extern Size InjectionPointShmemSize(void);
-extern void InjectionPointShmemInit(void);
-
extern void InjectionPointAttach(const char *name,
const char *library,
const char *function,
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 34c27cc3dc3..86ee348220d 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -42,8 +42,6 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
extern uint32 WaitEventExtensionNew(const char *wait_event_name);
extern uint32 WaitEventInjectionPointNew(const char *wait_event_name);
-extern void WaitEventCustomShmemInit(void);
-extern Size WaitEventCustomShmemSize(void);
extern char **GetWaitEventCustomNames(uint32 classId, int *nwaitevents);
/* ----------
diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index d59c5ad0582..592ecfad3da 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -107,9 +107,13 @@ extern PGDLLEXPORT void injection_wait(const char *name,
/* track if injection points attached in this process are linked to it */
static bool injection_point_local = false;
-/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void injection_shmem_request(void *arg);
+static void injection_shmem_init(void *arg);
+
+static const ShmemCallbacks injection_shmem_callbacks = {
+ .request_fn = injection_shmem_request,
+ .init_fn = injection_shmem_init,
+};
/*
* Routine for shared memory area initialization, used as a callback
@@ -126,44 +130,26 @@ injection_point_init_state(void *ptr, void *arg)
ConditionVariableInit(&state->wait_point);
}
-/* Shared memory initialization when loading module */
static void
-injection_shmem_request(void)
+injection_shmem_request(void *arg)
{
- Size size;
-
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
+ static ShmemStructDesc InjectionPointsShmemDesc;
- size = MAXALIGN(sizeof(InjectionPointSharedState));
- RequestAddinShmemSpace(size);
+ ShmemRequestStruct(&InjectionPointsShmemDesc,
+ .name = "injection_points",
+ .size = sizeof(InjectionPointSharedState),
+ .ptr = (void **) &inj_state,
+ );
}
static void
-injection_shmem_startup(void)
+injection_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_state = ShmemInitStruct("injection_points",
- sizeof(InjectionPointSharedState),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. This is shared with the dynamic
- * initialization using a DSM.
- */
- injection_point_init_state(inj_state, NULL);
- }
-
- LWLockRelease(AddinShmemInitLock);
+ /*
+ * First time through, so initialize. This is shared with the dynamic
+ * initialization using a DSM.
+ */
+ injection_point_init_state(inj_state, NULL);
}
/*
@@ -601,9 +587,5 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- /* Shared memory initialization */
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = injection_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = injection_shmem_startup;
+ RegisterShmemCallbacks(&injection_shmem_callbacks);
}
diff --git a/src/test/modules/test_aio/test_aio.c b/src/test/modules/test_aio/test_aio.c
index 34487a05486..8fa51a7dd02 100644
--- a/src/test/modules/test_aio/test_aio.c
+++ b/src/test/modules/test_aio/test_aio.c
@@ -28,7 +28,6 @@
#include "storage/bufmgr.h"
#include "storage/checksum.h"
#include "storage/condition_variable.h"
-#include "storage/ipc.h"
#include "storage/lwlock.h"
#include "storage/proc.h"
#include "storage/procnumber.h"
@@ -44,6 +43,7 @@
PG_MODULE_MAGIC;
+/* In shared memory */
typedef struct InjIoErrorState
{
ConditionVariable cv;
@@ -74,78 +74,73 @@ typedef struct BlocksReadStreamData
static InjIoErrorState *inj_io_error_state;
/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void test_aio_shmem_request(void *arg);
+static void test_aio_shmem_init(void *arg);
+static void test_aio_shmem_attach(void *arg);
+
+static const ShmemCallbacks inj_io_shmem_callbacks = {
+ .request_fn = test_aio_shmem_request,
+ .init_fn = test_aio_shmem_init,
+ .attach_fn = test_aio_shmem_attach,
+};
static PgAioHandle *last_handle;
static void
-test_aio_shmem_request(void)
+test_aio_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
+ static ShmemStructDesc inj_io_shmem_desc;
- RequestAddinShmemSpace(sizeof(InjIoErrorState));
+ ShmemRequestStruct(&inj_io_shmem_desc,
+ .name = "test_aio injection points",
+ .size = sizeof(InjIoErrorState),
+ .ptr = (void **) &inj_io_error_state,
+ );
}
static void
-test_aio_shmem_startup(void)
+test_aio_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_io_error_state = ShmemInitStruct("injection_points",
- sizeof(InjIoErrorState),
- &found);
-
- if (!found)
- {
- /* First time through, initialize */
- inj_io_error_state->enabled_short_read = false;
- inj_io_error_state->enabled_reopen = false;
- inj_io_error_state->enabled_completion_wait = false;
+ /* First time through, initialize */
+ inj_io_error_state->enabled_short_read = false;
+ inj_io_error_state->enabled_reopen = false;
+ inj_io_error_state->enabled_completion_wait = false;
- ConditionVariableInit(&inj_io_error_state->cv);
- inj_io_error_state->completion_wait_event = WaitEventInjectionPointNew("completion_wait");
+ ConditionVariableInit(&inj_io_error_state->cv);
+ inj_io_error_state->completion_wait_event = WaitEventInjectionPointNew("completion_wait");
#ifdef USE_INJECTION_POINTS
- InjectionPointAttach("aio-process-completion-before-shared",
- "test_aio",
- "inj_io_completion_hook",
- NULL,
- 0);
- InjectionPointLoad("aio-process-completion-before-shared");
-
- InjectionPointAttach("aio-worker-after-reopen",
- "test_aio",
- "inj_io_reopen",
- NULL,
- 0);
- InjectionPointLoad("aio-worker-after-reopen");
+ InjectionPointAttach("aio-process-completion-before-shared",
+ "test_aio",
+ "inj_io_completion_hook",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-process-completion-before-shared");
+
+ InjectionPointAttach("aio-worker-after-reopen",
+ "test_aio",
+ "inj_io_reopen",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-worker-after-reopen");
#endif
- }
- else
- {
- /*
- * Pre-load the injection points now, so we can call them in a
- * critical section.
- */
+}
+
+static void
+test_aio_shmem_attach(void *arg)
+{
+ /*
+ * Pre-load the injection points now, so we can call them in a critical
+ * section.
+ */
#ifdef USE_INJECTION_POINTS
- InjectionPointLoad("aio-process-completion-before-shared");
- InjectionPointLoad("aio-worker-after-reopen");
- elog(LOG, "injection point loaded");
+ InjectionPointLoad("aio-process-completion-before-shared");
+ InjectionPointLoad("aio-worker-after-reopen");
+ elog(LOG, "injection point loaded");
#endif
- }
-
- LWLockRelease(AddinShmemInitLock);
}
void
@@ -154,10 +149,7 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_aio_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_aio_shmem_startup;
+ RegisterShmemCallbacks(&inj_io_shmem_callbacks);
}
--
2.47.3
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-04 12:00 Matthias van de Meent <[email protected]>
parent: Heikki Linnakangas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Matthias van de Meent @ 2026-04-04 12:00 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sat, 4 Apr 2026 at 02:45, Heikki Linnakangas <[email protected]> wrote:
>
> On 03/04/2026 16:12, Ashutosh Bapat wrote:
> > On Fri, Apr 3, 2026 at 3:40 AM Matthias van de Meent
> > <[email protected]> wrote:
> >> While I do think it's an improvement over the current APIs, the
> >> improvement seems to be mostly concentrated in the RequestStruct/Hash
> >> department, with only marginal improvements in RegisterShmemCallbacks.
> >> I feel like it's missing the important part: I'd like
> >> direct-from-_PG_init() ShmemRequestStruct/Hash calls. If
> >> ShmemRequestStruct/Hash had a size callback as alternative to the size
> >> field (which would then be called after preload_libraries finishes)
> >> then that would be sufficient for most shmem allocations, and it'd
> >> simplify shmem management for most subsystems.
> >> We'd still need the shmem lifecycle hooks/RegisterShmemCallbacks to
> >> allow conditionally allocated shmem areas (e.g. those used in aio),
> >> but I think that, in general, we shouldn't need a separate callback
> >> function just to get started registering shmem structures.
> >>
> >> I also noticed that ShmemCallbacks.%_arg are generally undocumented,
> >> and I couldn't find any users in core (at the end of the patchset)
> >> that actually use the argument. Could it be I missed something?
>
> None of the current code currently uses it, that's correct. I felt it
> might become very handy in the future or in extensions, if you wanted to
> reuse the same function for initializing different shmem areas, for
> example.
That's cool, but if that common initialization path is common enough
to need special coding, then how come this patch doesn't make PG use it?
I can think of many systems that "just" initialize a hash table or
"just" allocate a shmem area.
> It's a pretty common pattern to have an opaque pointer like
> that in any callbacks.
I agree that it's a rather common pattern, but from an OOP
perspective, shouldn't the argument be the ShmemCallbacks*? Users can
embed the struct to extend the data carried if they need to.
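A minimal sketch of the embedding idea, assuming a callback signature that
receives the callback table itself. All names here (DemoCallbacks,
DemoCounterCallbacks, demo_counter_init, demo_run_init) are hypothetical and
are not part of the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback table whose functions receive the table itself. */
typedef struct DemoCallbacks DemoCallbacks;
struct DemoCallbacks
{
	void		(*init_fn) (const DemoCallbacks *self);
};

/* A subsystem embeds the table and adds the data it wants carried along. */
typedef struct DemoCounterCallbacks
{
	DemoCallbacks base;
	int		   *counter;
	int			initial_value;
} DemoCounterCallbacks;

/* One init function can serve several areas; it recovers its own data. */
static void
demo_counter_init(const DemoCallbacks *self)
{
	const DemoCounterCallbacks *ext = (const DemoCounterCallbacks *)
		((const char *) self - offsetof(DemoCounterCallbacks, base));

	*ext->counter = ext->initial_value;
}

/* Stand-in for the shmem machinery invoking a registered callback. */
static void
demo_run_init(const DemoCallbacks *callbacks)
{
	assert(callbacks != NULL && callbacks->init_fn != NULL);
	callbacks->init_fn(callbacks);
}
```

With this shape no separate opaque argument is needed: two
DemoCounterCallbacks instances can share demo_counter_init and still
initialize different counters.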
> >> I don't understand the use of ShmemStructDesc. They generally/always
> >> are private to request_fn(), and their fields are used exclusively
> >> inside the shmem mechanisms, with no reads of its fields that can't
> >> already be deduced from context. Why do we need that struct
> >> everywhere?
> >
> > My resizable shared memory structure patches use it as a handle to the
> > structure to be resized.
>
> Right. And hash tables and SLRUs use a desc-like object already, so for
> symmetry it feels natural to have it for plain structs too.
> I wonder if we should make it optional though, for the common case that
> you have no intention of doing anything more with the shmem region that
> you'd need a desc for. I'm thinking you could just pass NULL for the
> desc pointer:
>
> ShmemRequestStruct(NULL,
> .name = "pg_stat_statements",
> .size = sizeof(pgssSharedState),
> .ptr = (void **) &pgss,
> );
That would help, though I'd still wonder why we'd have separate Opts
and Desc structs. IIUC, they generally carry (exactly) the same data.
Maybe moving it into a `.handle` or `.desc` field in Shmem*Opts could
make that part of the code a bit cleaner; as it'd further clarify that
it's very much an optional field.
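Roughly what I have in mind, as a standalone sketch rather than the patch's
actual definitions. DemoOpts, DemoDesc, demo_request_struct and the `.desc`
field are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical handle, only needed by callers that act on the area later. */
typedef struct DemoDesc
{
	const char *name;
	size_t		size;
} DemoDesc;

/* Hypothetical options struct carrying the handle as an optional field. */
typedef struct DemoOpts
{
	const char *name;
	size_t		size;
	void	  **ptr;
	DemoDesc   *desc;			/* NULL when no handle is wanted */
} DemoOpts;

/* Returns true if a descriptor was supplied and filled in. */
static bool
demo_request_struct_internal(const DemoOpts *opts)
{
	assert(opts != NULL && opts->name != NULL);
	if (opts->desc == NULL)
		return false;
	opts->desc->name = opts->name;
	opts->desc->size = opts->size;
	return true;
}

/* Designated-initializer front end, in the style of the calls in the patch. */
#define demo_request_struct(...) \
	demo_request_struct_internal(&(DemoOpts){__VA_ARGS__})
```

Callers that don't need a handle simply leave `.desc` out, and the
designated initializer zeroes it.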
I'll check out your latest version in a bit.
Kind regards,
Matthias van de Meent
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-04 13:51 Matthias van de Meent <[email protected]>
parent: Heikki Linnakangas <[email protected]>
2 siblings, 1 reply; 75+ messages in thread
From: Matthias van de Meent @ 2026-04-04 13:51 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sat, 4 Apr 2026 at 02:49, Heikki Linnakangas <[email protected]> wrote:
> Those are now committed, and here's a new version rebased over those
> changes. The hash options is now called 'nelems', and the 'extra_size'
> in ShmemStructOpts is gone.
>
> Plus a bunch of other fixes and cleanups. I also reordered and
> re-grouped the patches a little, into more logical increments I hope.
0001: LGTM
0002:
> +++ b/src/backend/storage/ipc/shmem.c
> + * Nowadays, there is also a third way to allocate shared memory called
There's no clear indicator of the second way to allocate shared
memory, nor is the first one clearly defined in the new version of the
comment block.
> + * item is deleted. However, if one hash table grows very large and then
> + * shrinks, its space cannot be redistributed to other tables. We could build
> + * a simple hash bucket garbage collector if need be. Right now, it seems
> + * unnecessary.
I think this new text is outdated, given that we don't have growing
hash tables anymore.
I also think it should've referred to elements, not buckets;
dynahash's buckets cannot readily be deallocated as they're generally
always "in use" (they might be NULL, but they're still accessed in
read operations on missing keys). Elements are put in the freelist if
not used, and those could be released into a memory pool if so desired
(and coded).
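To illustrate the distinction with a toy chained hash table (this is not
dynahash code; DemoTable, DemoElem and the demo_* functions are made up for
the example): the bucket array is read on every lookup, including lookups of
missing keys, while deleted elements land on a freelist and are the only
thing that could sensibly be handed back.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NBUCKETS 4

typedef struct DemoElem
{
	struct DemoElem *next;
	unsigned	key;
} DemoElem;

typedef struct DemoTable
{
	DemoElem   *buckets[DEMO_NBUCKETS];	/* read on every lookup, even if NULL */
	DemoElem   *freelist;				/* unused elements, could be released */
} DemoTable;

static DemoElem *
demo_lookup(const DemoTable *t, unsigned key)
{
	for (DemoElem *e = t->buckets[key % DEMO_NBUCKETS]; e != NULL; e = e->next)
		if (e->key == key)
			return e;
	return NULL;
}

/* Take an element from the freelist and link it into its bucket chain. */
static int
demo_insert(DemoTable *t, unsigned key)
{
	DemoElem   *e = t->freelist;

	if (e == NULL)
		return 0;				/* no free elements left */
	t->freelist = e->next;
	e->key = key;
	e->next = t->buckets[key % DEMO_NBUCKETS];
	t->buckets[key % DEMO_NBUCKETS] = e;
	return 1;
}

/* Unlink the element and push it back onto the freelist. */
static int
demo_delete(DemoTable *t, unsigned key)
{
	for (DemoElem **link = &t->buckets[key % DEMO_NBUCKETS];
		 *link != NULL; link = &(*link)->next)
	{
		if ((*link)->key == key)
		{
			DemoElem   *e = *link;

			*link = e->next;
			e->next = t->freelist;
			t->freelist = e;
			return 1;
		}
	}
	return 0;
}
```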
> + * In builtin PostgreSQL code, add the callbacks to the list in
> + * src/include/storage/subsystemlist.h.
This refers to an automation system that's introduced a few commits
later, in commit 0005, and therefore probably should be added only in
that commit.
> + * Legacy ShmemInitStruct()/ShmemInitHash() functions
> + * --------------------------------------------------
Should we have checks in place to avoid calls to new APIs from old
callbacks, and vice versa?
> ShmemRequestInternal(...
> + ShmemRequest *request;
[...]
> + foreach_ptr(ShmemStructDesc, existing, pending_shmem_requests)
[...]
> + request = palloc(sizeof(ShmemRequest));
[...]
> + pending_shmem_requests = lappend(pending_shmem_requests, request);
It looks like you missed my earlier comment about type confusion.
Here, pending_shmem_requests is a List of ShmemRequest pointers, but
the foreach_ptr() iterates over it as ShmemStructDesc. The loop checks
the 'char *name' field of ShmemStructDesc, which in a ShmemRequest is
the 'ShmemStructDesc *desc'. This bug would cause issues if different
ShmemStructDescs are registered by the same name, as the
ShmemStructDescs wouldn't (necessarily) be strcmp()-equal for the same
name.
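For illustration only, here is a self-contained sketch of the duplicate-name
check done through the right type. The struct layouts are simplified
stand-ins rather than the definitions from the patch, a plain array replaces
the List, and request_name_is_pending() is a hypothetical helper name:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the patch's structs (assumed layout). */
typedef struct ShmemStructDesc
{
	const char *name;
	size_t		size;
} ShmemStructDesc;

typedef struct ShmemRequest
{
	ShmemStructDesc *desc;		/* first field is a pointer, not a name */
} ShmemRequest;

/*
 * Hypothetical helper: return true if a request with this name is already
 * pending.  Each element is a ShmemRequest, so the name has to be reached
 * through request->desc rather than read from the element directly.
 */
static bool
request_name_is_pending(ShmemRequest *const *pending, size_t npending,
						const char *name)
{
	for (size_t i = 0; i < npending; i++)
	{
		if (strcmp(pending[i]->desc->name, name) == 0)
			return true;
	}
	return false;
}
```

With the element type spelled correctly, two distinct descs that carry the
same name compare equal on the name string itself, which is what the
duplicate check is after.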
> ShmemAttachRequested(void)
> + /* Call attach callbacks */
> + foreach(lc, registered_shmem_callbacks)
> + {
> + const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
This would be more concise with foreach_ptr(const ShmemCallbacks,
callbacks, registered_shmem_callbacks), like in ShmemInitRequested.
> +++ b/src/include/storage/shmem.h
> +/*
> + * Shared memory is reserved and allocated in stages at postmaster startup,
> + * and in EXEC_BACKEND mode, there's some extra work done to "attach" to them
> + * at backend startup. ShmemCallbacks holds callback functions that are
> + * called at different stages.
> + */
> +typedef struct ShmemCallbacks
Maybe this should also have the opportunity for a (before_)shmem_exit callback?
> + * on-demand in a backend. If a subsystem sets this flag, the callbacks are
> + * called immediately after registration, to initialize or attach to the
> + * requested shared memory areas.
Ideally we only immediately call the callbacks if we're under
postmaster, or in a standalone backend; we shouldn't allocate shmem
for some preloaded libraries that set this flag, at least not ahead of
loading all preload libraries.
0003: Maybe this could also test that the protections we're putting in
place against double-registration of shmem areas actually detect the
duplication issue?
Otherwise, LGTM
0004-0014: TBD
While it's mostly mechanical changes, it did make me notice the rather
annoying allocation patterns by XLOGShmemRequest. It allocates various
types of data in one go (which, in principle, is fine) but in doing so
it adds its own alignment tricks etc, and I'm not super stoked about
that. If time allows, could we clean that up?
Kind regards,
Matthias van de Meent
Databricks (https://www.databricks.com)
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-04 16:32 Ashutosh Bapat <[email protected]>
parent: Heikki Linnakangas <[email protected]>
2 siblings, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-04 16:32 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sat, Apr 4, 2026 at 6:19 AM Heikki Linnakangas <[email protected]> wrote:
>
> On 02/04/2026 09:58, Ashutosh Bapat wrote:
> > On Wed, Apr 1, 2026 at 11:47 PM Heikki Linnakangas <[email protected]> wrote:
> >>> + /*
> >>> + * Extra space to reserve in the shared memory segment, but it's not part
> >>> + * of the struct itself. This is used for shared memory hash tables that
> >>> + * can grow beyond the initial size when more buckets are allocated.
> >>> + */
> >>> + size_t extra_size;
> >>>
> >>> When we introduce resizable structures (where even the hash table
> >>> directly itself could be resizable), we will introduce a new field
> >>> max_size which is easy to get confused with extra_size. Maybe we can
> >>> rename extra_size to something like "auxiliary_size" to mean size of
> >>> the auxiliary parts of the structure which are not part of the main
> >>> struct itself.
> >>>
> >>> + /*
> >>> + * max_size is the estimated maximum number of hashtable entries. This is
> >>> + * not a hard limit, but the access efficiency will degrade if it is
> >>> + * exceeded substantially (since it's used to compute directory size and
> >>> + * the hash table buckets will get overfull).
> >>> + */
> >>> + size_t max_size;
> >>> +
> >>> + /*
> >>> + * init_size is the number of hashtable entries to preallocate. For a
> >>> + * table whose maximum size is certain, this should be equal to max_size;
> >>> + * that ensures that no run-time out-of-shared-memory failures can occur.
> >>> + */
> >>> + size_t init_size;
> >>>
> >>> Every time I look at these two fields, I question whether those are the
> >>> number of entries (i.e. size of the hash table) or number of bytes
> >>> (size of the memory). I know it's the former, but it indicates that
> >>> something needs to be changed here, like changing the names to have
> >>> _entries instead of _size, or changing the type to int64 or some such.
> >>> Renaming to _entries would conflict with dynahash APIs since they use
> >>> _size, so maybe the latter?
> >>
> >> I hear you, but I didn't change these yet. If we go with the patches
> >> from the "Shared hash table allocations" thread, max_size and init_size
> >> will be merged into one. I'll try to settle that thread before making
> >> changes here.
> >
> > Will review those patches next.
>
> Those are now committed, and here's a new version rebased over those
> changes. The hash option is now called 'nelems', and the 'extra_size'
> in ShmemStructOpts is gone.
>
Thanks. Adjusted my resizable shared memory patch on top of this. The
result looks better.
> Plus a bunch of other fixes and cleanups. I also reordered and
> re-grouped the patches a little, into more logical increments I hope.
Some more comments
test_shmem declares MODULE_big and OBJS, which seems old-fashioned;
newer modules seem to be using MODULES. It should also use
NO_INSTALLCHECK.
/*
* Alignment of the starting address. If not set, defaults to cacheline
* boundary. Must be a power of two.
*/
size_t alignment;
We don't seem to enforce the "must be a power of two" rule anywhere.
We should at least validate it.
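A minimal sketch of such a check, written as standalone C so it does not
depend on the patch. shmem_alignment_is_valid() is a hypothetical name, and
treating zero as "not set, use the default" is my assumption based on the
comment above:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true if 'alignment' is usable as an alignment.
 * Zero is accepted on the assumption that it means "use the default".
 */
static bool
shmem_alignment_is_valid(size_t alignment)
{
	if (alignment == 0)
		return true;

	/* A power of two has exactly one bit set. */
	return (alignment & (alignment - 1)) == 0;
}
```

The request function could then raise an error, or at least Assert, when
this returns false.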
I like the way the buffer manager changes untangle its sub-subsystems,
viz. StrategyControl and the buffer lookup table. That simplifies the
code a lot.
I also eyeballed some of the changes in 0014. If time permits, I will
review those closely soon. But the changes look ok.
Before this change, replication_states_ctl in origin.c was not
initialized explicitly when max_active_replication_origins = 0. With
this change, the structure is not registered and thus the global static
pointer is not initialized. However, given that this is implicit, I
suggest adding Asserts as attached.
--
Best Wishes,
Ashutosh Bapat
Attachments:
[application/octet-stream] 0014_edits.diff.nocibot (696B, 2-0014_edits.diff.nocibot)
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-04 17:32 Heikki Linnakangas <[email protected]>
parent: Matthias van de Meent <[email protected]>
0 siblings, 2 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-04 17:32 UTC (permalink / raw)
To: Matthias van de Meent <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 04/04/2026 15:00, Matthias van de Meent wrote:
> On Sat, 4 Apr 2026 at 02:45, Heikki Linnakangas <[email protected]> wrote:
>>>> I don't understand the use of ShmemStructDesc. They generally/always
>>>> are private to request_fn(), and their fields are used exclusively
>>>> inside the shmem mechanisms, with no reads of its fields that can't
>>>> already be deduced from context. Why do we need that struct
>>>> everywhere?
>>>
>>> My resizable shared memory structure patches use it as a handle to the
>>> structure to be resized.
>>
>> Right. And hash tables and SLRUs use a desc-like object already, so for
>> symmetry it feels natural to have it for plain structs too.
>> I wonder if we should make it optional though, for the common case that
>> you have no intention of doing anything more with the shmem region that
>> you'd need a desc for. I'm thinking you could just pass NULL for the
>> desc pointer:
>>
>> ShmemRequestStruct(NULL,
>> .name = "pg_stat_statements",
>> .size = sizeof(pgssSharedState),
>> .ptr = (void **) &pgss,
>> );
>
> That would help, though I'd still wonder why we'd have separate Opts
> and Desc structs. IIUC, they generally carry (exactly) the same data.
>
> Maybe moving it into a `.handle` or `.desc` field in Shmem*Opts could
> make that part of the code a bit cleaner; as it'd further clarify that
> it's very much an optional field.
Yeah. OTOH, I'd like to separate the options from what's effectively a
return value. But maybe you're right and it's nevertheless better that way.
Some options on this:
a) What's in the patch now
static ShmemStructDesc pgssSharedStateDesc;
ShmemRequestStruct(&pgssSharedStateDesc,
.name = "pg_stat_statements",
.size = sizeof(pgssSharedState),
.ptr = (void **) &pgss);
b) Allow passing NULL for the desc
ShmemRequestStruct(NULL,
.name = "pg_stat_statements",
.size = sizeof(pgssSharedState),
.ptr = (void **) &pgss);
c) Return the Desc as a return value
static ShmemStructDesc *pgssSharedStateDesc;
pgssSharedStateDesc =
ShmemRequestStruct(.name = "pg_stat_statements",
.size = sizeof(pgssSharedState),
.ptr = (void **) &pgss);
In option c) you can just throw away the result if you don't need it. I
kind of like this as a notational thing. However it has some downsides:
This changes the return value to be a pointer. I'm thinking that
ShmemRequestStruct() palloc's the descriptor struct in TopMemoryContext.
This is a little ugly because the descriptor struct is leaked if the
caller throws it away. It's not a lot of memory, but still.
I'm also not sure how well this fits in with the SLRU code. On 'master',
you already have SlruCtlData which is like the "desc" struct. Would we
turn that into a pointer too, adding one indirection to all the SLRU
calls? It's probably fine from a performance point of view, but it feels
like it's going in the wrong direction.
d) Make it part of Opts, as you suggested
static ShmemStructDesc pgssSharedStateDesc;
ShmemRequestStruct(.name = "pg_stat_statements",
.size = sizeof(pgssSharedState),
.ptr = (void **) &pgss,
.desc = &pgssSharedStateDesc);
In the attached new version, though, I stepped back and decided to
remove the whole ShmemStructDesc after all. I still think having a
handle like that is a good idea, and the follow-up patches for resizing
need it. However, with option d) it can easily be added later. With
option d), it seems silly to have it be part of the patch now, when the
desc struct doesn't really do anything. SLRUs still have a similar
SlruDesc struct, however. For SLRUs it's essentially the same as the old
SlruCtlData struct before these patches.
The Desc structs were being used for one thing though: I used the 'size'
from the Desc struct in ProcGlobalShmemInit() to get the allocated size
of each shmem area. The size computation there is complicated enough
that I'd rather not repeat it, and avoiding the repeated size
calculation was the raison d'être for these patches. I replaced it with
global variables to hold the sizes from the ShmemRequest() step to
ShmemInit(). But that would be one case where having the desc would
already be useful. Then again, I'm not sure we want to expose the 'size'
in the descriptor like that anyway, because as soon as we make shmem
regions resizable, we might not be able to keep the size in the
descriptor up-to-date. The size of these structs won't change, but we
might not want to expose the information because it would be confusing
for other structs where it can change to show outdated information.
On a related note, when we add back the ".desc" concept later, is
".desc" a good name, or ".handle" as you also suggested? More widely, do
we call the concept and the struct a "handle" or "descriptor" or what?
Or if we follow the precedent of the existing SlruCtlData struct, it
could be ".ctl". I'm not a fan of the "Ctl" naming though, because we
already have a lot of structs with "Ctl" in the name and it's not always
clear whether a "Ctl" struct refers to the shared memory parts or the
handle to it. Now that the "desc" structs are not part of these patches
anymore, however, we can punt on that decision.
On 02/04/2026 09:58, Ashutosh Bapat wrote:
>>
>> I renamed it to AttachOrInitShmemIndexEntry, and the args to 'may_init'
>> and 'may_attach'. But more importantly I added comments to explain the
>> different usages. Hope that helps..
>
> The explanation in the prologue looks good. But the function is still
> confusing. Instead of an if ... else if ... chain, I feel organizing this
> as below would make it more readable. (this was part of one of my
> earlier edit patches).
> if (found)
> ...
> else
> {
> if (!may_init)
> error
> if (!index_entry)
> error
>
> ... rest of the code to initialize and attach
> }
>
> But other than that I don't have any other brilliant ideas.
I did another refactoring in this area: I split
AttachOrInitShmemIndexEntry() into separate AttachShmemIndexEntry() and
InitShmemIndexEntry() functions again. There's a little bit of repetition
that way, but IMO it makes it much clearer overall.
Other changes in this patch version:
- I moved some of the stuff from shmem.h to a new shmem_internal.h
header. The idea is that what remains in shmem.h provides the public API
for allocating shared memory.
- I refactored the "after-startup request" code. It now detects the case
that some of the shmem areas, but not all, have already been initialized
and throws an error.
Still processing the rest of the feedback from the past days. This patch
version is also available at
https://github.com/hlinnaka/postgres/tree/shmem-init-refactor-11.
- Heikki
Attachments:
[text/x-patch] v11-0001-refactor-Move-ShmemInitHash-to-separate-file.patch (11.0K, 2-v11-0001-refactor-Move-ShmemInitHash-to-separate-file.patch)
From 2b55f009f260565d285f149bd79fa71736d0bef4 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 13:07:28 +0200
Subject: [PATCH v11 01/14] refactor: Move ShmemInitHash to separate file
In preparation for next commits
---
src/backend/storage/ipc/Makefile | 1 +
src/backend/storage/ipc/meson.build | 1 +
src/backend/storage/ipc/shmem.c | 108 ----------------------
src/backend/storage/ipc/shmem_hash.c | 130 +++++++++++++++++++++++++++
src/include/storage/shmem.h | 9 +-
5 files changed, 139 insertions(+), 110 deletions(-)
create mode 100644 src/backend/storage/ipc/shmem_hash.c
diff --git a/src/backend/storage/ipc/Makefile b/src/backend/storage/ipc/Makefile
index 9a07f6e1d92..f71653bbe48 100644
--- a/src/backend/storage/ipc/Makefile
+++ b/src/backend/storage/ipc/Makefile
@@ -22,6 +22,7 @@ OBJS = \
shm_mq.o \
shm_toc.o \
shmem.o \
+ shmem_hash.o \
signalfuncs.o \
sinval.o \
sinvaladt.o \
diff --git a/src/backend/storage/ipc/meson.build b/src/backend/storage/ipc/meson.build
index 9c1ca954d9d..b8c31e29967 100644
--- a/src/backend/storage/ipc/meson.build
+++ b/src/backend/storage/ipc/meson.build
@@ -14,6 +14,7 @@ backend_sources += files(
'shm_mq.c',
'shm_toc.c',
'shmem.c',
+ 'shmem_hash.c',
'signalfuncs.c',
'sinval.c',
'sinvaladt.c',
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 3cb51ad62f8..c994f7674ec 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -96,9 +96,6 @@ typedef struct ShmemAllocatorData
#define ShmemIndexLock (&ShmemAllocator->index_lock)
-static HTAB *shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
-static void *ShmemHashAlloc(Size size, void *alloc_arg);
static void *ShmemAllocRaw(Size size, Size *allocated_size);
/* shared memory global variables */
@@ -257,29 +254,6 @@ ShmemAllocNoError(Size size)
return ShmemAllocRaw(size, &allocated_size);
}
-/*
- * ShmemHashAlloc -- alloc callback for shared memory hash tables
- *
- * Carve out the allocation from a pre-allocated region. All shared memory
- * hash tables are initialized with HASH_FIXED_SIZE, so all the allocations
- * happen upfront during initialization and no locking is required.
- */
-static void *
-ShmemHashAlloc(Size size, void *alloc_arg)
-{
- shmem_hash_allocator *allocator = (shmem_hash_allocator *) alloc_arg;
- void *result;
-
- size = MAXALIGN(size);
-
- if (allocator->end - allocator->next < size)
- return NULL;
- result = allocator->next;
- allocator->next += size;
-
- return result;
-}
-
/*
* ShmemAllocRaw -- allocate align chunk and return allocated size
*
@@ -341,88 +315,6 @@ ShmemAddrIsValid(const void *addr)
return (addr >= ShmemBase) && (addr < ShmemEnd);
}
-/*
- * ShmemInitHash -- Create and initialize, or attach to, a
- * shared memory hash table.
- *
- * We assume caller is doing some kind of synchronization
- * so that two processes don't try to create/initialize the same
- * table at once. (In practice, all creations are done in the postmaster
- * process; child processes should always be attaching to existing tables.)
- *
- * nelems is the maximum number of hashtable entries.
- *
- * *infoP and hash_flags must specify at least the entry sizes and key
- * comparison semantics (see hash_create()). Flag bits and values specific
- * to shared-memory hash tables are added here, except that callers may
- * choose to specify HASH_PARTITION.
- *
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
- */
-HTAB *
-ShmemInitHash(const char *name, /* table string name for shmem index */
- int64 nelems, /* size of the table */
- HASHCTL *infoP, /* info about key and bucket size */
- int hash_flags) /* info about infoP */
-{
- bool found;
- size_t size;
- void *location;
-
- size = hash_estimate_size(nelems, infoP->entrysize);
-
- /* look it up in the shmem index or allocate */
- location = ShmemInitStruct(name, size, &found);
-
- return shmem_hash_create(location, size, found,
- name, nelems, infoP, hash_flags);
-}
-
-/*
- * Initialize or attach to a shared hash table in the given shmem region.
- *
- * This is extracted from ShmemInitHash() to allow InitShmemAllocator() to
- * share the logic for bootstrapping the ShmemIndex hash table.
- */
-static HTAB *
-shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
-{
- shmem_hash_allocator allocator;
-
- /*
- * Hash tables allocated in shared memory have a fixed directory and have
- * all elements allocated upfront. We don't support growing because we'd
- * need to grow the underlying shmem region with it.
- *
- * The shared memory allocator must be specified too.
- */
- infoP->alloc = ShmemHashAlloc;
- infoP->alloc_arg = NULL;
- hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_FIXED_SIZE;
-
- /*
- * if it already exists, attach to it rather than allocate and initialize
- * new space
- */
- if (!found)
- {
- allocator.next = (char *) location;
- allocator.end = (char *) location + size;
- infoP->alloc_arg = &allocator;
- }
- else
- {
- /* Pass location of hashtable header to hash_create */
- infoP->hctl = (HASHHDR *) location;
- hash_flags |= HASH_ATTACH;
- }
-
- return hash_create(name, nelems, infoP, hash_flags);
-}
-
/*
* ShmemInitStruct -- Create/attach to a structure in shared memory.
*
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
new file mode 100644
index 00000000000..0b05730129e
--- /dev/null
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -0,0 +1,130 @@
+/*-------------------------------------------------------------------------
+ *
+ * shmem_hash.c
+ * hash table implementation in shared memory
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * A shared memory hash table implementation on top of the named, fixed-size
+ * shared memory areas managed by shmem.c. Hash tables have a fixed maximum
+ * size, but their actual size can vary dynamically. When entries are added
+ * to the table, more space is allocated. Each shared data structure and hash
+ * has a string name to identify it.
+ *
+ * IDENTIFICATION
+ * src/backend/storage/ipc/shmem_hash.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "storage/shmem.h"
+
+static void *ShmemHashAlloc(Size size, void *alloc_arg);
+
+/*
+ * ShmemInitHash -- Create and initialize, or attach to, a
+ * shared memory hash table.
+ *
+ * We assume caller is doing some kind of synchronization
+ * so that two processes don't try to create/initialize the same
+ * table at once. (In practice, all creations are done in the postmaster
+ * process; child processes should always be attaching to existing tables.)
+ *
+ * nelems is the maximum number of hashtable entries.
+ *
+ * *infoP and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values specific
+ * to shared-memory hash tables are added here, except that callers may
+ * choose to specify HASH_PARTITION.
+ *
+ * Note: before Postgres 9.0, this function returned NULL for some failure
+ * cases. Now, it always throws error instead, so callers need not check
+ * for NULL.
+ */
+HTAB *
+ShmemInitHash(const char *name, /* table string name for shmem index */
+ int64 nelems, /* size of the table */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ bool found;
+ size_t size;
+ void *location;
+
+ size = hash_estimate_size(nelems, infoP->entrysize);
+
+ /* look it up in the shmem index or allocate */
+ location = ShmemInitStruct(name, size, &found);
+
+ return shmem_hash_create(location, size, found,
+ name, nelems, infoP, hash_flags);
+}
+
+/*
+ * Initialize or attach to a shared hash table in the given shmem region.
+ *
+ * This is extracted from ShmemInitHash() to allow InitShmemAllocator() to
+ * share the logic for bootstrapping the ShmemIndex hash table.
+ */
+HTAB *
+shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
+{
+ shmem_hash_allocator allocator;
+
+ /*
+ * Hash tables allocated in shared memory have a fixed directory and have
+ * all elements allocated upfront. We don't support growing because we'd
+ * need to grow the underlying shmem region with it.
+ *
+ * The shared memory allocator must be specified too.
+ */
+ infoP->alloc = ShmemHashAlloc;
+ infoP->alloc_arg = NULL;
+ hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_FIXED_SIZE;
+
+ /*
+ * if it already exists, attach to it rather than allocate and initialize
+ * new space
+ */
+ if (!found)
+ {
+ allocator.next = (char *) location;
+ allocator.end = (char *) location + size;
+ infoP->alloc_arg = &allocator;
+ }
+ else
+ {
+ /* Pass location of hashtable header to hash_create */
+ infoP->hctl = (HASHHDR *) location;
+ hash_flags |= HASH_ATTACH;
+ }
+
+ return hash_create(name, nelems, infoP, hash_flags);
+}
+
+/*
+ * ShmemHashAlloc -- alloc callback for shared memory hash tables
+ *
+ * Carve out the allocation from a pre-allocated region. All shared memory
+ * hash tables are initialized with HASH_FIXED_SIZE, so all the allocations
+ * happen upfront during initialization and no locking is required.
+ */
+static void *
+ShmemHashAlloc(Size size, void *alloc_arg)
+{
+ shmem_hash_allocator *allocator = (shmem_hash_allocator *) alloc_arg;
+ void *result;
+
+ size = MAXALIGN(size);
+
+ if (allocator->end - allocator->next < size)
+ return NULL;
+ result = allocator->next;
+ allocator->next += size;
+
+ return result;
+}
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index a2eb499d63c..82f5403c952 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -30,15 +30,20 @@ typedef struct PGShmemHeader PGShmemHeader; /* avoid including
extern void InitShmemAllocator(PGShmemHeader *seghdr);
extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
+extern void *ShmemHashAlloc(Size size, void *alloc_arg);
extern bool ShmemAddrIsValid(const void *addr);
-extern HTAB *ShmemInitHash(const char *name, int64 nelems,
- HASHCTL *infoP, int hash_flags);
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
+/* shmem_hash.c */
+extern HTAB *ShmemInitHash(const char *name, int64 nelems,
+ HASHCTL *infoP, int hash_flags);
+extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
+
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
--
2.47.3
[text/x-patch] v11-0002-Introduce-a-new-mechanism-for-registering-shared.patch (62.3K, 3-v11-0002-Introduce-a-new-mechanism-for-registering-shared.patch)
From 9672e1061be34df3fd190b25cc0a15b806af0046 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 20:01:39 +0300
Subject: [PATCH v11 02/14] Introduce a new mechanism for registering shared
memory areas
This merges the separate [Subsystem]ShmemSize() and
[Subsystem]ShmemInit() phases at postmaster startup. Each subsystem is
now called into just once, before the shared memory segment has been
allocated, to register or "request" the subsystem's shared memory
needs. This is more ergonomic, as you only need to calculate the size
once.
This replaces ShmemInitStruct() and ShmemInitHash(), which become just
backwards-compatibility wrappers around the new functions. In future
commits, I plan to replace all ShmemInitStruct() and ShmemInitHash()
calls with the new functions, although we'll still need to keep them
around for extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 4 +-
doc/src/sgml/xfunc.sgml | 162 +++--
src/backend/bootstrap/bootstrap.c | 2 +
src/backend/postmaster/launch_backend.c | 5 +
src/backend/postmaster/postmaster.c | 19 +-
src/backend/storage/ipc/ipci.c | 30 +-
src/backend/storage/ipc/shmem.c | 832 ++++++++++++++++++++----
src/backend/storage/ipc/shmem_hash.c | 86 ++-
src/backend/storage/lmgr/proc.c | 3 +
src/backend/tcop/postgres.c | 10 +-
src/include/storage/shmem.h | 183 +++++-
src/include/storage/shmem_internal.h | 52 ++
src/tools/pgindent/typedefs.list | 9 +-
13 files changed, 1190 insertions(+), 207 deletions(-)
create mode 100644 src/include/storage/shmem_internal.h
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9ee1a2bfc6a..2ebec6928d5 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
Anonymous allocations are allocations that have been made
with <literal>ShmemAlloc()</literal> directly, rather than via
- <literal>ShmemInitStruct()</literal> or
- <literal>ShmemInitHash()</literal>.
+ <literal>ShmemRequestStruct()</literal> or
+ <literal>ShmemRequestHash()</literal>.
</para>
<para>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 70e815b8a2c..aed3f2f0071 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3628,71 +3628,132 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
Add-ins can reserve shared memory on server startup. To do so, the
add-in's shared library must be preloaded by specifying it in
<xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
- The shared library should also register a
- <literal>shmem_request_hook</literal> in its
- <function>_PG_init</function> function. This
- <literal>shmem_request_hook</literal> can reserve shared memory by
- calling:
+ The shared library should register callbacks in
+ its <function>_PG_init</function> function, which then get called at the
+ right stages of the system startup to initialize the shared memory.
+ Here is an example:
<programlisting>
-void RequestAddinShmemSpace(Size size)
-</programlisting>
- Each backend should obtain a pointer to the reserved shared memory by
- calling:
-<programlisting>
-void *ShmemInitStruct(const char *name, Size size, bool *foundPtr)
-</programlisting>
- If this function sets <literal>foundPtr</literal> to
- <literal>false</literal>, the caller should proceed to initialize the
- contents of the reserved shared memory. If <literal>foundPtr</literal>
- is set to <literal>true</literal>, the shared memory was already
- initialized by another backend, and the caller need not initialize
- further.
- </para>
+typedef struct MyShmemData {
+ LWLock lock; /* protects the fields below */
- <para>
- To avoid race conditions, each backend should use the LWLock
- <function>AddinShmemInitLock</function> when initializing its allocation
- of shared memory, as shown here:
-<programlisting>
-static mystruct *ptr = NULL;
-bool found;
+ ... shared memory contents ...
+} MyShmemData;
+
+static MyShmemData *MyShmem; /* pointer to the struct in shared memory */
+
+static void my_shmem_request(void *arg);
+static void my_shmem_init(void *arg);
+
+const ShmemCallbacks my_shmem_callbacks = {
+ .request_fn = my_shmem_request,
+ .init_fn = my_shmem_init,
+};
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+ /*
+ * In order to create our shared memory area, we have to be loaded via
+ * shared_preload_libraries.
+ */
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ /* Register our shared memory needs */
+ RegisterShmemCallbacks(&my_shmem_callbacks);
+}
+
+/* callback to request */
+static void
+my_shmem_request(void *arg)
+{
+ /* A persistent handle to the shared memory area in this backend */
+ static ShmemStructDesc MyShmemDesc;
+
+ ShmemRequestStruct(&MyShmemDesc,
+ .name = "My shmem area",
+ .size = sizeof(MyShmemData),
+ .ptr = (void **) &MyShmem,
+ );
+}
-LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-ptr = ShmemInitStruct("my struct name", size, &found);
-if (!found)
+/* callback to initialize the contents of the MyShmem area at startup */
+static void
+my_shmem_init(void *arg)
{
- ... initialize contents of shared memory ...
- ptr->locks = GetNamedLWLockTranche("my tranche name");
+ int tranche_id;
+
+ /* Initialize the lock */
+ tranche_id = LWLockNewTrancheId("my tranche name");
+ LWLockInitialize(&MyShmem->lock, tranche_id);
+
+ ... initialize the rest of MyShmem fields ...
}
-LWLockRelease(AddinShmemInitLock);
+
</programlisting>
- <literal>shmem_startup_hook</literal> provides a convenient place for the
- initialization code, but it is not strictly required that all such code
- be placed in this hook. On Windows (and anywhere else where
- <literal>EXEC_BACKEND</literal> is defined), each backend executes the
- registered <literal>shmem_startup_hook</literal> shortly after it
- attaches to shared memory, so add-ins should still acquire
- <function>AddinShmemInitLock</function> within this hook, as shown in the
- example above. On other platforms, only the postmaster process executes
- the <literal>shmem_startup_hook</literal>, and each backend automatically
- inherits the pointers to shared memory.
+ The <function>request_fn</function> callback is called during system
+ startup, before the shared memory has been allocated. It should call
+ <function>ShmemRequestStruct()</function> to register the add-in's
+ shared memory needs. Note that <function>ShmemRequestStruct()</function>
+ doesn't immediately allocate or initialize the memory, it merely
+ registers the space to be allocated later in the startup sequence. When
+ the memory is allocated, it is initialized to zero. For any more
+ complex initialization, set the <function>init_fn()</function> callback,
+ which will be called after the memory has been allocated and initialized
+ to zero, but before any other processes are running, and thus no locking
+ is required.
</para>
-
<para>
- An example of a <literal>shmem_request_hook</literal> and
- <literal>shmem_startup_hook</literal> can be found in
+ On Windows, the <function>attach_fn</function> callback, if any, is
+ additionally called at every backend startup. It can be used to
+ initialize additional per-backend state related to the shared memory
+ area that is inherited via <function>fork()</function> on other systems.
+ </para>
+ <para>
+ An example of allocating shared memory can be found in
<filename>contrib/pg_stat_statements/pg_stat_statements.c</filename> in
the <productname>PostgreSQL</productname> source tree.
</para>
</sect3>
<sect3 id="xfunc-shared-addin-after-startup">
- <title>Requesting Shared Memory After Startup</title>
+ <title>Requesting Shared Memory After Startup with <function>ShmemRequestStruct</function></title>
+
+ <para>
+ <function>ShmemRequestStruct()</function> can also be called after
+ system startup, which is useful to allow small allocations in add-in
+ libraries that are not specified in
+ <xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
+ However, after startup the allocation can fail if there is not enough
+ shared memory available. The system reserves some memory for allocations
+ after startup, but that reservation is small.
+ </para>
+ <para>
+ By default, <function>RegisterShmemCallbacks()</function> fails with an
+ error if called after system startup. To use it after startup, you must
+ set the <literal>SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP</literal> flag in
+ the argument <structname>ShmemCallbacks</structname> struct to
+ acknowledge the risk.
+ </para>
+ <para>
+ When <function>RegisterShmemCallbacks()</function> is called after
+ startup, it will immediately call the appropriate callbacks, depending
+ on whether the requested memory areas were already initialized by
+ another backend. The callbacks will be called while holding an internal
+ lock, which prevents two backends from initializing the memory area
+ concurrently.
+ </para>
+ </sect3>
+
+ <sect3 id="xfunc-shared-addin-dynamic">
+ <title>Allocating Dynamic Shared Memory After Startup</title>
<para>
There is another, more flexible method of reserving shared memory that
- can be done after server startup and outside a
- <literal>shmem_request_hook</literal>. To do so, each backend that will
+ can be done after server startup. To do so, each backend that will
use the shared memory should obtain a pointer to it by calling:
<programlisting>
void *GetNamedDSMSegment(const char *name, size_t size,
@@ -3711,10 +3772,7 @@ void *GetNamedDSMSegment(const char *name, size_t size,
</para>
<para>
- Unlike shared memory reserved at server startup, there is no need to
- acquire <function>AddinShmemInitLock</function> or otherwise take action
- to avoid race conditions when reserving shared memory with
- <function>GetNamedDSMSegment</function>. This function ensures that only
+ <function>GetNamedDSMSegment</function> ensures that only
one backend allocates and initializes the segment and that all other
backends receive a pointer to the fully allocated and initialized
segment.
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index c52c0a6023d..c707ccfa563 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -39,6 +39,7 @@
#include "storage/fd.h"
#include "storage/ipc.h"
#include "storage/proc.h"
+#include "storage/shmem_internal.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/guc.h"
@@ -373,6 +374,7 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
InitializeFastPathLocks();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 434e0643022..0973010b7dc 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -49,7 +49,9 @@
#include "replication/walreceiver.h"
#include "storage/dsm.h"
#include "storage/io_worker.h"
+#include "storage/ipc.h"
#include "storage/pg_shmem.h"
+#include "storage/shmem_internal.h"
#include "tcop/backend_startup.h"
#include "utils/memutils.h"
@@ -672,7 +674,10 @@ SubPostmasterMain(int argc, char *argv[])
/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ ShmemCallRequestCallbacks();
+ }
/*
* Run the appropriate Main function
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index eb4f3eb72d4..693475014fe 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -115,6 +115,7 @@
#include "storage/ipc.h"
#include "storage/pmsignal.h"
#include "storage/proc.h"
+#include "storage/shmem_internal.h"
#include "tcop/backend_startup.h"
#include "tcop/tcopprot.h"
#include "utils/datetime.h"
@@ -951,7 +952,14 @@ PostmasterMain(int argc, char *argv[])
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Ask all subsystems, including preloaded libraries, to register their
+ * shared memory needs.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might've been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
@@ -3232,7 +3240,14 @@ PostmasterStateMachine(void)
/* re-read control file into local memory */
LocalProcessControlFile(true);
- /* re-create shared memory and semaphores */
+ /*
+ * Re-initialize shared memory and semaphores. Note: We don't call
+ * RegisterBuiltinShmemCallbacks(), we keep the old registrations. In
+ * order to re-register structs in extensions, we'd need to reload
+ * shared preload libraries, and we don't want to do that.
+ */
+ ResetShmemAllocator();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
UpdatePMState(PM_STARTUP);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 7aab5da3386..24422a80ab3 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -50,6 +50,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/shmem_internal.h"
#include "storage/sinvaladt.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
@@ -100,8 +101,9 @@ CalculateShmemSize(void)
* during the actual allocation phase.
*/
size = 100000;
- size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
- sizeof(ShmemIndexEnt)));
+ size = add_size(size, ShmemGetRequestedSize());
+
+ /* legacy subsystems */
size = add_size(size, dsm_estimate_size());
size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
@@ -176,6 +178,13 @@ AttachSharedMemoryStructs(void)
*/
InitializeFastPathLocks();
+ /*
+ * Attach to LWLocks first. They are needed by most other subsystems.
+ */
+ LWLockShmemInit();
+
+ /* Establish pointers to all shared memory areas in this backend */
+ ShmemAttachRequested();
CreateOrAttachShmemStructs();
/*
@@ -220,7 +229,17 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
- /* Initialize subsystems */
+ /*
+ * Initialize LWLocks first, in case any of the shmem init functions use
+ * LWLocks. (Nothing else can be running during startup, so they don't
+ * need to do any locking yet, but we nevertheless allow it.)
+ */
+ LWLockShmemInit();
+
+ /* Initialize all shmem areas */
+ ShmemInitRequested();
+
+ /* Initialize legacy subsystems */
CreateOrAttachShmemStructs();
/* Initialize dynamic shared memory facilities. */
@@ -251,11 +270,6 @@ CreateSharedMemoryAndSemaphores(void)
static void
CreateOrAttachShmemStructs(void)
{
- /*
- * Set up LWLocks. They are needed by most other subsystems.
- */
- LWLockShmemInit();
-
dsm_shmem_init();
DSMRegistryShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index c994f7674ec..29ff6065dda 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,43 +19,115 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * NOTES:
- * (a) There are three kinds of shared memory data structures
- * available to POSTGRES: fixed-size structures, queues and hash
- * tables. Fixed-size structures contain things like global variables
- * for a module and should never be allocated after the shared memory
- * initialization phase. Hash tables have a fixed maximum size and
- * cannot grow beyond that. Queues link data structures
- * that have been allocated either within fixed-size structures or as hash
- * buckets. Each shared data structure has a string name to identify
- * it (assigned in the module that declares it).
- *
- * (b) During initialization, each module looks for its
- * shared data structures in a hash table called the "Shmem Index".
- * If the data structure is not present, the caller can allocate
- * a new one and initialize it. If the data structure is present,
- * the caller "attaches" to the structure by initializing a pointer
- * in the local address space.
- * The shmem index has two purposes: first, it gives us
- * a simple model of how the world looks when a backend process
- * initializes. If something is present in the shmem index,
- * it is initialized. If it is not, it is uninitialized. Second,
- * the shmem index allows us to allocate shared memory on demand
- * instead of trying to preallocate structures and hard-wire the
- * sizes and locations in header files. If you are using a lot
- * of shared memory in a lot of different places (and changing
- * things during development), this is important.
- *
- * (c) In standard Unix-ish environments, individual backends do not
- * need to re-establish their local pointers into shared memory, because
- * they inherit correct values of those variables via fork() from the
- * postmaster. However, this does not work in the EXEC_BACKEND case.
- * In ports using EXEC_BACKEND, new backends have to set up their local
- * pointers using the method described in (b) above.
- *
- * (d) memory allocation model: shared memory can never be
- * freed, once allocated. Each hash table has its own free list,
- * so hash buckets can be reused when an item is deleted.
+ * This module provides facilities to allocate fixed-size structures in shared
+ * memory, for things like variables shared between all backend processes.
+ * Each such structure has a string name to identify it, specified when it is
+ * requested. shmem_hash.c provides a shared hash table implementation on top
+ * of that.
+ *
+ * Shared memory areas should usually not be allocated after postmaster
+ * startup, although we do allow small allocations later for the benefit of
+ * extension modules that are loaded after startup. Despite that allowance,
+ * extensions that need shared memory should be added in
+ * shared_preload_libraries, because the allowance is quite small and there is
+ * no guarantee that any memory is available after startup.
+ *
+ * Nowadays, there is also a third way to allocate shared memory called
+ * Dynamic Shared Memory. See dsm.c for that facility. One big difference
+ * between traditional shared memory handled by shmem.c and dynamic shared
+ * memory is that traditional shared memory areas are mapped to the same
+ * address in all processes, so you can use normal pointers in shared memory
+ * structs. With Dynamic Shared Memory, you must use offsets or DSA pointers
+ * instead.
+ *
+ * Shared memory managed by shmem.c can never be freed, once allocated. Each
+ * hash table has its own free list, so hash buckets can be reused when an
+ * item is deleted. However, if one hash table grows very large and then
+ * shrinks, its space cannot be redistributed to other tables. We could build
+ * a simple hash bucket garbage collector if need be. Right now, it seems
+ * unnecessary.
+ *
+ * Usage
+ * -----
+ *
+ * To allocate shared memory, you need to register a set of callback functions
+ * which handle the lifecycle of the allocation. In the request_fn callback,
+ * fill in a ShmemRequestStructOpts struct with the name, size, and any other
+ * options, and call ShmemRequestStruct(). Leave any unused fields as zeros.
+ *
+ * typedef struct MyShmemData {
+ * ...
+ * } MyShmemData;
+ *
+ * static MyShmemData *MyShmem;
+ *
+ * static void my_shmem_request(void *arg);
+ * static void my_shmem_init(void *arg);
+ *
+ * const ShmemCallbacks MyShmemCallbacks = {
+ * .request_fn = my_shmem_request,
+ * .init_fn = my_shmem_init,
+ * };
+ *
+ * static void
+ * my_shmem_request(void *arg)
+ * {
+ * static ShmemStructDesc MyShmemDesc;
+ *
+ * ShmemRequestStruct(&MyShmemDesc, &(ShmemRequestStructOpts) {
+ * .name = "My shmem area",
+ * .size = sizeof(MyShmemData),
+ * .ptr = (void **) &MyShmem,
+ * });
+ * }
+ *
+ * In builtin PostgreSQL code, add the callbacks to the list in
+ * src/include/storage/subsystemlist.h. In an add-in module, you can register
+ * the callbacks by calling RegisterShmemCallbacks(&MyShmemCallbacks) in the
+ * extension's _PG_init() function.
+ *
+ * Lifecycle
+ * ---------
+ *
+ * Initializing shared memory happens in multiple phases. In the first phase,
+ * during postmaster startup, all the request_fn callbacks are called. Only
+ * after all the request_fn callbacks have been called and all the shmem areas
+ * have been requested by the ShmemRequestStruct() calls do we know how much
+ * shared memory we need in total. After that, postmaster allocates the global
+ * shared memory segment, and calls all the init_fn callbacks to initialize
+ * all the requested shmem areas.
+ *
+ * In standard Unix-ish environments, individual backends do not need to
+ * re-establish their local pointers into shared memory, because they inherit
+ * correct values of those variables via fork() from the postmaster. However,
+ * this does not work in the EXEC_BACKEND case. In ports using EXEC_BACKEND,
+ * backend startup also calls the shmem_request callbacks to re-establish the
+ * knowledge about each shared memory area, sets the pointer variables
+ * (*ShmemStructDesc->ptr), and calls the attach_fn callback, if any, for
+ * additional per-backend setup.
+ *
+ * Legacy ShmemInitStruct()/ShmemInitHash() functions
+ * --------------------------------------------------
+ *
+ * ShmemInitStruct()/ShmemInitHash() is another way of registering shmem
+ * areas. It pre-dates the ShmemRequestStruct()/ShmemRequestHash() functions,
+ * and should not be used in new code, but as of this writing it is still
+ * widely used in extensions.
+ *
+ * To allocate a shmem area with ShmemInitStruct(), you need to separately
+ * register the size needed for the area by calling RequestAddinShmemSpace()
+ * from the extension's shmem_request_hook, and allocate the area by calling
+ * ShmemInitStruct() from the extension's shmem_startup_hook. There are no
+ * init/attach callbacks. Instead, the caller of ShmemInitStruct() must check
+ * the return status of ShmemInitStruct() and initialize the struct if it was
+ * not previously initialized.
+ *
+ * Calling ShmemAlloc() directly
+ * -----------------------------
+ *
+ * There's a more low-level way of allocating shared memory too: you can call
+ * ShmemAlloc() directly. It's used to implement the higher level mechanisms,
+ * and should generally not be called directly.
*/
#include "postgres.h"
@@ -70,10 +142,80 @@
#include "storage/lwlock.h"
#include "storage/pg_shmem.h"
#include "storage/shmem.h"
+#include "storage/shmem_internal.h"
#include "storage/spin.h"
#include "utils/builtins.h"
#include "utils/tuplestore.h"
+/*
+ * Registered callbacks.
+ *
+ * During postmaster startup, we accumulate the callbacks from all subsystems
+ * in this list.
+ *
+ * This is in process private memory, although on Unix-like systems, we expect
+ * all the registrations to happen at postmaster startup time and be inherited
+ * by all the child processes via fork().
+ */
+static List *registered_shmem_callbacks;
+
+/*
+ * In the shmem request phase, all the shmem areas requested with the
+ * ShmemRequest*() functions are accumulated here.
+ */
+typedef struct
+{
+ ShmemStructOpts *options;
+ ShmemRequestKind kind;
+} ShmemRequest;
+
+static List *pending_shmem_requests;
+
+/*
+ * Per-process state machine, for sanity checking that we do things in the
+ * right order.
+ *
+ * Postmaster:
+ * INITIAL -> REQUESTING -> INITIALIZING -> DONE
+ *
+ * Backends in EXEC_BACKEND mode:
+ * INITIAL -> REQUESTING -> ATTACHING -> DONE
+ *
+ * Late request:
+ * DONE -> REQUESTING -> AFTER_STARTUP_ATTACH_OR_INIT -> DONE
+ */
+enum shmem_request_state
+{
+ /* Initial state */
+ SRS_INITIAL,
+
+ /*
+ * When we start calling the shmem_request callbacks, we enter the
+ * SRS_REQUESTING phase. All ShmemRequestStruct calls happen in this
+ * state.
+ */
+ SRS_REQUESTING,
+
+ /*
+ * Postmaster has finished all shmem requests, and is now initializing the
+ * shared memory segment. init_fn callbacks are called in this state.
+ */
+ SRS_INITIALIZING,
+
+ /*
+ * A postmaster child process is starting up. attach_fn callbacks are
+ * called in this state.
+ */
+ SRS_ATTACHING,
+
+ /* An after-startup allocation or attachment is in progress. */
+ SRS_AFTER_STARTUP_ATTACH_OR_INIT,
+
+ /* Normal state after shmem initialization / attachment */
+ SRS_DONE,
+};
+static enum shmem_request_state shmem_request_state = SRS_INITIAL;
+
/*
* This is the first data structure stored in the shared memory segment, at
* the offset that PGShmemHeader->content_offset points to. Allocations by
@@ -105,35 +247,379 @@ static void *ShmemBase; /* start address of shared memory */
static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
-static HTAB *ShmemIndex = NULL; /* primary index hashtable for shmem */
+
+/*
+ * ShmemIndex is a global directory of shmem areas, itself also stored in the
+ * shared memory.
+ */
+static HTAB *ShmemIndex;
+
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE (48)
+
+/*
+ * # of additional entries to reserve in the shmem index table, for
+ * allocations after postmaster startup. (This is not a hard limit, the hash
+ * table can grow larger than that if there is shared memory available)
+ */
+#define SHMEM_INDEX_ADDITIONAL_SIZE (128)
+
+/* this is a hash bucket in the shmem index table */
+typedef struct
+{
+ char key[SHMEM_INDEX_KEYSIZE]; /* string name */
+ void *location; /* location in shared mem */
+ Size size; /* # bytes requested for the structure */
+ Size allocated_size; /* # bytes actually allocated */
+} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
+static void CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks);
+static void InitShmemIndexEntry(ShmemRequest *request);
+static bool AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok);
+
Datum pg_numa_available(PG_FUNCTION_ARGS);
/*
- * A very simple allocator used to carve out different parts of a hash table
- * from a previously allocated contiguous shared memory area.
+ * ShmemRequestStruct() --- request a named shared memory area
+ *
+ * Subsystems call this to register their shared memory needs. This is
+ * usually done early in postmaster startup, before the shared memory segment
+ * has been created, so that the size can be included in the estimate for
+ * total amount of shared memory needed. We set aside a small amount of
+ * memory for allocations that happen later, for the benefit of non-preloaded
+ * extensions, but that should not be relied upon.
+ *
+ * This does not yet allocate the memory, but merely registers the need for it.
+ * The actual allocation happens later in the postmaster startup sequence.
+ *
+ * This must be called from a shmem_request callback function, registered with
+ * RegisterShmemCallbacks(). This enforces a coding pattern that works the
+ * same in normal Unix systems and with EXEC_BACKEND. On Unix systems, the
+ * shmem_request callbacks are called once, early in postmaster startup, and
+ * the child processes inherit the struct descriptors and any other
+ * per-process state from the postmaster. In EXEC_BACKEND mode, shmem_request
+ * callbacks are *also* called in each backend, at backend startup, to
+ * re-establish the struct descriptors. By calling the same function in both
+ * cases, we ensure that all the shmem areas are registered the same way in
+ * all processes.
+ *
+ * 'desc' is a backend-private handle for the shared memory area.
+ *
+ * 'options' defines the name and size of the area, and any other optional
+ * features. Leave unused options as zeros. The options are copied to
+ * longer-lived memory, so it doesn't need to live after the
+ * ShmemRequestStruct() call and can point to a local variable in the calling
+ * function. The 'name' must point to a long-lived string though, only the
+ * pointer to it is copied.
+ */
+void
+ShmemRequestStructWithOpts(const ShmemStructOpts *options)
+{
+ ShmemStructOpts *options_copy;
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemStructOpts));
+ memcpy(options_copy, options, sizeof(ShmemStructOpts));
+
+ ShmemRequestInternal(options_copy, SHMEM_KIND_STRUCT);
+}
+
+/*
+ * Internal workhorse of ShmemRequestStruct() and ShmemRequestHash().
+ *
+ * Note: 'desc' and 'options' must live until the init/attach callbacks have
+ * been called. Unlike in the public ShmemRequestStruct() and
+ * ShmemRequestHash() functions, 'options' is *not* copied. This allows
+ * ShmemRequestHash() to pass a pointer to the extended ShmemRequestHashOpts
+ * struct instead.
+ */
+void
+ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
+{
+ ShmemRequest *request;
+
+ if (options->name == NULL)
+ elog(ERROR, "shared memory request is missing 'name' option");
+
+ if (IsUnderPostmaster)
+ {
+ if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+ else
+ {
+ if (options->size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->size <= 0)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+
+ if (shmem_request_state != SRS_REQUESTING)
+ elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
+
+ /* Check that it's not already registered in this process */
+ foreach_ptr(ShmemRequest, existing, pending_shmem_requests)
+ {
+ if (strcmp(existing->options->name, options->name) == 0)
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" is already registered",
+ options->name)));
+ }
+
+ request = palloc(sizeof(ShmemRequest));
+ request->options = options;
+ request->kind = kind;
+ pending_shmem_requests = lappend(pending_shmem_requests, request);
+}
+
+/*
+ * ShmemGetRequestedSize() --- estimate the total size of all registered shared
+ * memory structures.
+ *
+ * This is called once at postmaster startup, before the shared memory segment
+ * has been created.
+ */
+size_t
+ShmemGetRequestedSize(void)
+{
+ size_t size;
+
+ /* memory needed for the ShmemIndex */
+ size = hash_estimate_size(list_length(pending_shmem_requests) + SHMEM_INDEX_ADDITIONAL_SIZE,
+ sizeof(ShmemIndexEnt));
+ size = CACHELINEALIGN(size);
+
+ /* memory needed for all the requested areas */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ size = add_size(size, request->options->size);
+ /* calculate alignment padding like ShmemAllocRaw() does */
+ size = CACHELINEALIGN(size);
+ }
+
+ return size;
+}
+
+/*
+ * ShmemInitRequested() --- allocate and initialize requested shared memory
+ * structures.
+ *
+ * This is called once at postmaster startup, after the shared memory segment
+ * has been created.
+ */
+void
+ShmemInitRequested(void)
+{
+ /* Should be called only by the postmaster or a standalone backend. */
+ Assert(!IsUnderPostmaster);
+ Assert(shmem_request_state == SRS_INITIALIZING);
+
+ /*
+ * Initialize the ShmemIndex entries and perform basic initialization of
+ * all the requested memory areas. There are no concurrent processes yet,
+ * so no need for locking.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ InitShmemIndexEntry(request);
+ }
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
+
+ /*
+ * Call the subsystem-specific init callbacks to finish initialization of
+ * all the areas.
+ */
+ foreach_ptr(const ShmemCallbacks, callbacks, registered_shmem_callbacks)
+ {
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->init_fn_arg);
+ }
+
+ shmem_request_state = SRS_DONE;
+}
+
+/*
+ * Re-establish process private state related to shmem areas.
+ *
+ * This is called at backend startup in EXEC_BACKEND mode, in every backend.
+ */
+#ifdef EXEC_BACKEND
+void
+ShmemAttachRequested(void)
+{
+ ListCell *lc;
+
+ /* Must be initializing a (non-standalone) backend */
+ Assert(IsUnderPostmaster);
+ Assert(ShmemAllocator->index != NULL);
+ Assert(shmem_request_state == SRS_REQUESTING);
+ shmem_request_state = SRS_ATTACHING;
+
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+
+ /*
+ * Attach to all the requested memory areas.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ AttachShmemIndexEntry(request, false);
+ }
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
+
+ /* Call attach callbacks */
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
+
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->attach_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+
+ shmem_request_state = SRS_DONE;
+}
+#endif
+
+/*
+ * Insert requested shmem area into the shared memory index and initialize it.
+ *
+ * Note that this only performs basic initialization depending on
+ * ShmemRequestKind, like setting the global pointer variable to the area for
+ * SHMEM_KIND_STRUCT or setting up the backend-private HTAB control struct.
+ * This does *not* call the subsystem-specific init callbacks. That's done
+ * later after all the shmem areas have been initialized or attached to.
*/
-typedef struct shmem_hash_allocator
+static void
+InitShmemIndexEntry(ShmemRequest *request)
{
- char *next; /* start of free space in the area */
- char *end; /* end of the shmem area */
-} shmem_hash_allocator;
+ const char *name = request->options->name;
+ ShmemIndexEnt *index_entry;
+ bool found;
+ size_t allocated_size;
+ void *structPtr;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, name, HASH_ENTER_NULL, &found);
+ if (found)
+ elog(ERROR, "shared memory struct \"%s\" is already initialized", name);
+ if (!index_entry)
+ {
+ /* tried to add it to the hash table, but there was no space */
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ name)));
+ }
+
+ /*
+ * We inserted the entry to the shared memory index. Allocate requested
+ * amount of shared memory for it, and initialize the index entry.
+ */
+ structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ name, request->options->size)));
+ }
+ index_entry->size = request->options->size;
+ index_entry->allocated_size = allocated_size;
+ index_entry->location = structPtr;
+
+ /* Initialize depending on the kind of shmem area it is */
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_init(structPtr, request->options);
+ break;
+ }
+}
+
+/*
+ * Look up a named shmem area in the shared memory index and attach to it.
+ *
+ * Note that this only performs the basic attachment actions depending on
+ * ShmemRequestKind, like setting the global pointer variable to the area for
+ * SHMEM_KIND_STRUCT or setting up the backend-private HTAB control struct.
+ * This does *not* call the subsystem-specific attach callbacks. That's done
+ * later after all the shmem areas have been initialized or attached to.
+ */
+static bool
+AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
+{
+ const char *name = request->options->name;
+ ShmemIndexEnt *index_entry;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, name, HASH_FIND, NULL);
+ if (!index_entry)
+ {
+ if (!missing_ok)
+ ereport(ERROR,
+ (errmsg("could not find ShmemIndex entry for data structure \"%s\"",
+ request->options->name)));
+ return false;
+ }
+
+ /* Check that the size in the index matches the request. */
+ if (index_entry->size != request->options->size &&
+ request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" was created with"
+ " different size: existing %zu, requested %zu",
+ name, index_entry->size, request->options->size)));
+ }
+
+ /*
+ * Re-establish the caller's pointer variable, or do other actions to
+ * attach depending on the kind of shmem area it is.
+ */
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_attach(index_entry->location, request->options);
+ break;
+ }
+
+ return true;
+}
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
* Called at postmaster or stand-alone backend startup, to initialize the
* allocator's data structure in the shared memory segment. In EXEC_BACKEND,
- * this is also called at backend startup, to set up pointers to the shared
- * memory areas.
+ * this is also called at backend startup, to set up pointers to the
+ * already-initialized data structure.
*/
void
InitShmemAllocator(PGShmemHeader *seghdr)
{
Size offset;
+ int64 hash_nelems;
HASHCTL info;
int hash_flags;
@@ -142,6 +628,16 @@ InitShmemAllocator(PGShmemHeader *seghdr)
#endif
Assert(seghdr != NULL);
+ if (IsUnderPostmaster)
+ {
+ Assert(shmem_request_state == SRS_INITIAL);
+ }
+ else
+ {
+ Assert(shmem_request_state == SRS_REQUESTING);
+ shmem_request_state = SRS_INITIALIZING;
+ }
+
/*
* We assume the pointer and offset are MAXALIGN. Not a hard requirement,
* but it's true today and keeps the math below simpler.
@@ -186,19 +682,21 @@ InitShmemAllocator(PGShmemHeader *seghdr)
* use ShmemInitHash() here because it relies on ShmemIndex being already
* initialized.
*/
+ hash_nelems = list_length(pending_shmem_requests) + SHMEM_INDEX_ADDITIONAL_SIZE;
+
info.keysize = SHMEM_INDEX_KEYSIZE;
info.entrysize = sizeof(ShmemIndexEnt);
hash_flags = HASH_ELEM | HASH_STRINGS | HASH_FIXED_SIZE;
if (!IsUnderPostmaster)
{
- ShmemAllocator->index_size = hash_estimate_size(SHMEM_INDEX_SIZE, info.entrysize);
+ ShmemAllocator->index_size = hash_estimate_size(hash_nelems, info.entrysize);
ShmemAllocator->index = (HASHHDR *) ShmemAlloc(ShmemAllocator->index_size);
}
ShmemIndex = shmem_hash_create(ShmemAllocator->index,
ShmemAllocator->index_size,
IsUnderPostmaster,
- "ShmemIndex", SHMEM_INDEX_SIZE,
+ "ShmemIndex", hash_nelems,
&info, hash_flags);
Assert(ShmemIndex != NULL);
@@ -219,6 +717,23 @@ InitShmemAllocator(PGShmemHeader *seghdr)
}
}
+/*
+ * Reset state on postmaster crash restart.
+ */
+void
+ResetShmemAllocator(void)
+{
+ Assert(!IsUnderPostmaster);
+ shmem_request_state = SRS_INITIAL;
+
+ pending_shmem_requests = NIL;
+
+ /*
+ * Note that we don't clear the registered callbacks. We will need to
+ * call them again as we restart
+ */
+}
+
/*
* ShmemAlloc -- allocate max-aligned chunk from shared memory
*
@@ -316,92 +831,191 @@ ShmemAddrIsValid(const void *addr)
}
/*
- * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ * Register callbacks that define a shared memory area (or multiple areas).
*
- * This is called during initialization to find or allocate
- * a data structure in shared memory. If no other process
- * has created the structure, this routine allocates space
- * for it. If it exists already, a pointer to the existing
- * structure is returned.
+ * The system will call the callbacks at different stages of postmaster or
+ * backend startup, to allocate and initialize the area.
*
- * Returns: pointer to the object. *foundPtr is set true if the object was
- * already in the shmem index (hence, already initialized).
+ * This is normally called early during postmaster startup, but if the
+ * SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP is set, this can also be used after
+ * startup, although after startup there's no guarantee that there's enough
+ * shared memory available. When called after startup, this immediately calls
+ * the right callbacks depending on whether another backend had already
+ * initialized the area.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: In EXEC_BACKEND mode, this needs to be called in every backend
+ * process. That's needed because we cannot pass down the callback function
+ * pointers from the postmaster process, because different processes may have
+ * loaded libraries to different addresses.
*/
-void *
-ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+void
+RegisterShmemCallbacks(const ShmemCallbacks *callbacks)
{
- ShmemIndexEnt *result;
- void *structPtr;
+ if (shmem_request_state == SRS_DONE && IsUnderPostmaster)
+ {
+ /*
+ * After-startup initialization or attachment. Call the appropriate
+ * callbacks immediately.
+ */
+ if ((callbacks->flags & SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP) == 0)
+ elog(ERROR, "cannot request shared memory at this time");
- Assert(ShmemIndex != NULL);
+ CallShmemCallbacksAfterStartup(callbacks);
+ }
+ else
+ {
+ /* Remember the callbacks for later */
+ registered_shmem_callbacks = lappend(registered_shmem_callbacks,
+ (void *) callbacks);
+ }
+}
+
+/*
+ * Register a shmem area (or multiple areas) after startup.
+ */
+static void
+CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks)
+{
+ bool found_any;
+ bool notfound_any;
+
+ Assert(shmem_request_state == SRS_DONE);
+ shmem_request_state = SRS_REQUESTING;
+
+ /*
+ * Call the request callback first. The callback makes ShmemRequest*()
+ * calls for each shmem area, adding them to pending_shmem_requests.
+ */
+ Assert(pending_shmem_requests == NIL);
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->request_fn_arg);
+ shmem_request_state = SRS_AFTER_STARTUP_ATTACH_OR_INIT;
+
+ if (pending_shmem_requests == NIL)
+ {
+ shmem_request_state = SRS_DONE;
+ return;
+ }
+ /* Hold ShmemIndexLock while we allocate all the shmem entries */
LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
- /* look it up in the shmem index */
- result = (ShmemIndexEnt *)
- hash_search(ShmemIndex, name, HASH_ENTER_NULL, foundPtr);
+ /*
+ * Check if the requested shared memory areas have already been
+ * initialized. We assume all the areas requested by the request callback
+ * to form a coherent unit such that they're all already initialized or
+ * none. Otherwise it would be ambiguous which callback, init or attach,
+ * to call afterwards.
+ */
+ found_any = notfound_any = false;
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ if (hash_search(ShmemIndex, request->options->name, HASH_FIND, NULL))
+ found_any = true;
+ else
+ notfound_any = true;
+ }
+ if (found_any && notfound_any)
+ elog(ERROR, "found some but not all");
- if (!result)
+ /*
+ * Allocate or attach all the shmem areas requested by the request_fn
+ * callback.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
{
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("could not create ShmemIndex entry for data structure \"%s\"",
- name)));
+ if (found_any)
+ AttachShmemIndexEntry(request, false);
+ else
+ InitShmemIndexEntry(request);
}
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
- if (*foundPtr)
+ /* Finish by calling the appropriate subsystem-specific callback */
+ if (found_any)
{
- /*
- * Structure is in the shmem index so someone else has allocated it
- * already. The size better be the same as the size we are trying to
- * initialize to, or there is a name conflict (or worse).
- */
- if (result->size != size)
- {
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errmsg("ShmemIndex entry size is wrong for data structure"
- " \"%s\": expected %zu, actual %zu",
- name, size, result->size)));
- }
- structPtr = result->location;
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->attach_fn_arg);
}
else
{
- Size allocated_size;
-
- /* It isn't in the table yet. allocate and initialize it */
- structPtr = ShmemAllocRaw(size, &allocated_size);
- if (structPtr == NULL)
- {
- /* out of memory; remove the failed ShmemIndex entry */
- hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
- " \"%s\" (%zu bytes requested)",
- name, size)));
- }
- result->size = size;
- result->allocated_size = allocated_size;
- result->location = structPtr;
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->init_fn_arg);
}
LWLockRelease(ShmemIndexLock);
+ shmem_request_state = SRS_DONE;
+}
- Assert(ShmemAddrIsValid(structPtr));
+/*
+ * Call all shmem request callbacks.
+ */
+void
+ShmemCallRequestCallbacks(void)
+{
+ ListCell *lc;
- Assert(structPtr == (void *) CACHELINEALIGN(structPtr));
+ Assert(shmem_request_state == SRS_INITIAL);
+ shmem_request_state = SRS_REQUESTING;
+
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
- return structPtr;
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->request_fn_arg);
+ }
}
+/*
+ * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ *
+ * This is called during initialization to find or allocate
+ * a data structure in shared memory. If no other process
+ * has created the structure, this routine allocates space
+ * for it. If it exists already, a pointer to the existing
+ * structure is returned.
+ *
+ * Returns: pointer to the object. *foundPtr is set true if the object was
+ * already in the shmem index (hence, already initialized).
+ *
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestStruct() in new code!
+ */
+void *
+ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+{
+ void *ptr = NULL;
+ ShmemStructOpts options = {
+ .name = name,
+ .size = size,
+ .ptr = &ptr,
+ };
+ ShmemRequest request = {&options, SHMEM_KIND_STRUCT};
+
+ Assert(shmem_request_state == SRS_DONE ||
+ shmem_request_state == SRS_INITIALIZING ||
+ shmem_request_state == SRS_REQUESTING);
+
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+
+ /*
+ * When running under the postmaster, look up the existing entry if any.
+ */
+ *foundPtr = false;
+ if (IsUnderPostmaster)
+ *foundPtr = AttachShmemIndexEntry(&request, true);
+
+ /* Initialize it if not found */
+ if (!*foundPtr)
+ InitShmemIndexEntry(&request);
+
+ LWLockRelease(ShmemIndexLock);
+
+ Assert(ptr != NULL);
+ return ptr;
+}
/*
* Add two Size values, checking for overflow
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
index 0b05730129e..ab30461f247 100644
--- a/src/backend/storage/ipc/shmem_hash.c
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -21,9 +21,81 @@
#include "postgres.h"
#include "storage/shmem.h"
+#include "storage/shmem_internal.h"
+#include "utils/memutils.h"
+
+/*
+ * A very simple allocator used to carve out different parts of a hash table
+ * from a previously allocated contiguous shared memory area.
+ */
+typedef struct shmem_hash_allocator
+{
+ char *next; /* start of free space in the area */
+ char *end; /* end of the shmem area */
+} shmem_hash_allocator;
static void *ShmemHashAlloc(Size size, void *alloc_arg);
+/*
+ * ShmemRequestHash -- Request a shared memory hash table.
+ *
+ * Similar to ShmemRequestStruct(), but requests a hash table instead of an
+ * opaque area.
+ */
+void
+ShmemRequestHashWithOpts(const ShmemHashOpts *options)
+{
+ ShmemHashOpts *options_copy;
+
+ Assert(options->name != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemHashOpts));
+ memcpy(options_copy, options, sizeof(ShmemHashOpts));
+
+ /* Set options for the fixed-size area holding the hash table */
+ options_copy->base.name = options->name;
+ options_copy->base.size = hash_estimate_size(options_copy->nelems,
+ options_copy->hash_info.entrysize);
+
+ ShmemRequestInternal(&options_copy->base, SHMEM_KIND_HASH);
+}
+
+void
+shmem_hash_init(void *location, ShmemStructOpts *base_options)
+{
+ ShmemHashOpts *options = (ShmemHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+ HTAB *htab;
+
+ options->hash_info.hctl = location;
+ htab = shmem_hash_create(location, options->base.size, false,
+ options->name,
+ options->nelems, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = htab;
+}
+
+void
+shmem_hash_attach(void *location, ShmemStructOpts *base_options)
+{
+ ShmemHashOpts *options = (ShmemHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+ HTAB *htab;
+
+ /* attach to it rather than allocate and initialize new space */
+ hash_flags |= HASH_ATTACH;
+ options->hash_info.hctl = location;
+ Assert(options->hash_info.hctl != NULL);
+ htab = shmem_hash_create(location, options->base.size, true,
+ options->name,
+ options->nelems, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = htab;
+}
+
/*
* ShmemInitHash -- Create and initialize, or attach to, a
* shared memory hash table.
@@ -40,9 +112,8 @@ static void *ShmemHashAlloc(Size size, void *alloc_arg);
* to shared-memory hash tables are added here, except that callers may
* choose to specify HASH_PARTITION.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestHash() in new code!
*/
HTAB *
ShmemInitHash(const char *name, /* table string name for shmem index */
@@ -56,7 +127,14 @@ ShmemInitHash(const char *name, /* table string name for shmem index */
size = hash_estimate_size(nelems, infoP->entrysize);
- /* look it up in the shmem index or allocate */
+ /*
+ * Look it up in the shmem index or allocate.
+ *
+ * NOTE: The area is requested internally as SHMEM_KIND_STRUCT instead of
+ * SHMEM_KIND_HASH. That's correct because we do the hash table
+ * initialization by calling shmem_hash_create() ourselves. (We don't
+ * expose the request kind to users; if we did, that would be confusing.)
+ */
location = ShmemInitStruct(name, size, &found);
return shmem_hash_create(location, size, found,
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 5c47cf13473..9b880a6af65 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -121,6 +121,9 @@ FastPathLockShmemSize(void)
size = add_size(size, mul_size(TotalProcs, (fpLockBitsSize + fpRelIdSize)));
+ Assert(TotalProcs > 0);
+ Assert(size > 0);
+
return size;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 10be60011ad..93851269e43 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -67,6 +67,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procsignal.h"
+#include "storage/shmem_internal.h"
#include "storage/sinval.h"
#include "storage/standby.h"
#include "tcop/backend_startup.h"
@@ -4155,7 +4156,14 @@ PostgresSingleUserMain(int argc, char *argv[],
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Before computing the total size needed, give all subsystems, including
+ * add-ins, a chance to adjust their requested shmem sizes.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might've been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 82f5403c952..147a6915f7e 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -3,6 +3,11 @@
* shmem.h
* shared memory management structures
*
+ * This file contains public functions for other core subsystems and
+ * extensions to allocate shared memory. Internal functions for the shmem
+ * allocator itself and hooking it to the rest of the system are in
+ * shmem_internal.h
+ *
* Historical note:
* A long time ago, Postgres' shared memory region was allowed to be mapped
* at a different address in each process, and shared memory "pointers" were
@@ -23,43 +28,165 @@
#include "utils/hsearch.h"
+/*
+ * Options for ShmemRequestStruct()
+ *
+ * 'name' and 'size' are required. Initialize any optional fields that you
+ * don't use to zeros.
+ *
+ * After registration, the shmem machinery reserves memory for the area, sets
+ * '*ptr' to point to the allocation, and calls the callbacks at the right
+ * moments.
+ */
+typedef struct ShmemStructOpts
+{
+ const char *name;
-/* shmem.c */
-typedef struct PGShmemHeader PGShmemHeader; /* avoid including
- * storage/pg_shmem.h here */
-extern void InitShmemAllocator(PGShmemHeader *seghdr);
-extern void *ShmemAlloc(Size size);
-extern void *ShmemAllocNoError(Size size);
-extern void *ShmemHashAlloc(Size size, void *alloc_arg);
+ /*
+ * Requested size of the shmem allocation.
+ *
+ * When attaching to an existing allocation, the size must match the size
+ * given when the shmem region was allocated. This cross-check can be
+ * disabled by specifying SHMEM_ATTACH_UNKNOWN_SIZE.
+ */
+ ssize_t size;
+
+ /*
+ * When the shmem area is initialized or attached to, a pointer to it is
+ * stored in *ptr. It usually points to a global variable, used to access
+ * the shared memory area later. *ptr is set before the init_fn or
+ * attach_fn callback is called.
+ */
+ void **ptr;
+} ShmemStructOpts;
+
+#define SHMEM_ATTACH_UNKNOWN_SIZE (-1)
+
+/*
+ * Options for ShmemRequestHash()
+ *
+ * Each hash table is backed by an allocated area, but if 'max_size' is
+ * greater than 'init_size', it can also grow beyond the initial allocated
+ * area by allocating more hash entries from the global unreserved space.
+ */
+typedef struct ShmemHashOpts
+{
+ ShmemStructOpts base;
+
+ /*
+ * Name of the shared memory area. Required. Must be unique across the
+ * system.
+ */
+ const char *name;
+
+ /*
+ * 'nelems' is the max number of elements for the hash table.
+ */
+ int64 nelems;
+
+ /*
+ * Hash table options passed to hash_create()
+ *
+ * hash_info and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values
+ * specific to shared-memory hash tables are added implicitly in
+ * ShmemRequestHash(), except that callers may choose to specify
+ * HASH_PARTITION and/or HASH_FIXED_SIZE.
+ */
+ HASHCTL hash_info;
+ int hash_flags;
+
+ /*
+ * When the hash table is initialized or attached to, a pointer to its
+ * backend-private handle is stored in *ptr. It usually points to a
+ * global variable, used to access the hash table later.
+ */
+ HTAB **ptr;
+} ShmemHashOpts;
+
+typedef void (*ShmemRequestCallback) (void *arg);
+typedef void (*ShmemInitCallback) (void *arg);
+typedef void (*ShmemAttachCallback) (void *arg);
+
+/*
+ * Shared memory is reserved and allocated in stages at postmaster startup,
+ * and in EXEC_BACKEND mode, there's some extra work done to "attach" to them
+ * at backend startup. ShmemCallbacks holds callback functions that are
+ * called at different stages.
+ */
+typedef struct ShmemCallbacks
+{
+ /* SHMEM_CALLBACKS_* flags */
+ int flags;
+
+ /*
+ * 'request_fn' is called during postmaster startup, before the shared
+ * memory has been allocated. The function should call
+ * ShmemRequestStruct() and ShmemRequestHash() to register the subsystem's
+ * shared memory needs.
+ */
+ ShmemRequestCallback request_fn;
+ void *request_fn_arg;
+
+ /*
+ * Initialization callback function. This is called when the shared
+ * memory area is allocated, usually at postmaster startup.
+ */
+ ShmemInitCallback init_fn;
+ void *init_fn_arg;
+
+ /*
+ * Attachment callback function. In EXEC_BACKEND mode, this is called at
+ * startup of each backend. In !EXEC_BACKEND mode, this is only called if
+ * the shared memory area is registered after postmaster startup (see
+ * SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP).
+ */
+ ShmemAttachCallback attach_fn;
+ void *attach_fn_arg;
+} ShmemCallbacks;
+
+/*
+ * Flags to control the behavior of RegisterShmemCallbacks().
+ *
+ * ALLOW_AFTER_STARTUP: Allow these shared memory usages to be registered
+ * after postmaster startup. Normally, registering a shared memory system
+ * after postmaster startup is not allowed e.g. in an add-in library loaded
+ * on-demand in a backend. If a subsystem sets this flag, the callbacks are
+ * called immediately after registration, to initialize or attach to the
+ * requested shared memory areas. This is not used by any built-in
+ * subsystems, but extensions may find it useful.
+ */
+#define SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP 0x00000001
+
+extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
extern bool ShmemAddrIsValid(const void *addr);
+
+/*
+ * These macros provide syntactic sugar for calling the underlying functions
+ * with a named-argument-like syntax.
+ */
+#define ShmemRequestStruct(...) \
+ ShmemRequestStructWithOpts(&(ShmemStructOpts){__VA_ARGS__})
+
+#define ShmemRequestHash(...) \
+ ShmemRequestHashWithOpts(&(ShmemHashOpts){__VA_ARGS__})
+
+extern void ShmemRequestStructWithOpts(const ShmemStructOpts *options);
+extern void ShmemRequestHashWithOpts(const ShmemHashOpts *options);
+
+/* legacy shmem allocation functions */
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
+extern HTAB *ShmemInitHash(const char *name, int64 nelems,
+ HASHCTL *infoP, int hash_flags);
+extern void *ShmemAlloc(Size size);
+extern void *ShmemAllocNoError(Size size);
+
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
-/* shmem_hash.c */
-extern HTAB *ShmemInitHash(const char *name, int64 nelems,
- HASHCTL *infoP, int hash_flags);
-extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
-
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
-/* size constants for the shmem index table */
- /* max size of data structure string name */
-#define SHMEM_INDEX_KEYSIZE (48)
- /* max number of named shmem structures and hash tables */
-#define SHMEM_INDEX_SIZE (256)
-
-/* this is a hash bucket in the shmem index table */
-typedef struct
-{
- char key[SHMEM_INDEX_KEYSIZE]; /* string name */
- void *location; /* location in shared mem */
- Size size; /* # bytes requested for the structure */
- Size allocated_size; /* # bytes actually allocated */
-} ShmemIndexEnt;
-
#endif /* SHMEM_H */
diff --git a/src/include/storage/shmem_internal.h b/src/include/storage/shmem_internal.h
new file mode 100644
index 00000000000..fe12bf33439
--- /dev/null
+++ b/src/include/storage/shmem_internal.h
@@ -0,0 +1,52 @@
+/*-------------------------------------------------------------------------
+ *
+ * shmem_internal.h
+ * Internal functions related to shmem allocation
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/shmem_internal.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SHMEM_INTERNAL_H
+#define SHMEM_INTERNAL_H
+
+#include "storage/shmem.h"
+#include "utils/hsearch.h"
+
+/* Different kinds of shmem areas. */
+typedef enum
+{
+ SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
+ SHMEM_KIND_HASH, /* a hash table */
+} ShmemRequestKind;
+
+/* shmem.c */
+typedef struct PGShmemHeader PGShmemHeader; /* avoid including
+ * storage/pg_shmem.h here */
+extern void ShmemCallRequestCallbacks(void);
+extern void InitShmemAllocator(PGShmemHeader *seghdr);
+#ifdef EXEC_BACKEND
+extern void AttachShmemAllocator(PGShmemHeader *seghdr);
+#endif
+extern void ResetShmemAllocator(void);
+
+extern void ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind);
+
+extern size_t ShmemGetRequestedSize(void);
+extern void ShmemInitRequested(void);
+#ifdef EXEC_BACKEND
+extern void ShmemAttachRequested(void);
+#endif
+
+extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
+
+/* shmem_hash.c */
+extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
+extern void shmem_hash_init(void *location, ShmemStructOpts *options);
+extern void shmem_hash_attach(void *location, ShmemStructOpts *options);
+
+#endif /* SHMEM_INTERNAL_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index c72f6c59573..b84167741fb 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2863,9 +2863,16 @@ SharedTypmodTableEntry
Sharedsort
ShellTypeInfo
ShippableCacheEntry
-ShmemAllocatorData
ShippableCacheKey
+ShmemAllocatorData
+ShmemCallbacks
ShmemIndexEnt
+ShmemHashDesc
+ShmemHashOpts
+ShmemRequest
+ShmemRequestKind
+ShmemStructDesc
+ShmemStructOpts
ShutdownForeignScan_function
ShutdownInformation
ShutdownMode
--
2.47.3
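
The ShmemRequestStruct() and ShmemRequestHash() macros in the patch above wrap their arguments in a C99 compound literal, which is what gives callers the named-argument style. The following is a minimal standalone sketch of that technique only. DemoOpts, demo_request(), demo_request_with_opts(), demo_run() and the bookkeeping variables are hypothetical names made up for illustration; nothing here is part of the patch or of PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct, loosely modeled on ShmemStructOpts. */
typedef struct DemoOpts
{
	const char *name;			/* required */
	long		size;			/* required */
	void	  **ptr;			/* optional; NULL when omitted */
} DemoOpts;

/* Bookkeeping so that the effect of a request can be inspected. */
static DemoOpts last_request;
static int	request_count = 0;

static void
demo_request_with_opts(const DemoOpts *options)
{
	assert(options->name != NULL);

	/*
	 * Copy the struct: the compound literal built by the macro only lives
	 * until the end of the caller's enclosing block.
	 */
	last_request = *options;
	request_count++;
}

/*
 * The variadic arguments become designated initializers of a compound
 * literal.  Fields that are not mentioned are zero-initialized.
 */
#define demo_request(...) \
	demo_request_with_opts(&(DemoOpts){__VA_ARGS__})

static void *demo_area;

static int
demo_run(void)
{
	/* Fields can be given in any order, and optional ones can be left out. */
	demo_request(.name = "demo area", .size = 128, .ptr = &demo_area);
	demo_request(.size = 64, .name = "other area");
	return request_count;
}
```

One consequence of this design is that adding a new optional field to the options struct does not break existing callers, since unmentioned fields default to zero. That matches the header comment asking callers to initialize unused optional fields to zeros.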
[text/x-patch] v11-0003-Add-test-module-to-test-after-startup-shmem-allo.patch (10.1K, 4-v11-0003-Add-test-module-to-test-after-startup-shmem-allo.patch)
From d7b0bdd6bea52854d43ee6f83ec550810acfbb9d Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:10:31 +0300
Subject: [PATCH v11 03/14] Add test module to test after-startup shmem
allocations
None of the existing modules could make use of the lazy shmem
allocation after postmaster startup:
- pg_stat_statements needs to load and dump stats file on startup and
shutdown, which doesn't really work if the library is not loaded into
postmaster
- test_aio registers injection points, which reference the library
itself, which creates a weird initialization loop if you try to do
that directly from _PG_init() in a backend. The initialization
really needs to happen after _PG_init()
- injection_points would be a candidate, but it already knows to use
DSM when it's not loaded from shared_preload_libraries.
---
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_shmem/Makefile | 24 +++++
src/test/modules/test_shmem/meson.build | 33 ++++++
.../test_shmem/t/001_late_shmem_alloc.pl | 49 +++++++++
.../modules/test_shmem/test_shmem--1.0.sql | 9 ++
src/test/modules/test_shmem/test_shmem.c | 101 ++++++++++++++++++
.../modules/test_shmem/test_shmem.control | 3 +
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 222 insertions(+)
create mode 100644 src/test/modules/test_shmem/Makefile
create mode 100644 src/test/modules/test_shmem/meson.build
create mode 100644 src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
create mode 100644 src/test/modules/test_shmem/test_shmem--1.0.sql
create mode 100644 src/test/modules/test_shmem/test_shmem.c
create mode 100644 src/test/modules/test_shmem/test_shmem.control
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 864b407abcf..f1b04c99969 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -48,6 +48,7 @@ SUBDIRS = \
test_resowner \
test_rls_hooks \
test_saslprep \
+ test_shmem \
test_shm_mq \
test_slru \
test_tidstore \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index e5acacd5083..fc99552d9ab 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -49,6 +49,7 @@ subdir('test_regex')
subdir('test_resowner')
subdir('test_rls_hooks')
subdir('test_saslprep')
+subdir('test_shmem')
subdir('test_shm_mq')
subdir('test_slru')
subdir('test_tidstore')
diff --git a/src/test/modules/test_shmem/Makefile b/src/test/modules/test_shmem/Makefile
new file mode 100644
index 00000000000..2407f7462fe
--- /dev/null
+++ b/src/test/modules/test_shmem/Makefile
@@ -0,0 +1,24 @@
+# src/test/modules/test_shmem/Makefile
+
+PGFILEDESC = "test_shmem - test code for shmem allocations"
+
+MODULE_big = test_shmem
+OBJS = \
+ $(WIN32RES) \
+ test_shmem.o
+
+EXTENSION = test_shmem
+DATA = test_shmem--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_shmem/meson.build b/src/test/modules/test_shmem/meson.build
new file mode 100644
index 00000000000..fb4bf328b8f
--- /dev/null
+++ b/src/test/modules/test_shmem/meson.build
@@ -0,0 +1,33 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+test_shmem_sources = files(
+ 'test_shmem.c',
+)
+
+if host_system == 'windows'
+ test_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_shmem',
+ '--FILEDESC', 'test_shmem - test code for shmem allocations',])
+endif
+
+test_shmem = shared_module('test_shmem',
+ test_shmem_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_shmem
+
+test_install_data += files(
+ 'test_shmem.control',
+ 'test_shmem--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_shmem',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_late_shmem_alloc.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
new file mode 100644
index 00000000000..c154f57682a
--- /dev/null
+++ b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
@@ -0,0 +1,49 @@
+# Copyright (c) 2025-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+###
+# Test allocating memory after startup, i.e. when the library is not
+# in shared_preload_libraries
+###
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+
+$node->safe_psql("postgres", "CREATE EXTENSION test_shmem;");
+
+# Check that the attach counter is incremented on a new connection
+my $attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+my $attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+cmp_ok($attach_count2, '>', $attach_count1, "attach callback is called in each backend");
+$node->stop;
+
+###
+# Test that loading via shared_preload_libraries also works
+###
+$node->append_conf('postgresql.conf', "shared_preload_libraries = 'test_shmem'");
+$node->start;
+
+# When loaded via shared_preload_libraries, the attach callback is
+# called or not, depending on whether this is an EXEC_BACKEND build.
+my $exec_backend = $node->safe_psql("postgres", "SHOW debug_exec_backend;") eq 'on';
+$attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+$attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+
+if ($exec_backend)
+{
+ cmp_ok($attach_count2, '>', $attach_count1, "attach callback is called in each backend when loaded via shared_preload_libraries");
+}
+else
+{
+ ok($attach_count1 == 0 && $attach_count2 == 0, "attach callback is not called when loaded via shared_preload_libraries");
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_shmem/test_shmem--1.0.sql b/src/test/modules/test_shmem/test_shmem--1.0.sql
new file mode 100644
index 00000000000..2d01fd9256c
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem--1.0.sql
@@ -0,0 +1,9 @@
+/* src/test/modules/test_shmem/test_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_shmem" to load this file. \quit
+
+
+CREATE FUNCTION get_test_shmem_attach_count()
+RETURNS pg_catalog.int4 STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
new file mode 100644
index 00000000000..9bd4012b435
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -0,0 +1,101 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_shmem.c
+ * Helpers to test shmem allocation routines
+ *
+ * Test basic memory allocation in an extension module. One notable feature
+ * that is not exercised by any other module in the repository is the
+ * allocation of (non-DSM) shared memory after postmaster startup.
+ *
+ * Copyright (c) 2020-2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_shmem/test_shmem.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+
+
+PG_MODULE_MAGIC;
+
+typedef struct TestShmemData
+{
+ int value;
+ bool initialized;
+ int attach_count;
+} TestShmemData;
+
+static TestShmemData *TestShmem;
+
+static bool attached_or_initialized = false;
+
+static void test_shmem_request(void *arg);
+static void test_shmem_init(void *arg);
+static void test_shmem_attach(void *arg);
+
+static const ShmemCallbacks TestShmemCallbacks = {
+ .flags = SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP,
+ .request_fn = test_shmem_request,
+ .init_fn = test_shmem_init,
+ .attach_fn = test_shmem_attach,
+};
+
+static void
+test_shmem_request(void *arg)
+{
+ elog(LOG, "test_shmem_request callback called");
+
+ ShmemRequestStruct(.name = "test_shmem area",
+ .size = sizeof(TestShmemData),
+ .ptr = (void **) &TestShmem);
+}
+
+static void
+test_shmem_init(void *arg)
+{
+ elog(LOG, "init callback called");
+ if (TestShmem->initialized)
+ elog(ERROR, "shmem area already initialized");
+ TestShmem->initialized = true;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+static void
+test_shmem_attach(void *arg)
+{
+ elog(LOG, "test_shmem_attach callback called");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ TestShmem->attach_count++;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+void
+_PG_init(void)
+{
+ elog(LOG, "test_shmem module's _PG_init called");
+ RegisterShmemCallbacks(&TestShmemCallbacks);
+}
+
+PG_FUNCTION_INFO_V1(get_test_shmem_attach_count);
+Datum
+get_test_shmem_attach_count(PG_FUNCTION_ARGS)
+{
+ if (!attached_or_initialized)
+ elog(ERROR, "shmem area not attached or initialized in this process");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ PG_RETURN_INT32(TestShmem->attach_count);
+}
diff --git a/src/test/modules/test_shmem/test_shmem.control b/src/test/modules/test_shmem/test_shmem.control
new file mode 100644
index 00000000000..f2f26f4537a
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.control
@@ -0,0 +1,3 @@
+comment = 'Test code for shmem allocations'
+default_version = '1.0'
+module_pathname = '$libdir/test_shmem'
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b84167741fb..63c0b3a9465 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3146,6 +3146,7 @@ TestDSMRegistryHashEntry
TestDSMRegistryStruct
TestDecodingData
TestDecodingTxnData
+TestShmemData
TestSpec
TestValueType
TextFreq
--
2.47.3
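
The ShmemRequestStruct(.name = ..., .size = ..., .ptr = ...) calls in the
test module above read like named arguments. One common way to get that
call syntax in C is a variadic macro that wraps its arguments in a compound
literal with designated initializers. The sketch below only illustrates
that technique; DemoStructOpts, DemoRequestStruct and
demo_request_struct_with_opts are hypothetical names, and this is not
claimed to be how the patch series implements the real macro.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct; field names mirror the calls in the patch. */
typedef struct DemoStructOpts
{
	const char *name;
	size_t		size;
	void	  **ptr;
} DemoStructOpts;

/* Remembers the most recent request so its contents can be inspected. */
static DemoStructOpts last_request;

static void
demo_request_struct_with_opts(const DemoStructOpts *opts)
{
	assert(opts != NULL);
	last_request = *opts;
}

/*
 * Wrap the arguments in a compound literal.  Fields that are not named are
 * zero-initialized, and a trailing comma after the last field is accepted.
 */
#define DemoRequestStruct(...) \
	demo_request_struct_with_opts(&(DemoStructOpts){__VA_ARGS__})

typedef struct DemoState
{
	int			counter;
} DemoState;

static DemoState *demo_state;

static void
demo_request(void)
{
	DemoRequestStruct(.name = "demo area",
					  .size = sizeof(DemoState),
					  .ptr = (void **) &demo_state,
		);
}
```

With this shape, adding an optional field to the options struct later does
not require touching existing callers, since omitted fields default to zero.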
[text/x-patch] v11-0004-Convert-pg_stat_statements-to-use-the-new-interf.patch (11.2K, 5-v11-0004-Convert-pg_stat_statements-to-use-the-new-interf.patch)
From 9e644e0d0834c8bf731ebbaa21fac7f9c8bdbcb1 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:21:24 +0300
Subject: [PATCH v11 04/14] Convert pg_stat_statements to use the new interface
As part of this, embed the LWLock it needs in the shared memory struct
itself, so that we don't need to use RequestNamedLWLockTranche()
anymore. LWLockNewTrancheId+LWLockInitialize is more convenient to use
in extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
.../pg_stat_statements/pg_stat_statements.c | 173 ++++++++----------
1 file changed, 77 insertions(+), 96 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 5494d41dca1..f078b4fe71b 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -249,7 +249,7 @@ typedef struct pgssEntry
*/
typedef struct pgssSharedState
{
- LWLock *lock; /* protects hashtable search/modification */
+ LWLockPadded lock; /* protects hashtable search/modification */
double cur_median_usage; /* current median usage in hashtable */
Size mean_query_len; /* current mean entry text length */
slock_t mutex; /* protects following fields only: */
@@ -259,14 +259,24 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+/* Links to shared memory state */
+static pgssSharedState *pgss;
+static HTAB *pgss_hash;
+
+static void pgss_shmem_request(void *arg);
+static void pgss_shmem_init(void *arg);
+
+static const ShmemCallbacks pgss_shmem_callbacks = {
+ .request_fn = pgss_shmem_request,
+ .init_fn = pgss_shmem_init,
+};
+
/*---- Local variables ----*/
/* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
static int nesting_level = 0;
/* Saved hook values */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
static planner_hook_type prev_planner_hook = NULL;
static ExecutorStart_hook_type prev_ExecutorStart = NULL;
@@ -275,10 +285,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static ProcessUtility_hook_type prev_ProcessUtility = NULL;
-/* Links to shared memory state */
-static pgssSharedState *pgss = NULL;
-static HTAB *pgss_hash = NULL;
-
/*---- GUC variables ----*/
typedef enum
@@ -331,8 +337,6 @@ PG_FUNCTION_INFO_V1(pg_stat_statements_1_13);
PG_FUNCTION_INFO_V1(pg_stat_statements);
PG_FUNCTION_INFO_V1(pg_stat_statements_info);
-static void pgss_shmem_request(void);
-static void pgss_shmem_startup(void);
static void pgss_shmem_shutdown(int code, Datum arg);
static void pgss_post_parse_analyze(ParseState *pstate, Query *query,
JumbleState *jstate);
@@ -366,7 +370,6 @@ static void pgss_store(const char *query, int64 queryId,
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
bool showtext);
-static Size pgss_memsize(void);
static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
int encoding, bool sticky);
static void entry_dealloc(void);
@@ -471,13 +474,14 @@ _PG_init(void)
MarkGUCPrefixReserved("pg_stat_statements");
+ /*
+ * Register our shared memory needs.
+ */
+ RegisterShmemCallbacks(&pgss_shmem_callbacks);
+
/*
* Install hooks.
*/
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = pgss_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = pgss_shmem_startup;
prev_post_parse_analyze_hook = post_parse_analyze_hook;
post_parse_analyze_hook = pgss_post_parse_analyze;
prev_planner_hook = planner_hook;
@@ -495,30 +499,42 @@ _PG_init(void)
}
/*
- * shmem_request hook: request additional shared resources. We'll allocate or
- * attach to the shared resources in pgss_shmem_startup().
+ * shmem request callback: Request shared memory resources.
+ *
+ * This is called at postmaster startup. Note that the shared memory isn't
+ * allocated here yet; this merely registers our needs.
+ *
+ * In EXEC_BACKEND mode, this is also called in each backend, to re-attach to
+ * the shared memory area that was already initialized.
*/
static void
-pgss_shmem_request(void)
+pgss_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- RequestAddinShmemSpace(pgss_memsize());
- RequestNamedLWLockTranche("pg_stat_statements", 1);
+ ShmemRequestHash(.name = "pg_stat_statements hash",
+ .nelems = pgss_max,
+ .hash_info.keysize = sizeof(pgssHashKey),
+ .hash_info.entrysize = sizeof(pgssEntry),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ .ptr = &pgss_hash,
+ );
+ ShmemRequestStruct(.name = "pg_stat_statements",
+ .size = sizeof(pgssSharedState),
+ .ptr = (void **) &pgss,
+ );
}
/*
- * shmem_startup hook: allocate or attach to shared memory,
- * then load any pre-existing statistics from file.
- * Also create and load the query-texts file, which is expected to exist
- * (even if empty) while the module is enabled.
+ * shmem init callback: Initialize our shared memory data structures at
+ * postmaster startup.
+ *
+ * Load any pre-existing statistics from file. Also create and load the
+ * query-texts file, which is expected to exist (even if empty) while the
+ * module is enabled.
*/
static void
-pgss_shmem_startup(void)
+pgss_shmem_init(void *arg)
{
- bool found;
- HASHCTL info;
+ int tranche_id;
FILE *file = NULL;
FILE *qfile = NULL;
uint32 header;
@@ -528,59 +544,38 @@ pgss_shmem_startup(void)
int buffer_size;
char *buffer = NULL;
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* reset in case this is a restart within the postmaster */
- pgss = NULL;
- pgss_hash = NULL;
-
/*
- * Create or attach to the shared memory state, including hash table
+ * We already checked that we're loaded from shared_preload_libraries in
+ * _PG_init(), so we should not get here after postmaster startup.
*/
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- pgss = ShmemInitStruct("pg_stat_statements",
- sizeof(pgssSharedState),
- &found);
-
- if (!found)
- {
- /* First time through ... */
- pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
- pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
- SpinLockInit(&pgss->mutex);
- pgss->extent = 0;
- pgss->n_writers = 0;
- pgss->gc_count = 0;
- pgss->stats.dealloc = 0;
- pgss->stats.stats_reset = GetCurrentTimestamp();
- }
-
- info.keysize = sizeof(pgssHashKey);
- info.entrysize = sizeof(pgssEntry);
- pgss_hash = ShmemInitHash("pg_stat_statements hash",
- pgss_max,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- LWLockRelease(AddinShmemInitLock);
+ Assert(!IsUnderPostmaster);
/*
- * If we're in the postmaster (or a standalone backend...), set up a shmem
- * exit hook to dump the statistics to disk.
+ * Initialize the shmem area with no statistics.
*/
- if (!IsUnderPostmaster)
- on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
+ tranche_id = LWLockNewTrancheId("pg_stat_statements");
+ LWLockInitialize(&pgss->lock.lock, tranche_id);
+ pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
+ pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ SpinLockInit(&pgss->mutex);
+ pgss->extent = 0;
+ pgss->n_writers = 0;
+ pgss->gc_count = 0;
+ pgss->stats.dealloc = 0;
+ pgss->stats.stats_reset = GetCurrentTimestamp();
+
+ /* The hash table must've also been initialized by now */
+ Assert(pgss_hash != NULL);
/*
- * Done if some other process already completed our initialization.
+ * Set up a shmem exit hook to dump the statistics to disk on postmaster
+ * (or standalone backend) exit.
*/
- if (found)
- return;
+ on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
/*
+ * Load any pre-existing statistics from file.
+ *
* Note: we don't bother with locks here, because there should be no other
* processes running when this code is reached.
*/
@@ -1339,7 +1334,7 @@ pgss_store(const char *query, int64 queryId,
key.toplevel = (nesting_level == 0);
/* Lookup the hash table entry with shared lock. */
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
@@ -1360,11 +1355,11 @@ pgss_store(const char *query, int64 queryId,
*/
if (jstate)
{
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
norm_query = generate_normalized_query(jstate, query,
query_location,
&query_len);
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
}
/* Append new query text to file with only shared lock held */
@@ -1379,8 +1374,8 @@ pgss_store(const char *query, int64 queryId,
do_gc = need_gc_qtexts();
/* Need exclusive lock to make a new hashtable entry - promote */
- LWLockRelease(pgss->lock);
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockRelease(&pgss->lock.lock);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
/*
* A garbage collection may have occurred while we weren't holding the
@@ -1519,7 +1514,7 @@ pgss_store(const char *query, int64 queryId,
}
done:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
/* We postpone this clean-up until we're out of the lock */
if (norm_query)
@@ -1808,7 +1803,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
* we need to partition the hash table to limit the time spent holding any
* one lock.
*/
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
if (showtext)
{
@@ -2046,7 +2041,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
}
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
if (qbuffer)
pfree(qbuffer);
@@ -2086,20 +2081,6 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
-/*
- * Estimate shared memory space needed.
- */
-static Size
-pgss_memsize(void)
-{
- Size size;
-
- size = MAXALIGN(sizeof(pgssSharedState));
- size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
-
- return size;
-}
-
/*
* Allocate a new hashtable entry.
* caller must hold an exclusive lock on pgss->lock
@@ -2730,7 +2711,7 @@ entry_reset(Oid userid, Oid dbid, int64 queryid, bool minmax_only)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("pg_stat_statements must be loaded via \"shared_preload_libraries\"")));
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
num_entries = hash_get_num_entries(pgss_hash);
stats_reset = GetCurrentTimestamp();
@@ -2824,7 +2805,7 @@ done:
record_gc_qtexts();
release_lock:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
return stats_reset;
}
--
2.47.3
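
The patch above replaces a pointer to a separately allocated lock with a
padded lock embedded in the shared struct, which is why every call site
changes from pgss->lock to &pgss->lock.lock. A minimal standalone sketch of
that layout follows. All Toy* names are hypothetical, the 64-byte cache line
is an assumption, and the union only mimics the general shape of a padded
lock type; it is not PostgreSQL's definition.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_LINE 64		/* assumed cache line size */

typedef struct ToyLock
{
	int			tranche;
	int			state;
} ToyLock;

/* Pad the lock to a full cache line so neighbours do not share it. */
typedef union ToyLockPadded
{
	ToyLock		lock;
	char		pad[TOY_CACHE_LINE];
} ToyLockPadded;

typedef struct ToyShared
{
	ToyLockPadded lock;			/* embedded: no separate allocation */
	double		cur_median_usage;
} ToyShared;

static void
toy_lock_initialize(ToyLock *lock, int tranche_id)
{
	assert(lock != NULL);
	lock->tranche = tranche_id;
	lock->state = 0;
}

/* The inner lock is reached through the padded wrapper's .lock member. */
static void
toy_shared_init(ToyShared *shared, int tranche_id)
{
	toy_lock_initialize(&shared->lock.lock, tranche_id);
	shared->cur_median_usage = 0.0;
}
```

Embedding the lock means the struct's own size covers it, so no separate
named-tranche request is needed to reserve space for it.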
[text/x-patch] v11-0005-Introduce-registry-of-built-in-subsystems.patch (7.3K, 6-v11-0005-Introduce-registry-of-built-in-subsystems.patch)
From de47dea1057e94d69f3c7adfdbee9f5e8d2a1417 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:21:02 +0300
Subject: [PATCH v11 05/14] Introduce registry of built-in subsystems
To add a new built-in subsystem, add it to subsystemlist.h. That
hooks up its callbacks so that they get called at the right times
during postmaster startup. For now this is unused, but will replace
the current SubsystemShmemSize() and SubsystemShmemInit() calls in
the next commits.
---
src/backend/bootstrap/bootstrap.c | 2 ++
src/backend/postmaster/launch_backend.c | 2 ++
src/backend/postmaster/postmaster.c | 5 +++++
src/backend/storage/ipc/ipci.c | 21 +++++++++++++++++
src/backend/tcop/postgres.c | 3 +++
src/include/storage/ipc.h | 1 +
src/include/storage/subsystemlist.h | 23 +++++++++++++++++++
src/include/storage/subsystems.h | 30 +++++++++++++++++++++++++
src/tools/pginclude/headerscheck | 1 +
9 files changed, 88 insertions(+)
create mode 100644 src/include/storage/subsystemlist.h
create mode 100644 src/include/storage/subsystems.h
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index c707ccfa563..49d88a1b6dd 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -363,6 +363,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
SetProcessingMode(BootstrapProcessing);
IgnoreSystemIndexes = true;
+ RegisterBuiltinShmemCallbacks();
+
InitializeMaxBackends();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 0973010b7dc..ed0f4f2d234 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -664,6 +664,8 @@ SubPostmasterMain(int argc, char *argv[])
*/
LocalProcessControlFile(false);
+ RegisterBuiltinShmemCallbacks();
+
/*
* Reload any libraries that were preloaded by the postmaster. Since we
* exec'd this process, those libraries didn't come along with us; but we
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 693475014fe..b2010bce186 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -922,6 +922,11 @@ PostmasterMain(int argc, char *argv[])
*/
ApplyLauncherRegister();
+ /*
+ * Register the shared memory needs of all core subsystems.
+ */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 24422a80ab3..e4a6a52f12d 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -52,6 +52,7 @@
#include "storage/procsignal.h"
#include "storage/shmem_internal.h"
#include "storage/sinvaladt.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
#include "utils/wait_event.h"
@@ -252,6 +253,26 @@ CreateSharedMemoryAndSemaphores(void)
shmem_startup_hook();
}
+/*
+ * Early initialization of various subsystems, giving them a chance to
+ * register their shared memory needs before the shared memory segment is
+ * allocated.
+ */
+void
+RegisterBuiltinShmemCallbacks(void)
+{
+ /*
+ * Call RegisterShmemCallbacks(...) on each subsystem listed in
+ * subsystemlist.h
+ */
+#define PG_SHMEM_SUBSYSTEM(subsystem_callbacks) \
+ RegisterShmemCallbacks(&(subsystem_callbacks));
+
+#include "storage/subsystemlist.h"
+
+#undef PG_SHMEM_SUBSYSTEM
+}
+
/*
* Initialize various subsystems, setting up their data structures in
* shared memory.
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 93851269e43..6a9ff3ad225 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4138,6 +4138,9 @@ PostgresSingleUserMain(int argc, char *argv[],
/* read control file (error checking and contains config ) */
LocalProcessControlFile(false);
+ /* Register the shared memory needs of all core subsystems. */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index da32787ab51..b205b00e7a1 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
/* ipci.c */
extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
+extern void RegisterBuiltinShmemCallbacks(void);
extern Size CalculateShmemSize(void);
extern void CreateSharedMemoryAndSemaphores(void);
#ifdef EXEC_BACKEND
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
new file mode 100644
index 00000000000..ed43c90bcc3
--- /dev/null
+++ b/src/include/storage/subsystemlist.h
@@ -0,0 +1,23 @@
+/*---------------------------------------------------------------------------
+ * subsystemlist.h
+ *
+ * List of initialization callbacks of built-in subsystems. This is kept in
+ * its own source file for possible use by automatic tools.
+ * PG_SHMEM_SUBSYSTEM is defined in the callers depending on how the list is
+ * used.
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystemlist.h
+ *---------------------------------------------------------------------------
+ */
+
+/* there is deliberately not an #ifndef SUBSYSTEMLIST_H here */
+
+/*
+ * Note: there are some inter-dependencies between these, so the order of some
+ * of these matter.
+ */
+
+/* TODO: empty for now */
diff --git a/src/include/storage/subsystems.h b/src/include/storage/subsystems.h
new file mode 100644
index 00000000000..38b735bec67
--- /dev/null
+++ b/src/include/storage/subsystems.h
@@ -0,0 +1,30 @@
+/*-------------------------------------------------------------------------
+ *
+ * subsystems.h
+ * Provide extern declarations for all the built-in subsystem callbacks
+ *
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystems.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SUBSYSTEMS_H
+#define SUBSYSTEMS_H
+
+#include "storage/shmem.h"
+
+/*
+ * Extern declarations of all the built-in subsystem callbacks
+ *
+ * The actual list is in subsystemlist.h, so that the same list can be used
+ * for other purposes.
+ */
+#define PG_SHMEM_SUBSYSTEM(callbacks) \
+ extern const ShmemCallbacks callbacks;
+#include "storage/subsystemlist.h"
+#undef PG_SHMEM_SUBSYSTEM
+
+#endif /* SUBSYSTEMS_H */
diff --git a/src/tools/pginclude/headerscheck b/src/tools/pginclude/headerscheck
index 14c466cc237..24f7416185e 100755
--- a/src/tools/pginclude/headerscheck
+++ b/src/tools/pginclude/headerscheck
@@ -131,6 +131,7 @@ do
test "$f" = src/include/postmaster/proctypelist.h && continue
test "$f" = src/include/regex/regerrs.h && continue
test "$f" = src/include/storage/lwlocklist.h && continue
+ test "$f" = src/include/storage/subsystemlist.h && continue
test "$f" = src/include/tcop/cmdtaglist.h && continue
test "$f" = src/interfaces/ecpg/preproc/c_kwlist.h && continue
test "$f" = src/interfaces/ecpg/preproc/ecpg_kwlist.h && continue
--
2.47.3
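
The patch above expands one list twice: subsystems.h turns each entry into
an extern declaration, and RegisterBuiltinShmemCallbacks() turns each entry
into a registration call. The sketch below shows the same X-macro idea in a
single file. To keep it self-contained it uses a list macro
(DEMO_SUBSYSTEM_LIST) instead of a re-included header, and every Demo*,
Alpha* and Beta* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoCallbacks
{
	void		(*request_fn) (void *arg);
} DemoCallbacks;

/* Stand-in for the list header: one X(...) entry per subsystem. */
#define DEMO_SUBSYSTEM_LIST(X) \
	X(AlphaCallbacks) \
	X(BetaCallbacks)

/* Expansion 1: an extern declaration per entry. */
#define DEMO_DECLARE(callbacks) extern const DemoCallbacks callbacks;
DEMO_SUBSYSTEM_LIST(DEMO_DECLARE)
#undef DEMO_DECLARE

static int	alpha_requests;
static int	beta_requests;

static void alpha_request(void *arg) { (void) arg; alpha_requests++; }
static void beta_request(void *arg) { (void) arg; beta_requests++; }

const DemoCallbacks AlphaCallbacks = {.request_fn = alpha_request};
const DemoCallbacks BetaCallbacks = {.request_fn = beta_request};

#define DEMO_MAX 8
static const DemoCallbacks *registered[DEMO_MAX];
static int	num_registered;

static void
demo_register(const DemoCallbacks *callbacks)
{
	assert(num_registered < DEMO_MAX);
	registered[num_registered++] = callbacks;
}

/* Expansion 2: one registration call per entry, in list order. */
static void
demo_register_builtin(void)
{
#define DEMO_REGISTER(callbacks) demo_register(&(callbacks));
	DEMO_SUBSYSTEM_LIST(DEMO_REGISTER)
#undef DEMO_REGISTER
}
```

Because both expansions come from the same list, a subsystem cannot be
registered without also being declared, and the list order is the
registration order.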
[text/x-patch] v11-0006-Convert-lwlock.c-to-use-the-new-interface.patch (6.4K, 7-v11-0006-Convert-lwlock.c-to-use-the-new-interface.patch)
From d6dbea7e346f4febf01db204c03c28235d77576e Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:18:05 +0300
Subject: [PATCH v11 06/14] Convert lwlock.c to use the new interface
It seems like a good candidate to convert first because it needs to be
initialized before any other subsystem, but other than that it's
nothing special.
---
src/backend/storage/ipc/ipci.c | 13 ------
src/backend/storage/lmgr/lwlock.c | 71 +++++++++++++++--------------
src/include/storage/lwlock.h | 2 -
src/include/storage/subsystemlist.h | 9 +++-
4 files changed, 45 insertions(+), 50 deletions(-)
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index e4a6a52f12d..de65a9ef33c 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -121,7 +121,6 @@ CalculateShmemSize(void)
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
- size = add_size(size, LWLockShmemSize());
size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
size = add_size(size, SharedInvalShmemSize());
@@ -179,11 +178,6 @@ AttachSharedMemoryStructs(void)
*/
InitializeFastPathLocks();
- /*
- * Attach to LWLocks first. They are needed by most other subsystems.
- */
- LWLockShmemInit();
-
/* Establish pointers to all shared memory areas in this backend */
ShmemAttachRequested();
CreateOrAttachShmemStructs();
@@ -230,13 +224,6 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
- /*
- * Initialize LWLocks first, in case any of the shmem init function use
- * LWLocks. (Nothing else can be running during startup, so they don't
- * need to do any locking yet, but we nevertheless allow it.)
- */
- LWLockShmemInit();
-
/* Initialize all shmem areas */
ShmemInitRequested();
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 5cb696490d6..30b715ab051 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -84,6 +84,7 @@
#include "storage/proclist.h"
#include "storage/procnumber.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -212,6 +213,15 @@ typedef struct NamedLWLockTrancheRequest
static List *NamedLWLockTrancheRequests = NIL;
+static void LWLockShmemRequest(void *arg);
+static void LWLockShmemInit(void *arg);
+
+const ShmemCallbacks LWLockCallbacks = {
+ .request_fn = LWLockShmemRequest,
+ .init_fn = LWLockShmemInit,
+};
+
+
static void InitializeLWLocks(int numLocks);
static inline void LWLockReportWaitStart(LWLock *lock);
static inline void LWLockReportWaitEnd(void);
@@ -401,58 +411,51 @@ NumLWLocksForNamedTranches(void)
}
/*
- * Compute shmem space needed for user-defined tranches and the main LWLock
- * array.
+ * Request shmem space for user-defined tranches and the main LWLock array.
*/
-Size
-LWLockShmemSize(void)
+static void
+LWLockShmemRequest(void *arg)
{
- Size size;
int numLocks;
+ Size size;
+
+ numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
/* Space for user-defined tranches */
size = sizeof(LWLockTrancheShmemData);
-
- /* Space for the LWLock array */
- numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
size = add_size(size, mul_size(numLocks, sizeof(LWLockPadded)));
+ ShmemRequestStruct(.name = "LWLock tranches",
+ .size = size,
+ .ptr = (void **) &LWLockTranches,
+ );
- return size;
+ /* Space for the LWLock array */
+ ShmemRequestStruct(.name = "Main LWLock array",
+ .size = numLocks * sizeof(LWLockPadded),
+ .ptr = (void **) &MainLWLockArray,
+ );
}
/*
- * Allocate shmem space for user-defined tranches and the main LWLock array,
- * and initialize it.
+ * Initialize shmem space for user-defined tranches and the main LWLock array.
*/
-void
-LWLockShmemInit(void)
+static void
+LWLockShmemInit(void *arg)
{
int numLocks;
- bool found;
- LWLockTranches = (LWLockTrancheShmemData *)
- ShmemInitStruct("LWLock tranches", sizeof(LWLockTrancheShmemData), &found);
- if (!found)
- {
- /* Calculate total number of locks needed in the main array */
- LWLockTranches->num_main_array_locks =
- NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
+ numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
- /* Initialize the dynamic-allocation counter for tranches */
- LWLockTranches->num_user_defined = 0;
+ /* Remember total number of locks needed in the main array */
+ LWLockTranches->num_main_array_locks = numLocks;
- SpinLockInit(&LWLockTranches->lock);
- }
+ /* Initialize the dynamic-allocation counter for tranches */
+ LWLockTranches->num_user_defined = 0;
- /* Allocate and initialize the main array */
- numLocks = LWLockTranches->num_main_array_locks;
- MainLWLockArray = (LWLockPadded *)
- ShmemInitStruct("Main LWLock array", numLocks * sizeof(LWLockPadded), &found);
- if (!found)
- {
- /* Initialize all LWLocks */
- InitializeLWLocks(numLocks);
- }
+ SpinLockInit(&LWLockTranches->lock);
+
+ /* Allocate and initialize all LWLocks in the main array */
+ InitializeLWLocks(numLocks);
}
/*
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index 61f0dbe749a..efa5b427e9f 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -126,8 +126,6 @@ extern bool LWLockHeldByMeInMode(LWLock *lock, LWLockMode mode);
extern bool LWLockWaitForVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 oldval, uint64 *newval);
extern void LWLockUpdateVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val);
-extern Size LWLockShmemSize(void);
-extern void LWLockShmemInit(void);
extern void InitLWLockAccess(void);
extern const char *GetLWLockIdentifier(uint32 classId, uint16 eventId);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index ed43c90bcc3..f0cf01f5a85 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -20,4 +20,11 @@
* of these matter.
*/
-/* TODO: empty for now */
+/*
+ * LWLocks first, in case any of the other shmem init functions use LWLocks.
+ * (Nothing else can be running during startup, so they don't need to do any
+ * locking yet, but we nevertheless allow it.)
+ */
+PG_SHMEM_SUBSYSTEM(LWLockCallbacks)
+
+/* TODO: nothing else for now */
--
2.47.3
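
The conversion above relies on two properties of the callback scheme:
request callbacks only describe what they need, and init callbacks then run
in list order, so the subsystem listed first is ready before the others
initialize. A toy single-process model of that flow is sketched below. It is
an illustration under stated assumptions, not the shmem.c implementation:
all toy_* names are hypothetical, calloc() stands in for the shared segment,
and there is no attach phase.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX 8
#define TOY_ALIGN(sz) \
	(((sz) + _Alignof(max_align_t) - 1) & ~(_Alignof(max_align_t) - 1))

typedef struct ToyCallbacks
{
	void		(*request_fn) (void);
	void		(*init_fn) (void);
} ToyCallbacks;

typedef struct ToyRequest
{
	size_t		size;
	void	  **ptr;
} ToyRequest;

static const ToyCallbacks *toy_callbacks[TOY_MAX];
static int	toy_num_callbacks;
static ToyRequest toy_requests[TOY_MAX];
static int	toy_num_requests;

static void
toy_register(const ToyCallbacks *callbacks)
{
	assert(toy_num_callbacks < TOY_MAX);
	toy_callbacks[toy_num_callbacks++] = callbacks;
}

static void
toy_request_struct(size_t size, void **ptr)
{
	assert(toy_num_requests < TOY_MAX);
	toy_requests[toy_num_requests++] = (ToyRequest) {size, ptr};
}

static void *
toy_startup(void)
{
	size_t		total = 0;
	size_t		offset = 0;
	char	   *segment;

	/* Phase 1: collect requests; nothing is allocated yet. */
	for (int i = 0; i < toy_num_callbacks; i++)
		if (toy_callbacks[i]->request_fn)
			toy_callbacks[i]->request_fn();

	/* Phase 2: one zeroed allocation, then hand out the pointers. */
	for (int i = 0; i < toy_num_requests; i++)
		total += TOY_ALIGN(toy_requests[i].size);
	segment = calloc(1, total > 0 ? total : 1);
	if (segment == NULL)
		return NULL;
	for (int i = 0; i < toy_num_requests; i++)
	{
		*toy_requests[i].ptr = segment + offset;
		offset += TOY_ALIGN(toy_requests[i].size);
	}

	/* Phase 3: run init callbacks in registration order. */
	for (int i = 0; i < toy_num_callbacks; i++)
		if (toy_callbacks[i]->init_fn)
			toy_callbacks[i]->init_fn();
	return segment;
}

/* Two toy subsystems: "stats" depends on "locks" being initialized. */
typedef struct ToyLocks { int ready; } ToyLocks;
typedef struct ToyStats { int saw_locks_ready; } ToyStats;
static ToyLocks *toy_locks;
static ToyStats *toy_stats;

static void locks_request(void) { toy_request_struct(sizeof(ToyLocks), (void **) &toy_locks); }
static void locks_init(void) { toy_locks->ready = 1; }
static void stats_request(void) { toy_request_struct(sizeof(ToyStats), (void **) &toy_stats); }
static void stats_init(void) { toy_stats->saw_locks_ready = toy_locks->ready; }

static const ToyCallbacks toy_locks_callbacks = {locks_request, locks_init};
static const ToyCallbacks toy_stats_callbacks = {stats_request, stats_init};
```

Registering toy_locks_callbacks before toy_stats_callbacks plays the role of
listing the lock subsystem first: stats_init() can rely on the locks already
being set up.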
[text/x-patch] v11-0007-Use-the-new-mechanism-in-a-few-core-subsystems.patch (46.4K, 8-v11-0007-Use-the-new-mechanism-in-a-few-core-subsystems.patch)
From 5f82d2e482593244133b0a54eb52a13abb2e6591 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:21:17 +0300
Subject: [PATCH v11 07/14] Use the new mechanism in a few core subsystems
I chose these subsystems specifically because they have some
complicating properties, making them slightly harder to convert than
most:
- The initialization callbacks of some of these subsystems have
dependencies, i.e. they need to be initialized in the right order.
- The ProcGlobal pointer still needs to be inherited by the
BackendParameters mechanism on EXEC_BACKEND builds, because
ProcGlobal is required by InitProcess() to get a PGPROC entry, and
the PGPROC entry is required to use LWLocks, and usually attaching
to shared memory areas requires the use of LWLocks.
- Similarly, the ProcSignal pointer still needs to be handled by
BackendParameters, because query cancellation connections access it
without calling InitProcess.
I believe converting all the rest of the subsystems after this will
be pretty mechanical.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/backend/access/transam/twophase.c | 2 +-
src/backend/access/transam/varsup.c | 35 ++---
src/backend/port/posix_sema.c | 22 ++-
src/backend/port/sysv_sema.c | 21 ++-
src/backend/port/win32_sema.c | 11 +-
src/backend/storage/ipc/dsm.c | 64 +++++----
src/backend/storage/ipc/dsm_registry.c | 36 ++---
src/backend/storage/ipc/ipci.c | 28 ----
src/backend/storage/ipc/latch.c | 8 +-
src/backend/storage/ipc/pmsignal.c | 51 ++++---
src/backend/storage/ipc/procarray.c | 110 +++++++-------
src/backend/storage/ipc/procsignal.c | 64 ++++-----
src/backend/storage/ipc/sinvaladt.c | 38 ++---
src/backend/storage/lmgr/proc.c | 191 +++++++++++++------------
src/backend/utils/hash/dynahash.c | 3 +-
src/include/access/transam.h | 2 -
src/include/storage/dsm.h | 3 -
src/include/storage/dsm_registry.h | 2 -
src/include/storage/pg_sema.h | 6 +-
src/include/storage/pmsignal.h | 2 -
src/include/storage/proc.h | 2 -
src/include/storage/procarray.h | 2 -
src/include/storage/procsignal.h | 3 -
src/include/storage/sinvaladt.h | 2 -
src/include/storage/subsystemlist.h | 17 ++-
25 files changed, 344 insertions(+), 381 deletions(-)
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index d468c9774b3..ab1cbd67bac 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -282,7 +282,7 @@ TwoPhaseShmemInit(void)
gxacts[i].next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = &gxacts[i];
- /* associate it with a PGPROC assigned by InitProcGlobal */
+ /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
}
}
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 1441a051773..dc5e32d86f3 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -23,6 +23,7 @@
#include "postmaster/autovacuum.h"
#include "storage/pmsignal.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
@@ -30,35 +31,25 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+static void VarsupShmemRequest(void *arg);
+
/* pointer to variables struct in shared memory */
TransamVariablesData *TransamVariables = NULL;
+const ShmemCallbacks VarsupShmemCallbacks = {
+ .request_fn = VarsupShmemRequest,
+};
/*
- * Initialization of shared memory for TransamVariables.
+ * Request shared memory for TransamVariables.
*/
-Size
-VarsupShmemSize(void)
-{
- return sizeof(TransamVariablesData);
-}
-
-void
-VarsupShmemInit(void)
+static void
+VarsupShmemRequest(void *arg)
{
- bool found;
-
- /* Initialize our shared state struct */
- TransamVariables = ShmemInitStruct("TransamVariables",
- sizeof(TransamVariablesData),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- memset(TransamVariables, 0, sizeof(TransamVariablesData));
- }
- else
- Assert(found);
+ ShmemRequestStruct(.name = "TransamVariables",
+ .size = sizeof(TransamVariablesData),
+ .ptr = (void **) &TransamVariables,
+ );
}
/*
diff --git a/src/backend/port/posix_sema.c b/src/backend/port/posix_sema.c
index 40205b7d400..53e4a7a5c38 100644
--- a/src/backend/port/posix_sema.c
+++ b/src/backend/port/posix_sema.c
@@ -159,22 +159,24 @@ PosixSemaphoreKill(sem_t *sem)
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
#ifdef USE_NAMED_POSIX_SEMAPHORES
/* No shared memory needed in this case */
- return 0;
#else
/* Need a PGSemaphoreData per semaphore */
- return mul_size(maxSemas, sizeof(PGSemaphoreData));
+ ShmemRequestStruct(.name = "Semaphores",
+ .size = mul_size(maxSemas, sizeof(PGSemaphoreData)),
+ .ptr = (void **) &sharedSemas,
+ );
#endif
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* This is called during postmaster start or shared memory reinitialization.
* It should do whatever is needed to be able to support up to maxSemas
@@ -193,10 +195,9 @@ PGSemaphoreShmemSize(int maxSemas)
* we don't have to expose the counters to other processes.)
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
struct stat statbuf;
- bool found;
/*
* We use the data directory's inode number to seed the search for free
@@ -214,11 +215,6 @@ PGReserveSemaphores(int maxSemas)
mySemPointers = (sem_t **) malloc(maxSemas * sizeof(sem_t *));
if (mySemPointers == NULL)
elog(PANIC, "out of memory");
-#else
-
- sharedSemas = (PGSemaphore)
- ShmemInitStruct("Semaphores", PGSemaphoreShmemSize(maxSemas), &found);
- Assert(!found);
#endif
numSems = 0;
diff --git a/src/backend/port/sysv_sema.c b/src/backend/port/sysv_sema.c
index 4b2bf84072f..98d99515043 100644
--- a/src/backend/port/sysv_sema.c
+++ b/src/backend/port/sysv_sema.c
@@ -301,16 +301,20 @@ IpcSemaphoreCreate(int numSems)
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
- return mul_size(maxSemas, sizeof(PGSemaphoreData));
+ /* Need a PGSemaphoreData per semaphore */
+ ShmemRequestStruct(.name = "Semaphores",
+ .size = mul_size(maxSemas, sizeof(PGSemaphoreData)),
+ .ptr = (void **) &sharedSemas,
+ );
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* This is called during postmaster start or shared memory reinitialization.
* It should do whatever is needed to be able to support up to maxSemas
@@ -327,10 +331,9 @@ PGSemaphoreShmemSize(int maxSemas)
* have clobbered.)
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
struct stat statbuf;
- bool found;
/*
* We use the data directory's inode number to seed the search for free
@@ -344,10 +347,6 @@ PGReserveSemaphores(int maxSemas)
errmsg("could not stat data directory \"%s\": %m",
DataDir)));
- sharedSemas = (PGSemaphore)
- ShmemInitStruct("Semaphores", PGSemaphoreShmemSize(maxSemas), &found);
- Assert(!found);
-
numSharedSemas = 0;
maxSharedSemas = maxSemas;
diff --git a/src/backend/port/win32_sema.c b/src/backend/port/win32_sema.c
index ba97c9b2d64..a3202554769 100644
--- a/src/backend/port/win32_sema.c
+++ b/src/backend/port/win32_sema.c
@@ -25,17 +25,16 @@ static void ReleaseSemaphores(int code, Datum arg);
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
/* No shared memory needed on Windows */
- return 0;
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* In the Win32 implementation, we acquire semaphores on-demand; the
* maxSemas parameter is just used to size the array that keeps track of
@@ -44,7 +43,7 @@ PGSemaphoreShmemSize(int maxSemas)
* process exits.
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
mySemSet = (HANDLE *) malloc(maxSemas * sizeof(HANDLE));
if (mySemSet == NULL)
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 6a5b16392f7..8b69df4ff26 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -43,6 +43,7 @@
#include "storage/lwlock.h"
#include "storage/pg_shmem.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/freepage.h"
#include "utils/memutils.h"
#include "utils/resowner.h"
@@ -109,6 +110,15 @@ static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
static void *dsm_main_space_begin = NULL;
+static size_t dsm_main_space_size;
+
+static void dsm_main_space_request(void *arg);
+static void dsm_main_space_init(void *arg);
+
+const ShmemCallbacks dsm_shmem_callbacks = {
+ .request_fn = dsm_main_space_request,
+ .init_fn = dsm_main_space_init,
+};
/*
* List of dynamic shared memory segments used by this backend.
@@ -464,42 +474,40 @@ dsm_set_control_handle(dsm_handle h)
#endif
/*
- * Reserve some space in the main shared memory segment for DSM segments.
+ * Reserve space in the main shared memory segment for DSM segments.
*/
-size_t
-dsm_estimate_size(void)
+static void
+dsm_main_space_request(void *arg)
{
- return 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+ dsm_main_space_size = 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+
+ if (dsm_main_space_size == 0)
+ return;
+
+ ShmemRequestStruct(.name = "Preallocated DSM",
+ .size = dsm_main_space_size,
+ .ptr = &dsm_main_space_begin,
+ );
}
-/*
- * Initialize space in the main shared memory segment for DSM segments.
- */
-void
-dsm_shmem_init(void)
+static void
+dsm_main_space_init(void *arg)
{
- size_t size = dsm_estimate_size();
- bool found;
+ FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
+ size_t first_page = 0;
+ size_t pages;
- if (size == 0)
+ if (dsm_main_space_size == 0)
return;
- dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, &found);
- if (!found)
- {
- FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
- size_t first_page = 0;
- size_t pages;
-
- /* Reserve space for the FreePageManager. */
- while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
- ++first_page;
-
- /* Initialize it and give it all the rest of the space. */
- FreePageManagerInitialize(fpm, dsm_main_space_begin);
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ /* Reserve space for the FreePageManager. */
+ while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
+ ++first_page;
+
+ /* Initialize it and give it all the rest of the space. */
+ FreePageManagerInitialize(fpm, dsm_main_space_begin);
+ pages = (dsm_main_space_size / FPM_PAGE_SIZE) - first_page;
+ FreePageManagerPut(fpm, first_page, pages);
}
/*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 9bfcd616827..2b56977659b 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -45,6 +45,7 @@
#include "storage/dsm_registry.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/tuplestore.h"
@@ -57,6 +58,14 @@ typedef struct DSMRegistryCtxStruct
static DSMRegistryCtxStruct *DSMRegistryCtx;
+static void DSMRegistryShmemRequest(void *arg);
+static void DSMRegistryShmemInit(void *arg);
+
+const ShmemCallbacks DSMRegistryShmemCallbacks = {
+ .request_fn = DSMRegistryShmemRequest,
+ .init_fn = DSMRegistryShmemInit,
+};
+
typedef struct NamedDSMState
{
dsm_handle handle;
@@ -114,27 +123,20 @@ static const dshash_parameters dsh_params = {
static dsa_area *dsm_registry_dsa;
static dshash_table *dsm_registry_table;
-Size
-DSMRegistryShmemSize(void)
+static void
+DSMRegistryShmemRequest(void *arg)
{
- return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+ ShmemRequestStruct(.name = "DSM Registry Data",
+ .size = sizeof(DSMRegistryCtxStruct),
+ .ptr = (void **) &DSMRegistryCtx,
+ );
}
-void
-DSMRegistryShmemInit(void)
+static void
+DSMRegistryShmemInit(void *arg)
{
- bool found;
-
- DSMRegistryCtx = (DSMRegistryCtxStruct *)
- ShmemInitStruct("DSM Registry Data",
- DSMRegistryShmemSize(),
- &found);
-
- if (!found)
- {
- DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
- DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
- }
+ DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+ DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index de65a9ef33c..4f707158303 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -20,7 +20,6 @@
#include "access/nbtree.h"
#include "access/subtrans.h"
#include "access/syncscan.h"
-#include "access/transam.h"
#include "access/twophase.h"
#include "access/xlogprefetcher.h"
#include "access/xlogrecovery.h"
@@ -42,16 +41,11 @@
#include "storage/aio_subsys.h"
#include "storage/bufmgr.h"
#include "storage/dsm.h"
-#include "storage/dsm_registry.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
-#include "storage/pmsignal.h"
#include "storage/predicate.h"
#include "storage/proc.h"
-#include "storage/procarray.h"
-#include "storage/procsignal.h"
#include "storage/shmem_internal.h"
-#include "storage/sinvaladt.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
@@ -105,14 +99,10 @@ CalculateShmemSize(void)
size = add_size(size, ShmemGetRequestedSize());
/* legacy subsystems */
- size = add_size(size, dsm_estimate_size());
- size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
- size = add_size(size, ProcGlobalShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, VarsupShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
size = add_size(size, CLOGShmemSize());
@@ -121,11 +111,7 @@ CalculateShmemSize(void)
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
- size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, SharedInvalShmemSize());
- size = add_size(size, PMSignalShmemSize());
- size = add_size(size, ProcSignalShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
size = add_size(size, ReplicationSlotsShmemSize());
@@ -278,13 +264,9 @@ RegisterBuiltinShmemCallbacks(void)
static void
CreateOrAttachShmemStructs(void)
{
- dsm_shmem_init();
- DSMRegistryShmemInit();
-
/*
* Set up xlog, clog, and buffers
*/
- VarsupShmemInit();
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
@@ -307,23 +289,13 @@ CreateOrAttachShmemStructs(void)
/*
* Set up process table
*/
- if (!IsUnderPostmaster)
- InitProcGlobal();
- ProcArrayShmemInit();
BackendStatusShmemInit();
TwoPhaseShmemInit();
BackgroundWorkerShmemInit();
- /*
- * Set up shared-inval messaging
- */
- SharedInvalShmemInit();
-
/*
* Set up interprocess signaling mechanisms
*/
- PMSignalShmemInit();
- ProcSignalShmemInit();
CheckpointerShmemInit();
AutoVacuumShmemInit();
ReplicationSlotsShmemInit();
diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 8537e9fef2d..7d4f4cf32bb 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -80,10 +80,10 @@ InitLatch(Latch *latch)
* current process.
*
* InitSharedLatch needs to be called in postmaster before forking child
- * processes, usually right after allocating the shared memory block
- * containing the latch with ShmemInitStruct. (The Unix implementation
- * doesn't actually require that, but the Windows one does.) Because of
- * this restriction, we have no concurrency issues to worry about here.
+ * processes, usually right after initializing the shared memory block
+ * containing the latch. (The Unix implementation doesn't actually require
+ * that, but the Windows one does.) Because of this restriction, we have no
+ * concurrency issues to worry about here.
*
* Note that other handles created in this module are never marked as
* inheritable. Thus we do not need to worry about cleaning up child
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 4618820b337..bdad5fdd043 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -27,6 +27,7 @@
#include "storage/ipc.h"
#include "storage/pmsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
@@ -83,6 +84,14 @@ struct PMSignalData
/* PMSignalState pointer is valid in both postmaster and child processes */
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+static void PMSignalShmemRequest(void *);
+static void PMSignalShmemInit(void *);
+
+const ShmemCallbacks PMSignalShmemCallbacks = {
+ .request_fn = PMSignalShmemRequest,
+ .init_fn = PMSignalShmemInit,
+};
+
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
* postmaster. Postmaster keeps a local copy so that it doesn't need to
@@ -123,39 +132,29 @@ postmaster_death_handler(SIGNAL_ARGS)
static void MarkPostmasterChildInactive(int code, Datum arg);
/*
- * PMSignalShmemSize
- * Compute space needed for pmsignal.c's shared memory
+ * PMSignalShmemRequest - Register pmsignal.c's shared memory needs
*/
-Size
-PMSignalShmemSize(void)
+static void
+PMSignalShmemRequest(void *arg)
{
- Size size;
+ size_t size;
- size = offsetof(PMSignalData, PMChildFlags);
- size = add_size(size, mul_size(MaxLivePostmasterChildren(),
- sizeof(sig_atomic_t)));
+ num_child_flags = MaxLivePostmasterChildren();
- return size;
+ size = add_size(offsetof(PMSignalData, PMChildFlags),
+ mul_size(num_child_flags, sizeof(sig_atomic_t)));
+ ShmemRequestStruct(.name = "PMSignalState",
+ .size = size,
+ .ptr = (void **) &PMSignalState,
+ );
}
-/*
- * PMSignalShmemInit - initialize during shared-memory creation
- */
-void
-PMSignalShmemInit(void)
+static void
+PMSignalShmemInit(void *arg)
{
- bool found;
-
- PMSignalState = (PMSignalData *)
- ShmemInitStruct("PMSignalState", PMSignalShmemSize(), &found);
-
- if (!found)
- {
- /* initialize all flags to zeroes */
- MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemSize());
- num_child_flags = MaxLivePostmasterChildren();
- PMSignalState->num_child_flags = num_child_flags;
- }
+ Assert(PMSignalState);
+ Assert(num_child_flags > 0);
+ PMSignalState->num_child_flags = num_child_flags;
}
/*
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index cc207cb56e3..f540bb6b23f 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -61,6 +61,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
@@ -103,6 +104,18 @@ typedef struct ProcArrayStruct
int pgprocnos[FLEXIBLE_ARRAY_MEMBER];
} ProcArrayStruct;
+static void ProcArrayShmemRequest(void *arg);
+static void ProcArrayShmemInit(void *arg);
+static void ProcArrayShmemAttach(void *arg);
+
+static ProcArrayStruct *procArray;
+
+const struct ShmemCallbacks ProcArrayShmemCallbacks = {
+ .request_fn = ProcArrayShmemRequest,
+ .init_fn = ProcArrayShmemInit,
+ .attach_fn = ProcArrayShmemAttach,
+};
+
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -269,9 +282,6 @@ typedef enum KAXCompressReason
KAX_STARTUP_PROCESS_IDLE, /* startup process is about to sleep */
} KAXCompressReason;
-
-static ProcArrayStruct *procArray;
-
static PGPROC *allProcs;
/*
@@ -282,8 +292,11 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
+
static TransactionId *KnownAssignedXids;
+
static bool *KnownAssignedXidsValid;
+
static TransactionId latestObservedXid = InvalidTransactionId;
/*
@@ -374,19 +387,13 @@ static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
static void GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons);
/*
- * Report shared-memory space needed by ProcArrayShmemInit
+ * Register the shared PGPROC array during postmaster startup.
*/
-Size
-ProcArrayShmemSize(void)
+static void
+ProcArrayShmemRequest(void *arg)
{
- Size size;
-
- /* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, pgprocnos);
- size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
-
/*
* During Hot Standby processing we have a data structure called
* KnownAssignedXids, created in shared memory. Local data structures are
@@ -405,64 +412,49 @@ ProcArrayShmemSize(void)
if (EnableHotStandby)
{
- size = add_size(size,
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS));
- size = add_size(size,
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS));
+ ShmemRequestStruct(.name = "KnownAssignedXids",
+ .size = mul_size(sizeof(TransactionId), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXids,
+ );
+
+ ShmemRequestStruct(.name = "KnownAssignedXidsValid",
+ .size = mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXidsValid,
+ );
}
- return size;
+ /* Register the ProcArray shared structure */
+ ShmemRequestStruct(.name = "Proc Array",
+ .size = add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int), PROCARRAY_MAXPROCS)),
+ .ptr = (void **) &procArray,
+ );
}
/*
* Initialize the shared PGPROC array during postmaster startup.
*/
-void
-ProcArrayShmemInit(void)
+static void
+ProcArrayShmemInit(void *arg)
{
- bool found;
-
- /* Create or attach to the ProcArray shared structure */
- procArray = (ProcArrayStruct *)
- ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, pgprocnos),
- mul_size(sizeof(int),
- PROCARRAY_MAXPROCS)),
- &found);
-
- if (!found)
- {
- /*
- * We're the first - initialize.
- */
- procArray->numProcs = 0;
- procArray->maxProcs = PROCARRAY_MAXPROCS;
- procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
- procArray->numKnownAssignedXids = 0;
- procArray->tailKnownAssignedXids = 0;
- procArray->headKnownAssignedXids = 0;
- procArray->lastOverflowedXid = InvalidTransactionId;
- procArray->replication_slot_xmin = InvalidTransactionId;
- procArray->replication_slot_catalog_xmin = InvalidTransactionId;
- TransamVariables->xactCompletionCount = 1;
- }
+ procArray->numProcs = 0;
+ procArray->maxProcs = PROCARRAY_MAXPROCS;
+ procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
+ procArray->numKnownAssignedXids = 0;
+ procArray->tailKnownAssignedXids = 0;
+ procArray->headKnownAssignedXids = 0;
+ procArray->lastOverflowedXid = InvalidTransactionId;
+ procArray->replication_slot_xmin = InvalidTransactionId;
+ procArray->replication_slot_catalog_xmin = InvalidTransactionId;
+ TransamVariables->xactCompletionCount = 1;
allProcs = ProcGlobal->allProcs;
+}
- /* Create or attach to the KnownAssignedXids arrays too, if needed */
- if (EnableHotStandby)
- {
- KnownAssignedXids = (TransactionId *)
- ShmemInitStruct("KnownAssignedXids",
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- KnownAssignedXidsValid = (bool *)
- ShmemInitStruct("KnownAssignedXidsValid",
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- }
+static void
+ProcArrayShmemAttach(void *arg)
+{
+ allProcs = ProcGlobal->allProcs;
}
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index f1ab3aa3fe0..adebf0e7898 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -33,6 +33,7 @@
#include "storage/shmem.h"
#include "storage/sinval.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -106,7 +107,16 @@ struct ProcSignalHeader
#define BARRIER_CLEAR_BIT(flags, type) \
((flags) &= ~(((uint32) 1) << (uint32) (type)))
+static void ProcSignalShmemRequest(void *arg);
+static void ProcSignalShmemInit(void *arg);
+
+const ShmemCallbacks ProcSignalShmemCallbacks = {
+ .request_fn = ProcSignalShmemRequest,
+ .init_fn = ProcSignalShmemInit,
+};
+
NON_EXEC_STATIC ProcSignalHeader *ProcSignal = NULL;
+
static ProcSignalSlot *MyProcSignalSlot = NULL;
static bool CheckProcSignal(ProcSignalReason reason);
@@ -114,51 +124,39 @@ static void CleanupProcSignalState(int status, Datum arg);
static void ResetProcSignalBarrierBits(uint32 flags);
/*
- * ProcSignalShmemSize
- * Compute space needed for ProcSignal's shared memory
+ * ProcSignalShmemRequest
+ * Register ProcSignal's shared memory needs at postmaster startup
*/
-Size
-ProcSignalShmemSize(void)
+static void
+ProcSignalShmemRequest(void *arg)
{
Size size;
size = mul_size(NumProcSignalSlots, sizeof(ProcSignalSlot));
size = add_size(size, offsetof(ProcSignalHeader, psh_slot));
- return size;
+
+ ShmemRequestStruct(.name = "ProcSignal",
+ .size = size,
+ .ptr = (void **) &ProcSignal,
+ );
}
-/*
- * ProcSignalShmemInit
- * Allocate and initialize ProcSignal's shared memory
- */
-void
-ProcSignalShmemInit(void)
+static void
+ProcSignalShmemInit(void *arg)
{
- Size size = ProcSignalShmemSize();
- bool found;
+ pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
- ProcSignal = (ProcSignalHeader *)
- ShmemInitStruct("ProcSignal", size, &found);
-
- /* If we're first, initialize. */
- if (!found)
+ for (int i = 0; i < NumProcSignalSlots; ++i)
{
- int i;
-
- pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
+ ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
- for (i = 0; i < NumProcSignalSlots; ++i)
- {
- ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
-
- SpinLockInit(&slot->pss_mutex);
- pg_atomic_init_u32(&slot->pss_pid, 0);
- slot->pss_cancel_key_len = 0;
- MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
- pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
- pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
- ConditionVariableInit(&slot->pss_barrierCV);
- }
+ SpinLockInit(&slot->pss_mutex);
+ pg_atomic_init_u32(&slot->pss_pid, 0);
+ slot->pss_cancel_key_len = 0;
+ MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
+ pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
+ pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
+ ConditionVariableInit(&slot->pss_barrierCV);
}
}
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index a7a7cc4f0a9..37a21ffaf1a 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -25,6 +25,7 @@
#include "storage/shmem.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
/*
* Conceptually, the shared cache invalidation messages are stored in an
@@ -205,6 +206,14 @@ typedef struct SISeg
static SISeg *shmInvalBuffer; /* pointer to the shared inval buffer */
+static void SharedInvalShmemRequest(void *arg);
+static void SharedInvalShmemInit(void *arg);
+
+const ShmemCallbacks SharedInvalShmemCallbacks = {
+ .request_fn = SharedInvalShmemRequest,
+ .init_fn = SharedInvalShmemInit,
+};
+
static LocalTransactionId nextLocalTransactionId;
@@ -212,10 +221,11 @@ static void CleanupInvalidationState(int status, Datum arg);
/*
- * SharedInvalShmemSize --- return shared-memory space needed
+ * SharedInvalShmemRequest
+ * Register shared memory needs for the SI message buffer
*/
-Size
-SharedInvalShmemSize(void)
+static void
+SharedInvalShmemRequest(void *arg)
{
Size size;
@@ -223,26 +233,18 @@ SharedInvalShmemSize(void)
size = add_size(size, mul_size(sizeof(ProcState), NumProcStateSlots)); /* procState */
size = add_size(size, mul_size(sizeof(int), NumProcStateSlots)); /* pgprocnos */
- return size;
+ ShmemRequestStruct(.name = "shmInvalBuffer",
+ .size = size,
+ .ptr = (void **) &shmInvalBuffer,
+ );
}
-/*
- * SharedInvalShmemInit
- * Create and initialize the SI message buffer
- */
-void
-SharedInvalShmemInit(void)
+static void
+SharedInvalShmemInit(void *arg)
{
int i;
- bool found;
-
- /* Allocate space in shared memory */
- shmInvalBuffer = (SISeg *)
- ShmemInitStruct("shmInvalBuffer", SharedInvalShmemSize(), &found);
- if (found)
- return;
- /* Clear message counters, save size of procState array, init spinlock */
+ /* Clear message counters, init spinlock */
shmInvalBuffer->minMsgNum = 0;
shmInvalBuffer->maxMsgNum = 0;
shmInvalBuffer->nextThreshold = CLEANUP_MIN;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 9b880a6af65..a05c55b534e 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -52,6 +52,7 @@
#include "storage/procsignal.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
@@ -70,9 +71,23 @@ PGPROC *MyProc = NULL;
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
+static void *AllProcsShmemPtr;
+static void *FastPathLockArrayShmemPtr;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
PGPROC *PreparedXactProcs = NULL;
+static void ProcGlobalShmemRequest(void *arg);
+static void ProcGlobalShmemInit(void *arg);
+
+const ShmemCallbacks ProcGlobalShmemCallbacks = {
+ .request_fn = ProcGlobalShmemRequest,
+ .init_fn = ProcGlobalShmemInit,
+};
+
+static uint32 TotalProcs;
+static size_t ProcGlobalAllProcsShmemSize;
+static size_t FastPathLockArrayShmemSize;
+
/* Is a deadlock check pending? */
static volatile sig_atomic_t got_deadlock_timeout;
@@ -83,32 +98,12 @@ static DeadLockState CheckDeadLock(void);
/*
- * Report shared-memory space needed by PGPROC.
+ * Calculate shared-memory space needed by Fast-Path locks.
*/
static Size
-PGProcShmemSize(void)
+CalculateFastPathLockShmemSize(void)
{
Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
-
- size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
-
- return size;
-}
-
-/*
- * Report shared-memory space needed by Fast-Path locks.
- */
-static Size
-FastPathLockShmemSize(void)
-{
- Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
Size fpLockBitsSize,
fpRelIdSize;
@@ -128,26 +123,7 @@ FastPathLockShmemSize(void)
}
/*
- * Report shared-memory space needed by InitProcGlobal.
- */
-Size
-ProcGlobalShmemSize(void)
-{
- Size size = 0;
-
- /* ProcGlobal */
- size = add_size(size, sizeof(PROC_HDR));
- size = add_size(size, sizeof(slock_t));
-
- size = add_size(size, PGSemaphoreShmemSize(ProcGlobalSemas()));
- size = add_size(size, PGProcShmemSize());
- size = add_size(size, FastPathLockShmemSize());
-
- return size;
-}
-
-/*
- * Report number of semaphores needed by InitProcGlobal.
+ * Report number of semaphores needed by ProcGlobalShmemInit.
*/
int
ProcGlobalSemas(void)
@@ -160,7 +136,67 @@ ProcGlobalSemas(void)
}
/*
- * InitProcGlobal -
+ * ProcGlobalShmemRequest
+ * Register shared memory needs.
+ *
+ * This is called during postmaster or standalone backend startup, and also
+ * during backend startup in EXEC_BACKEND mode.
+ */
+static void
+ProcGlobalShmemRequest(void *arg)
+{
+ Size size;
+
+ /*
+ * Reserve all the PGPROC structures we'll need. There are six separate
+ * consumers: (1) normal backends, (2) autovacuum workers and special
+ * workers, (3) background workers, (4) walsenders, (5) auxiliary
+ * processes, and (6) prepared transactions. (For largely-historical
+ * reasons, we combine autovacuum and special workers into one category
+ * with a single freelist.) Each PGPROC structure is dedicated to exactly
+ * one of these purposes, and they do not move between groups.
+ */
+ TotalProcs =
+ add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
+
+ size = 0;
+ size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
+ ProcGlobalAllProcsShmemSize = size;
+ ShmemRequestStruct(.name = "PGPROC structures",
+ .size = ProcGlobalAllProcsShmemSize,
+ .ptr = &AllProcsShmemPtr,
+ );
+
+ if (!IsUnderPostmaster)
+ size = FastPathLockArrayShmemSize = CalculateFastPathLockShmemSize();
+ else
+ size = SHMEM_ATTACH_UNKNOWN_SIZE;
+ ShmemRequestStruct(.name = "Fast-Path Lock Array",
+ .size = size,
+ .ptr = &FastPathLockArrayShmemPtr,
+ );
+
+ /*
+ * ProcGlobal is registered here in .ptr as usual, but it needs to be
+ * propagated specially in EXEC_BACKEND mode, because ProcGlobal needs to
+ * be accessed early at backend startup, before ShmemAttachRequested() has
+ * been called.
+ */
+ ShmemRequestStruct(.name = "Proc Header",
+ .size = sizeof(PROC_HDR),
+ .ptr = (void **) &ProcGlobal,
+ );
+
+ /* Let the semaphore implementation register its shared memory needs */
+ PGSemaphoreShmemRequest(ProcGlobalSemas());
+}
+
+
+/*
+ * ProcGlobalShmemInit -
* Initialize the global process table during postmaster or standalone
* backend startup.
*
@@ -179,36 +215,23 @@ ProcGlobalSemas(void)
* Another reason for creating semaphores here is that the semaphore
* implementation typically requires us to create semaphores in the
* postmaster, not in backends.
- *
- * Note: this is NOT called by individual backends under a postmaster,
- * not even in the EXEC_BACKEND case. The ProcGlobal and AuxiliaryProcs
- * pointers must be propagated specially for EXEC_BACKEND operation.
*/
-void
-InitProcGlobal(void)
+static void
+ProcGlobalShmemInit(void *arg)
{
+ char *ptr;
+ size_t requestSize;
PGPROC *procs;
int i,
j;
- bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
/* Used for setup of per-backend fast-path slots. */
char *fpPtr,
*fpEndPtr PG_USED_FOR_ASSERTS_ONLY;
Size fpLockBitsSize,
fpRelIdSize;
- Size requestSize;
- char *ptr;
- /* Create the ProcGlobal shared structure */
- ProcGlobal = (PROC_HDR *)
- ShmemInitStruct("Proc Header", sizeof(PROC_HDR), &found);
- Assert(!found);
-
- /*
- * Initialize the data structures.
- */
+ Assert(ProcGlobal);
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
SpinLockInit(&ProcGlobal->freeProcsLock);
dlist_init(&ProcGlobal->freeProcs);
@@ -221,23 +244,11 @@ InitProcGlobal(void)
pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PROC_NUMBER);
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
- /*
- * Create and initialize all the PGPROC structures we'll need. There are
- * six separate consumers: (1) normal backends, (2) autovacuum workers and
- * special workers, (3) background workers, (4) walsenders, (5) auxiliary
- * processes, and (6) prepared transactions. (For largely-historical
- * reasons, we combine autovacuum and special workers into one category
- * with a single freelist.) Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
- */
- requestSize = PGProcShmemSize();
-
- ptr = ShmemInitStruct("PGPROC structures",
- requestSize,
- &found);
-
+ ptr = AllProcsShmemPtr;
+ requestSize = ProcGlobalAllProcsShmemSize;
MemSet(ptr, 0, requestSize);
+ /* Carve out the allProcs array from the shared memory area */
procs = (PGPROC *) ptr;
ptr = ptr + TotalProcs * sizeof(PGPROC);
@@ -246,7 +257,7 @@ InitProcGlobal(void)
ProcGlobal->allProcCount = MaxBackends + NUM_AUXILIARY_PROCS;
/*
- * Allocate arrays mirroring PGPROC fields in a dense manner. See
+ * Carve out arrays mirroring PGPROC fields in a dense manner. See
* PROC_HDR.
*
* XXX: It might make sense to increase padding for these arrays, given
@@ -261,30 +272,26 @@ InitProcGlobal(void)
ProcGlobal->statusFlags = (uint8 *) ptr;
ptr = ptr + (TotalProcs * sizeof(*ProcGlobal->statusFlags));
- /* make sure wer didn't overflow */
+ /* make sure we didn't overflow */
Assert((ptr > (char *) procs) && (ptr <= (char *) procs + requestSize));
/*
- * Allocate arrays for fast-path locks. Those are variable-length, so
+ * Initialize arrays for fast-path locks. Those are variable-length, so
* can't be included in PGPROC directly. We allocate a separate piece of
* shared memory and then divide that between backends.
*/
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- requestSize = FastPathLockShmemSize();
-
- fpPtr = ShmemInitStruct("Fast-Path Lock Array",
- requestSize,
- &found);
-
- MemSet(fpPtr, 0, requestSize);
+ fpPtr = FastPathLockArrayShmemPtr;
+ requestSize = FastPathLockArrayShmemSize;
+ memset(fpPtr, 0, requestSize);
/* For asserts checking we did not overflow. */
fpEndPtr = fpPtr + requestSize;
- /* Reserve space for semaphores. */
- PGReserveSemaphores(ProcGlobalSemas());
+ /* Initialize semaphores */
+ PGSemaphoreInit(ProcGlobalSemas());
for (i = 0; i < TotalProcs; i++)
{
@@ -405,7 +412,7 @@ InitProcess(void)
/*
* Decide which list should supply our PGPROC. This logic must match the
- * way the freelists were constructed in InitProcGlobal().
+ * way the freelists were constructed in ProcGlobalShmemInit().
*/
if (AmAutoVacuumWorkerProcess() || AmSpecialWorkerProcess())
procgloballist = &ProcGlobal->autovacFreeProcs;
@@ -460,7 +467,7 @@ InitProcess(void)
/*
* Initialize all fields of MyProc, except for those previously
- * initialized by InitProcGlobal.
+ * initialized by ProcGlobalShmemInit.
*/
dlist_node_init(&MyProc->freeProcsLink);
MyProc->waitStatus = PROC_WAIT_STATUS_OK;
@@ -593,7 +600,7 @@ InitProcessPhase2(void)
* This is called by bgwriter and similar processes so that they will have a
* MyProc value that's real enough to let them wait for LWLocks. The PGPROC
* and sema that are assigned are one of the extra ones created during
- * InitProcGlobal.
+ * ProcGlobalShmemInit.
*
* Auxiliary processes are presently not expected to wait for real (lockmgr)
* locks, so we need not set up the deadlock checker. They are never added
@@ -662,7 +669,7 @@ InitAuxiliaryProcess(void)
/*
* Initialize all fields of MyProc, except for those previously
- * initialized by InitProcGlobal.
+ * initialized by ProcGlobalShmemInit.
*/
dlist_node_init(&MyProc->freeProcsLink);
MyProc->waitStatus = PROC_WAIT_STATUS_OK;
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index d49a7a92c64..81199edca86 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -338,7 +338,8 @@ string_compare(const char *key1, const char *key2, Size keysize)
* under info->hcxt rather than under TopMemoryContext; the default
* behavior is only suitable for session-lifespan hash tables.
* Other flags bits are special-purpose and seldom used, except for those
- * associated with shared-memory hash tables, for which see ShmemInitHash().
+ * associated with shared-memory hash tables, for which see
+ * ShmemRequestHash().
*
* Fields in *info are read only when the associated flags bit is set.
* It is not necessary to initialize other fields of *info.
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6fa91bfcdc0..55a4ab26b34 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -345,8 +345,6 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
-extern Size VarsupShmemSize(void);
-extern void VarsupShmemInit(void);
extern FullTransactionId GetNewTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
diff --git a/src/include/storage/dsm.h b/src/include/storage/dsm.h
index 407657df3ff..1bde71b4406 100644
--- a/src/include/storage/dsm.h
+++ b/src/include/storage/dsm.h
@@ -26,9 +26,6 @@ extern void dsm_postmaster_startup(PGShmemHeader *);
extern void dsm_backend_shutdown(void);
extern void dsm_detach_all(void);
-extern size_t dsm_estimate_size(void);
-extern void dsm_shmem_init(void);
-
#ifdef EXEC_BACKEND
extern void dsm_set_control_handle(dsm_handle h);
#endif
diff --git a/src/include/storage/dsm_registry.h b/src/include/storage/dsm_registry.h
index 506fae2c9ca..a2269c89f01 100644
--- a/src/include/storage/dsm_registry.h
+++ b/src/include/storage/dsm_registry.h
@@ -22,7 +22,5 @@ extern dsa_area *GetNamedDSA(const char *name, bool *found);
extern dshash_table *GetNamedDSHash(const char *name,
const dshash_parameters *params,
bool *found);
-extern Size DSMRegistryShmemSize(void);
-extern void DSMRegistryShmemInit(void);
#endif /* DSM_REGISTRY_H */
diff --git a/src/include/storage/pg_sema.h b/src/include/storage/pg_sema.h
index 66facc6907a..fe50ee505ba 100644
--- a/src/include/storage/pg_sema.h
+++ b/src/include/storage/pg_sema.h
@@ -37,11 +37,11 @@ typedef HANDLE PGSemaphore;
#endif
-/* Report amount of shared memory needed */
-extern Size PGSemaphoreShmemSize(int maxSemas);
+/* Request shared memory needed for semaphores */
+extern void PGSemaphoreShmemRequest(int maxSemas);
/* Module initialization (called during postmaster start or shmem reinit) */
-extern void PGReserveSemaphores(int maxSemas);
+extern void PGSemaphoreInit(int maxSemas);
/* Allocate a PGSemaphore structure with initial count 1 */
extern PGSemaphore PGSemaphoreCreate(void);
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index 206fb78f8a5..001e6eea61c 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -66,8 +66,6 @@ extern PGDLLIMPORT volatile PMSignalData *PMSignalState;
/*
* prototypes for functions in pmsignal.c
*/
-extern Size PMSignalShmemSize(void);
-extern void PMSignalShmemInit(void);
extern void SendPostmasterSignal(PMSignalReason reason);
extern bool CheckPostmasterSignal(PMSignalReason reason);
extern void SetQuitSignalReason(QuitSignalReason reason);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 22822fc68d7..3e1d1fad5f9 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -552,8 +552,6 @@ extern PGDLLIMPORT PGPROC *AuxiliaryProcs;
* Function Prototypes
*/
extern int ProcGlobalSemas(void);
-extern Size ProcGlobalShmemSize(void);
-extern void InitProcGlobal(void);
extern void InitProcess(void);
extern void InitProcessPhase2(void);
extern void InitAuxiliaryProcess(void);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index abdf021e66e..d718a5b542f 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -19,8 +19,6 @@
#include "utils/snapshot.h"
-extern Size ProcArrayShmemSize(void);
-extern void ProcArrayShmemInit(void);
extern void ProcArrayAdd(PGPROC *proc);
extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index cc4f26aa33d..7f855971b5a 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -67,9 +67,6 @@ typedef enum
/*
* prototypes for functions in procsignal.c
*/
-extern Size ProcSignalShmemSize(void);
-extern void ProcSignalShmemInit(void);
-
extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
ProcNumber procNumber);
diff --git a/src/include/storage/sinvaladt.h b/src/include/storage/sinvaladt.h
index 122dbcdf19f..208ea9d051e 100644
--- a/src/include/storage/sinvaladt.h
+++ b/src/include/storage/sinvaladt.h
@@ -27,8 +27,6 @@
/*
* prototypes for functions in sinvaladt.c
*/
-extern Size SharedInvalShmemSize(void);
-extern void SharedInvalShmemInit(void);
extern void SharedInvalBackendInit(bool sendOnly);
extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index f0cf01f5a85..d62c29f1361 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -27,4 +27,19 @@
*/
PG_SHMEM_SUBSYSTEM(LWLockCallbacks)
-/* TODO: nothing else for now */
+PG_SHMEM_SUBSYSTEM(dsm_shmem_callbacks)
+PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
+
+/* xlog, clog, and buffers */
+PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+
+/* process table */
+PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+
+/* shared-inval messaging */
+PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
+
+/* interprocess signaling mechanisms */
+PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
--
2.47.3
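A note on the subsystemlist.h hunk above: the PG_SHMEM_SUBSYSTEM(...) lines read like an "X-macro" list, where a consumer defines the macro and then expands the list to build a table of callback structs. How the real header is consumed is not shown in this part of the thread, so the following is only a minimal sketch of that general technique. Every Demo*/demo_* name is hypothetical, and the list lives in a macro here rather than in a separate header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback struct, loosely modeled on the .request_fn/.init_fn
 * initializers visible in the patch; not the real ShmemCallbacks. */
typedef struct DemoShmemCallbacks
{
	void		(*request_fn) (void *arg);
	void		(*init_fn) (void *arg);
} DemoShmemCallbacks;

static int	demo_request_calls = 0;
static int	demo_init_calls = 0;

static void
demo_request(void *arg)
{
	(void) arg;
	demo_request_calls++;
}

static void
demo_init(void *arg)
{
	(void) arg;
	demo_init_calls++;
}

/* Two made-up subsystems; the second one has no init callback. */
static const DemoShmemCallbacks DemoProcCallbacks = {
	.request_fn = demo_request,
	.init_fn = demo_init,
};
static const DemoShmemCallbacks DemoSignalCallbacks = {
	.request_fn = demo_request,
};

/* The list itself: one DEMO_SUBSYSTEM(...) entry per subsystem. */
#define DEMO_SUBSYSTEM_LIST \
	DEMO_SUBSYSTEM(DemoProcCallbacks) \
	DEMO_SUBSYSTEM(DemoSignalCallbacks)

/* Expand the list into an array of pointers to the callback structs. */
#define DEMO_SUBSYSTEM(cb) &cb,
static const DemoShmemCallbacks *const demo_subsystems[] = {
	DEMO_SUBSYSTEM_LIST
};
#undef DEMO_SUBSYSTEM

#define DEMO_NUM_SUBSYSTEMS \
	(sizeof(demo_subsystems) / sizeof(demo_subsystems[0]))

/* Run all request callbacks first, then all init callbacks, in list order. */
static void
demo_run_all(void)
{
	for (size_t i = 0; i < DEMO_NUM_SUBSYSTEMS; i++)
	{
		if (demo_subsystems[i]->request_fn)
			demo_subsystems[i]->request_fn(NULL);
	}
	for (size_t i = 0; i < DEMO_NUM_SUBSYSTEMS; i++)
	{
		if (demo_subsystems[i]->init_fn)
			demo_subsystems[i]->init_fn(NULL);
	}
}
```

Calling demo_run_all() once invokes both request callbacks and the single init callback. With this layout, adding a subsystem means adding one line to the list, which seems to be the appeal of the approach in the patch as well.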
[text/x-patch] v11-0008-refactor-predicate.c-inline-SerialInit-to-the-ca.patch (3.6K, 9-v11-0008-refactor-predicate.c-inline-SerialInit-to-the-ca.patch)
From f67c99da6f12d031fee851f1f3db81f71389eaeb Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 19 Mar 2026 17:21:30 +0200
Subject: [PATCH v11 08/14] refactor predicate.c: inline SerialInit to the
caller
The ShmemInit function is very complicated currently. These
refactorings move it in a direction that is more natural with the new
shmem callbacks.
---
src/backend/storage/lmgr/predicate.c | 73 +++++++++++-----------------
1 file changed, 29 insertions(+), 44 deletions(-)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index e003fa5b107..13a6a4b93a6 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -444,7 +444,6 @@ static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
static int serial_errdetail_for_io_error(const void *opaque_data);
-static void SerialInit(void);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
@@ -809,48 +808,6 @@ SerialPagePrecedesLogicallyUnitTests(void)
}
#endif
-/*
- * Initialize for the tracking of old serializable committed xids.
- */
-static void
-SerialInit(void)
-{
- bool found;
-
- /*
- * Set up SLRU management of the pg_serial data.
- */
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
-#ifdef USE_ASSERT_CHECKING
- SerialPagePrecedesLogicallyUnitTests();
-#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
-
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
-
- Assert(found == IsUnderPostmaster);
- if (!found)
- {
- /*
- * Set control information to reflect empty SLRU.
- */
- LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
- serialControl->headPage = -1;
- serialControl->headXid = InvalidTransactionId;
- serialControl->tailXid = InvalidTransactionId;
- LWLockRelease(SerialControlLock);
- }
-}
-
/*
* GUC check_hook for serializable_buffers
*/
@@ -1355,7 +1312,35 @@ PredicateLockShmemInit(void)
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialInit();
+ SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
+ SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
+ SimpleLruInit(SerialSlruCtl, "serializable",
+ serializable_buffers, 0, "pg_serial",
+ LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
+ SYNC_HANDLER_NONE, false);
+#ifdef USE_ASSERT_CHECKING
+ SerialPagePrecedesLogicallyUnitTests();
+#endif
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
+
+ /*
+ * Create or attach to the SerialControl structure.
+ */
+ serialControl = (SerialControl)
+ ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
+
+ Assert(found == IsUnderPostmaster);
+ if (!found)
+ {
+ /*
+ * Set control information to reflect empty SLRU.
+ */
+ LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
+ serialControl->headPage = -1;
+ serialControl->headXid = InvalidTransactionId;
+ serialControl->tailXid = InvalidTransactionId;
+ LWLockRelease(SerialControlLock);
+ }
}
/*
--
2.47.3
[text/x-patch] v11-0009-refactor-predicate.c-Move-all-the-initialization.patch (8.3K, 10-v11-0009-refactor-predicate.c-Move-all-the-initialization.patch)
From 2b3a806038c406115965ee78b89781f15ed6df6d Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 20 Mar 2026 20:27:50 +0200
Subject: [PATCH v11 09/14] refactor predicate.c: Move all the initialization
together
The ShmemInit function is very complicated currently. These
refactorings move it in a direction that is more natural with the new
shmem callbacks.
---
src/backend/storage/lmgr/predicate.c | 164 +++++++++++++--------------
1 file changed, 79 insertions(+), 85 deletions(-)
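For readers skimming the diff below: the restructuring appears to end up with "look up every structure first, return early when merely attaching, otherwise initialize everything in one place". The following is a minimal sketch of that control flow only, assuming a single static struct as a stand-in for shared memory. DemoShared, demo_segment and demo_shmem_init are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical shared structure; a stand-in for things like SerialControl. */
typedef struct DemoShared
{
	int			head_page;
	int			init_count;
} DemoShared;

static DemoShared demo_segment;		/* pretend this lives in shared memory */
static DemoShared *demo_local_ptr;	/* per-process cached pointer */

static void
demo_shmem_init(bool is_under_postmaster)
{
	/* Step 1: locate the structure. This happens in both cases. */
	DemoShared *shared = &demo_segment;

	/*
	 * Step 2: if we only attached to existing memory, set up process-local
	 * pointers and leave the shared contents untouched.
	 */
	if (is_under_postmaster)
	{
		demo_local_ptr = shared;
		return;
	}

	/* Step 3: first-time setup, all of it in one place. */
	memset(shared, 0, sizeof(*shared));
	shared->head_page = -1;		/* "empty" marker, as in the SLRU case */
	shared->init_count++;
	demo_local_ptr = shared;
}
```

Calling demo_shmem_init(false) once and then demo_shmem_init(true) leaves the shared contents from the first call intact. The early return keeps the attach path short and removes the repeated "if (!found)" blocks that the old code scattered between the lookups.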
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 13a6a4b93a6..af03071a71f 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1144,19 +1144,6 @@ PredicateLockShmemInit(void)
HASH_ELEM | HASH_BLOBS |
HASH_PARTITION | HASH_FIXED_SIZE);
- /*
- * Reserve a dummy entry in the hash table; we use it to make sure there's
- * always one entry available when we need to split or combine a page,
- * because running out of space there could mean aborting a
- * non-serializable transaction.
- */
- if (!IsUnderPostmaster)
- {
- (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
- HASH_ENTER, &found);
- Assert(!found);
- }
-
/* Pre-calculate the hash and partition lock of the scratch entry */
ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
@@ -1200,49 +1187,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, both the header and the element */
- memset(PredXact, 0, requestSize);
-
- dlist_init(&PredXact->availableList);
- dlist_init(&PredXact->activeList);
- PredXact->SxactGlobalXmin = InvalidTransactionId;
- PredXact->SxactGlobalXminCount = 0;
- PredXact->WritableSxactCount = 0;
- PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
- PredXact->CanPartialClearThrough = 0;
- PredXact->HavePartialClearedThrough = 0;
- PredXact->element
- = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_serializable_xacts; i++)
- {
- LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
- LWTRANCHE_PER_XACT_PREDICATE_LIST);
- dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
- }
- PredXact->OldCommittedSxact = CreatePredXact();
- SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
- PredXact->OldCommittedSxact->prepareSeqNo = 0;
- PredXact->OldCommittedSxact->commitSeqNo = 0;
- PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
- dlist_init(&PredXact->OldCommittedSxact->outConflicts);
- dlist_init(&PredXact->OldCommittedSxact->inConflicts);
- dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
- dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
- dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
- PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
- PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
- PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
- PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
- PredXact->OldCommittedSxact->pid = 0;
- PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
- }
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
/*
* Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
@@ -1278,23 +1222,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, including the elements */
- memset(RWConflictPool, 0, requestSize);
-
- dlist_init(&RWConflictPool->availableList);
- RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
- RWConflictPoolHeaderDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_rw_conflicts; i++)
- {
- dlist_push_tail(&RWConflictPool->availableList,
- &RWConflictPool->element[i].outLink);
- }
- }
/*
* Create or attach to the header for the list of finished serializable
@@ -1305,8 +1232,6 @@ PredicateLockShmemInit(void)
sizeof(dlist_head),
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- dlist_init(FinishedSerializableTransactions);
/*
* Initialize the SLRU storage for old committed serializable
@@ -1328,19 +1253,88 @@ PredicateLockShmemInit(void)
*/
serialControl = (SerialControl)
ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
-
Assert(found == IsUnderPostmaster);
- if (!found)
+
+ /*
+ * If we just attached to existing shared memory (EXEC_BACKEND), we're all
+ * done. Otherwise, during postmaster startup proceed to initialize the
+ * shared memory.
+ */
+ if (IsUnderPostmaster)
{
- /*
- * Set control information to reflect empty SLRU.
- */
- LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
- serialControl->headPage = -1;
- serialControl->headXid = InvalidTransactionId;
- serialControl->tailXid = InvalidTransactionId;
- LWLockRelease(SerialControlLock);
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+ return;
+ }
+
+ /*
+ * Reserve a dummy entry in the hash table; we use it to make sure there's
+ * always one entry available when we need to split or combine a page,
+ * because running out of space there could mean aborting a
+ * non-serializable transaction.
+ */
+ (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
+ HASH_ENTER, &found);
+ Assert(!found);
+
+ /* Initialize PredXact list */
+ dlist_init(&PredXact->availableList);
+ dlist_init(&PredXact->activeList);
+ PredXact->SxactGlobalXmin = InvalidTransactionId;
+ PredXact->SxactGlobalXminCount = 0;
+ PredXact->WritableSxactCount = 0;
+ PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
+ PredXact->CanPartialClearThrough = 0;
+ PredXact->HavePartialClearedThrough = 0;
+ PredXact->element
+ = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_serializable_xacts; i++)
+ {
+ LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
+ LWTRANCHE_PER_XACT_PREDICATE_LIST);
+ dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
}
+ PredXact->OldCommittedSxact = CreatePredXact();
+ SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
+ PredXact->OldCommittedSxact->prepareSeqNo = 0;
+ PredXact->OldCommittedSxact->commitSeqNo = 0;
+ PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
+ dlist_init(&PredXact->OldCommittedSxact->outConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->inConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
+ dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
+ dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
+ PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
+ PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
+ PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
+ PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
+ PredXact->OldCommittedSxact->pid = 0;
+ PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
+
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+
+ /* Initialize the rw-conflict pool */
+ dlist_init(&RWConflictPool->availableList);
+ RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
+ RWConflictPoolHeaderDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_rw_conflicts; i++)
+ {
+ dlist_push_tail(&RWConflictPool->availableList,
+ &RWConflictPool->element[i].outLink);
+ }
+
+ /* Initialize the list of finished serializable transactions */
+ dlist_init(FinishedSerializableTransactions);
+
+ /* Initialize SerialControl to reflect empty SLRU. */
+ LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
+ serialControl->headPage = -1;
+ serialControl->headXid = InvalidTransactionId;
+ serialControl->tailXid = InvalidTransactionId;
+ LWLockRelease(SerialControlLock);
}
/*
--
2.47.3
[text/x-patch] v11-0010-Convert-SLRUs-to-use-the-new-interface.patch (84.8K, 11-v11-0010-Convert-SLRUs-to-use-the-new-interface.patch)
From a5d75ada8c91aa777e4e29bb37deef1db1a9ac9d Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:32:45 +0300
Subject: [PATCH v11 10/14] Convert SLRUs to use the new interface
I replaced the old SimpleLruInit() function without a backwards
compatibility wrapper, because few extensions define their own SLRUs.
---
src/backend/access/transam/clog.c | 55 ++--
src/backend/access/transam/commit_ts.c | 85 +++---
src/backend/access/transam/multixact.c | 138 +++++----
src/backend/access/transam/slru.c | 366 ++++++++++++-----------
src/backend/access/transam/subtrans.c | 57 ++--
src/backend/commands/async.c | 115 ++++---
src/backend/storage/ipc/ipci.c | 16 -
src/backend/storage/ipc/shmem.c | 7 +
src/backend/storage/lmgr/predicate.c | 266 +++++++---------
src/backend/utils/activity/pgstat_slru.c | 1 +
src/include/access/clog.h | 2 -
src/include/access/commit_ts.h | 2 -
src/include/access/multixact.h | 2 -
src/include/access/slru.h | 112 ++++---
src/include/access/subtrans.h | 2 -
src/include/commands/async.h | 3 -
src/include/storage/predicate.h | 5 -
src/include/storage/shmem_internal.h | 1 +
src/include/storage/subsystemlist.h | 10 +
src/test/modules/test_slru/test_slru.c | 106 +++----
src/tools/pgindent/typedefs.list | 4 +-
21 files changed, 691 insertions(+), 664 deletions(-)
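The call sites in this patch use a named-argument style, e.g. SimpleLruRequest(.desc = ..., .name = ..., ) and ShmemRequestStruct(.name = ..., .size = ..., .ptr = ..., ). The macro definitions are not part of this excerpt. One common way to get that syntax in C99 is a variadic macro that wraps its arguments in a compound literal with designated initializers. The sketch below shows that technique under that assumption; DemoStructRequest, DemoRequestStruct and the other demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical option struct; unspecified fields default to zero/NULL. */
typedef struct DemoStructRequest
{
	const char *name;
	size_t		size;
	void	  **ptr;
} DemoStructRequest;

#define DEMO_MAX_PENDING 8
static DemoStructRequest demo_pending[DEMO_MAX_PENDING];
static int	demo_npending = 0;

/* The real work happens in an ordinary function taking a struct pointer. */
static void
demo_request_struct_impl(const DemoStructRequest *req)
{
	assert(demo_npending < DEMO_MAX_PENDING);
	demo_pending[demo_npending++] = *req;
}

/*
 * The macro turns "named arguments" into a compound literal. A trailing
 * comma in the argument list is harmless because initializer lists allow it.
 */
#define DemoRequestStruct(...) \
	demo_request_struct_impl(&(DemoStructRequest){__VA_ARGS__})

static int *demo_counter;

static void
demo_register(void)
{
	DemoRequestStruct(.name = "demo counter",
					  .size = sizeof(int),
					  .ptr = (void **) &demo_counter,
		);
}
```

After demo_register() runs, demo_pending[0] holds the name, size and pointer slot that were passed. The benefit over a long positional argument list, such as the one the old SimpleLruInit() call took, is that each value is labeled at the call site and optional fields can simply be omitted.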
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index c654e0929b3..7cd1a56201f 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -43,6 +43,7 @@
#include "pg_trace.h"
#include "pgstat.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/wait_event.h"
@@ -106,13 +107,21 @@ TransactionIdToPage(TransactionId xid)
/*
* Link to shared-memory data structures for CLOG control
*/
-static SlruCtlData XactCtlData;
+static void CLOGShmemRequest(void *arg);
+static void CLOGShmemInit(void *arg);
+static bool CLOGPagePrecedes(int64 page1, int64 page2);
+static int clog_errdetail_for_io_error(const void *opaque_data);
-#define XactCtl (&XactCtlData)
+const ShmemCallbacks CLOGShmemCallbacks = {
+ .request_fn = CLOGShmemRequest,
+ .init_fn = CLOGShmemInit,
+};
+
+static SlruDesc XactSlruDesc;
+
+#define XactCtl (&XactSlruDesc)
-static bool CLOGPagePrecedes(int64 page1, int64 page2);
-static int clog_errdetail_for_io_error(const void *opaque_data);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXact,
Oid oldestXactDb);
static void TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
@@ -775,16 +784,10 @@ CLOGShmemBuffers(void)
}
/*
- * Initialization of shared memory for CLOG
+ * Register shared memory for CLOG
*/
-Size
-CLOGShmemSize(void)
-{
- return SimpleLruShmemSize(CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE);
-}
-
-void
-CLOGShmemInit(void)
+static void
+CLOGShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (transaction_buffers == 0)
@@ -806,12 +809,26 @@ CLOGShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(transaction_buffers != 0);
+ SimpleLruRequest(.desc = &XactSlruDesc,
+ .name = "transaction",
+ .Dir = "pg_xact",
+ .long_segment_names = false,
+
+ .nslots = CLOGShmemBuffers(),
+ .nlsns = CLOG_LSNS_PER_PAGE,
+
+ .sync_handler = SYNC_HANDLER_CLOG,
+ .PagePrecedes = CLOGPagePrecedes,
+ .errdetail_for_io_error = clog_errdetail_for_io_error,
- XactCtl->PagePrecedes = CLOGPagePrecedes;
- XactCtl->errdetail_for_io_error = clog_errdetail_for_io_error;
- SimpleLruInit(XactCtl, "transaction", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
- "pg_xact", LWTRANCHE_XACT_BUFFER,
- LWTRANCHE_XACT_SLRU, SYNC_HANDLER_CLOG, false);
+ .buffer_tranche_id = LWTRANCHE_XACT_BUFFER,
+ .bank_tranche_id = LWTRANCHE_XACT_SLRU,
+ );
+}
+
+static void
+CLOGShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(XactCtl, CLOG_XACTS_PER_PAGE);
}
@@ -827,7 +844,7 @@ check_transaction_buffers(int *newval, void **extra, GucSource source)
/*
* This func must be called ONCE on system install. It creates
* the initial CLOG segment. (The CLOG directory is assumed to
- * have been created by initdb, and CLOGShmemInit must have been
+ * have been created by initdb, and CLOGShmemInit must have been XXX
* called already.)
*/
void
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 36219dd13cc..2625cbf93bf 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -30,6 +30,7 @@
#include "funcapi.h"
#include "miscadmin.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/timestamp.h"
@@ -80,9 +81,19 @@ TransactionIdToCTsPage(TransactionId xid)
/*
* Link to shared-memory data structures for CommitTs control
*/
-static SlruCtlData CommitTsCtlData;
+static void CommitTsShmemRequest(void *arg);
+static void CommitTsShmemInit(void *arg);
+static bool CommitTsPagePrecedes(int64 page1, int64 page2);
+static int commit_ts_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks CommitTsShmemCallbacks = {
+ .request_fn = CommitTsShmemRequest,
+ .init_fn = CommitTsShmemInit,
+};
+
+static SlruDesc CommitTsSlruDesc;
-#define CommitTsCtl (&CommitTsCtlData)
+#define CommitTsCtl (&CommitTsSlruDesc)
/*
* We keep a cache of the last value set in shared memory.
@@ -104,6 +115,7 @@ typedef struct CommitTimestampShared
static CommitTimestampShared *commitTsShared;
+static void CommitTsShmemInit(void *arg);
/* GUC variable */
bool track_commit_timestamp;
@@ -114,8 +126,6 @@ static void SetXidCommitTsInPage(TransactionId xid, int nsubxids,
static void TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
ReplOriginId nodeid, int slotno);
static void error_commit_ts_disabled(void);
-static bool CommitTsPagePrecedes(int64 page1, int64 page2);
-static int commit_ts_errdetail_for_io_error(const void *opaque_data);
static void ActivateCommitTs(void);
static void DeactivateCommitTs(void);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXid);
@@ -512,24 +522,12 @@ CommitTsShmemBuffers(void)
}
/*
- * Shared memory sizing for CommitTs
+ * Register CommitTs shared memory needs at system startup (postmaster start
+ * or standalone backend)
*/
-Size
-CommitTsShmemSize(void)
-{
- return SimpleLruShmemSize(CommitTsShmemBuffers(), 0) +
- sizeof(CommitTimestampShared);
-}
-
-/*
- * Initialize CommitTs at system startup (postmaster start or standalone
- * backend)
- */
-void
-CommitTsShmemInit(void)
+static void
+CommitTsShmemRequest(void *arg)
{
- bool found;
-
/* If auto-tuning is requested, now is the time to do it */
if (commit_timestamp_buffers == 0)
{
@@ -550,31 +548,36 @@ CommitTsShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(commit_timestamp_buffers != 0);
+ SimpleLruRequest(.desc = &CommitTsSlruDesc,
+ .name = "commit_timestamp",
+ .Dir = "pg_commit_ts",
+ .long_segment_names = false,
- CommitTsCtl->PagePrecedes = CommitTsPagePrecedes;
- CommitTsCtl->errdetail_for_io_error = commit_ts_errdetail_for_io_error;
- SimpleLruInit(CommitTsCtl, "commit_timestamp", CommitTsShmemBuffers(), 0,
- "pg_commit_ts", LWTRANCHE_COMMITTS_BUFFER,
- LWTRANCHE_COMMITTS_SLRU,
- SYNC_HANDLER_COMMIT_TS,
- false);
- SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
+ .nslots = CommitTsShmemBuffers(),
- commitTsShared = ShmemInitStruct("CommitTs shared",
- sizeof(CommitTimestampShared),
- &found);
+ .PagePrecedes = CommitTsPagePrecedes,
+ .errdetail_for_io_error = commit_ts_errdetail_for_io_error,
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+ .sync_handler = SYNC_HANDLER_COMMIT_TS,
+ .buffer_tranche_id = LWTRANCHE_COMMITTS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_COMMITTS_SLRU,
+ );
- commitTsShared->xidLastCommit = InvalidTransactionId;
- TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
- commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
- commitTsShared->commitTsActive = false;
- }
- else
- Assert(found);
+ ShmemRequestStruct(.name = "CommitTs shared",
+ .size = sizeof(CommitTimestampShared),
+ .ptr = (void **) &commitTsShared,
+ );
+}
+
+static void
+CommitTsShmemInit(void *arg)
+{
+ commitTsShared->xidLastCommit = InvalidTransactionId;
+ TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
+ commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
+ commitTsShared->commitTsActive = false;
+
+ SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
}
/*
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 9f8d542c098..62d58da4abc 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -83,6 +83,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
#include "utils/lsyscache.h"
@@ -113,11 +114,16 @@ PreviousMultiXactId(MultiXactId multi)
/*
* Links to shared-memory data structures for MultiXact control
*/
-static SlruCtlData MultiXactOffsetCtlData;
-static SlruCtlData MultiXactMemberCtlData;
+static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
+static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
+static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
+static int MultiXactMemberIoErrorDetail(const void *opaque_data);
+
+static SlruDesc MultiXactOffsetSlruDesc;
+static SlruDesc MultiXactMemberSlruDesc;
-#define MultiXactOffsetCtl (&MultiXactOffsetCtlData)
-#define MultiXactMemberCtl (&MultiXactMemberCtlData)
+#define MultiXactOffsetCtl (&MultiXactOffsetSlruDesc)
+#define MultiXactMemberCtl (&MultiXactMemberSlruDesc)
/*
* MultiXact state shared across all backends. All this state is protected
@@ -220,6 +226,15 @@ static MultiXactStateData *MultiXactState;
static MultiXactId *OldestMemberMXactId;
static MultiXactId *OldestVisibleMXactId;
+static void MultiXactShmemRequest(void *arg);
+static void MultiXactShmemInit(void *arg);
+static void MultiXactShmemAttach(void *arg);
+
+const ShmemCallbacks MultiXactShmemCallbacks = {
+ .request_fn = MultiXactShmemRequest,
+ .init_fn = MultiXactShmemInit,
+ .attach_fn = MultiXactShmemAttach,
+};
static inline MultiXactId *
MyOldestMemberMXactIdSlot(void)
@@ -321,10 +336,6 @@ typedef struct MultiXactMemberSlruReadContext
MultiXactOffset offset;
} MultiXactMemberSlruReadContext;
-static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
-static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
-static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
-static int MultiXactMemberIoErrorDetail(const void *opaque_data);
static void ExtendMultiXactOffset(MultiXactId multi);
static void ExtendMultiXactMember(MultiXactOffset offset, int nmembers);
static void SetOldestOffset(void);
@@ -1747,80 +1758,81 @@ multixact_twophase_postabort(FullTransactionId fxid, uint16 info,
multixact_twophase_postcommit(fxid, info, recdata, len);
}
+
/*
- * Initialization of shared memory for MultiXact.
- *
- * MultiXactSharedStateShmemSize() calculates the size of the MultiXactState
- * struct, and the two per-backend MultiXactId arrays. They are carved out of
- * the same allocation. MultiXactShmemSize() additionally includes the memory
- * needed for the two SLRU areas.
+ * Register shared memory needs for MultiXact.
*/
-static Size
-MultiXactSharedStateShmemSize(void)
+static void
+MultiXactShmemRequest(void *arg)
{
Size size;
+ /*
+ * Calculate the size of the MultiXactState struct, and the two
+ * per-backend MultiXactId arrays. They are carved out of the same
+ * allocation.
+ */
size = offsetof(MultiXactStateData, perBackendXactIds);
size = add_size(size,
mul_size(sizeof(MultiXactId), NumMemberSlots));
size = add_size(size,
mul_size(sizeof(MultiXactId), NumVisibleSlots));
- return size;
-}
+ ShmemRequestStruct(.name = "Shared MultiXact State",
+ .size = size,
+ .ptr = (void **) &MultiXactState,
+ );
-Size
-MultiXactShmemSize(void)
-{
- Size size;
+ SimpleLruRequest(.desc = &MultiXactOffsetSlruDesc,
+ .name = "multixact_offset",
+ .Dir = "pg_multixact/offsets",
+ .long_segment_names = false,
- size = MultiXactSharedStateShmemSize();
- size = add_size(size, SimpleLruShmemSize(multixact_offset_buffers, 0));
- size = add_size(size, SimpleLruShmemSize(multixact_member_buffers, 0));
+ .nslots = multixact_offset_buffers,
- return size;
-}
+ .sync_handler = SYNC_HANDLER_MULTIXACT_OFFSET,
+ .PagePrecedes = MultiXactOffsetPagePrecedes,
+ .errdetail_for_io_error = MultiXactOffsetIoErrorDetail,
-void
-MultiXactShmemInit(void)
-{
- bool found;
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTOFFSET_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTOFFSET_SLRU,
+ );
- debug_elog2(DEBUG2, "Shared Memory Init for MultiXact");
+ SimpleLruRequest(.desc = &MultiXactMemberSlruDesc,
+ .name = "multixact_member",
+ .Dir = "pg_multixact/members",
+ .long_segment_names = true,
- MultiXactOffsetCtl->PagePrecedes = MultiXactOffsetPagePrecedes;
- MultiXactMemberCtl->PagePrecedes = MultiXactMemberPagePrecedes;
- MultiXactOffsetCtl->errdetail_for_io_error = MultiXactOffsetIoErrorDetail;
- MultiXactMemberCtl->errdetail_for_io_error = MultiXactMemberIoErrorDetail;
+ .nslots = multixact_member_buffers,
- SimpleLruInit(MultiXactOffsetCtl,
- "multixact_offset", multixact_offset_buffers, 0,
- "pg_multixact/offsets", LWTRANCHE_MULTIXACTOFFSET_BUFFER,
- LWTRANCHE_MULTIXACTOFFSET_SLRU,
- SYNC_HANDLER_MULTIXACT_OFFSET,
- false);
- SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
- SimpleLruInit(MultiXactMemberCtl,
- "multixact_member", multixact_member_buffers, 0,
- "pg_multixact/members", LWTRANCHE_MULTIXACTMEMBER_BUFFER,
- LWTRANCHE_MULTIXACTMEMBER_SLRU,
- SYNC_HANDLER_MULTIXACT_MEMBER,
- true);
- /* doesn't call SimpleLruTruncate() or meet criteria for unit tests */
-
- /* Initialize our shared state struct */
- MultiXactState = ShmemInitStruct("Shared MultiXact State",
- MultiXactSharedStateShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+ .sync_handler = SYNC_HANDLER_MULTIXACT_MEMBER,
+ .PagePrecedes = MultiXactMemberPagePrecedes,
+ .errdetail_for_io_error = MultiXactMemberIoErrorDetail,
- /* Make sure we zero out the per-backend state */
- MemSet(MultiXactState, 0, MultiXactSharedStateShmemSize());
- }
- else
- Assert(found);
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTMEMBER_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTMEMBER_SLRU,
+ );
+ /*
+ * members SLRU doesn't call SimpleLruTruncate() or meet criteria for unit
+ * tests
+ */
+}
+
+static void
+MultiXactShmemInit(void *arg)
+{
+ /*
+ * Set up array pointers.
+ */
+ OldestMemberMXactId = MultiXactState->perBackendXactIds;
+ OldestVisibleMXactId = OldestMemberMXactId + NumMemberSlots;
+
+ SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
+}
+
+static void
+MultiXactShmemAttach(void *arg)
+{
/*
* Set up array pointers.
*/
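Note on the multixact.c hunk above: the comment in MultiXactShmemRequest() says the state struct and the two per-backend arrays are carved out of the same allocation. A minimal standalone sketch of that layout follows. The Demo* names are hypothetical stand-ins, not PostgreSQL identifiers, and plain arithmetic is used where the patch uses the overflow-checked add_size()/mul_size().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t DemoXactId;	/* hypothetical stand-in for MultiXactId */

typedef struct DemoState
{
	int			header;			/* stand-in for the fixed-size fields */
	DemoXactId	perBackendXactIds[];	/* flexible array member */
} DemoState;

/* Size of the fixed part plus both arrays (no overflow checks here). */
static size_t
demo_state_size(size_t nmember, size_t nvisible)
{
	size_t		size = offsetof(DemoState, perBackendXactIds);

	size += sizeof(DemoXactId) * nmember;
	size += sizeof(DemoXactId) * nvisible;
	return size;
}

/*
 * Allocate one zeroed block, then set up the two array pointers: the
 * second array starts immediately after the last element of the first.
 */
static DemoState *
demo_state_create(size_t nmember, size_t nvisible,
				  DemoXactId **member, DemoXactId **visible)
{
	DemoState  *state;

	assert(member != NULL && visible != NULL);

	state = calloc(1, demo_state_size(nmember, nvisible));
	if (state == NULL)
		return NULL;

	*member = state->perBackendXactIds;
	*visible = *member + nmember;
	return state;
}
```

Both array pointers are process-local and derived from the base address, which would explain why the init and attach callbacks each contain a "Set up array pointers" step.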
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index a2bb8fa8033..47dd52d6749 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -70,7 +70,9 @@
#include "pgstat.h"
#include "storage/fd.h"
#include "storage/shmem.h"
+#include "storage/shmem_internal.h"
#include "utils/guc.h"
+#include "utils/memutils.h"
#include "utils/wait_event.h"
/*
@@ -89,9 +91,9 @@
* dir/123456 for [2^20, 2^24-1]
*/
static inline int
-SlruFileName(SlruCtl ctl, char *path, int64 segno)
+SlruFileName(SlruDesc *ctl, char *path, int64 segno)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
{
/*
* We could use 16 characters here but the disadvantage would be that
@@ -101,7 +103,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* that in the future we can't decrease SLRU_PAGES_PER_SEGMENT easily.
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFFFFFFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->Dir, segno);
+ return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->options.Dir, segno);
}
else
{
@@ -110,7 +112,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* integers are allowed. See SlruCorrectSegmentFilenameLength()
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
+ return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->options.Dir,
(unsigned int) segno);
}
}
@@ -176,19 +178,19 @@ static SlruErrorCause slru_errcause;
static int slru_errno;
-static void SimpleLruZeroLSNs(SlruCtl ctl, int slotno);
-static void SimpleLruWaitIO(SlruCtl ctl, int slotno);
-static void SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata);
-static bool SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno);
-static bool SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno,
+static void SimpleLruZeroLSNs(SlruDesc *ctl, int slotno);
+static void SimpleLruWaitIO(SlruDesc *ctl, int slotno);
+static void SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata);
+static bool SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno);
+static bool SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno,
SlruWriteAll fdata);
-static void SlruReportIOError(SlruCtl ctl, int64 pageno,
+static void SlruReportIOError(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-static int SlruSelectLRUPage(SlruCtl ctl, int64 pageno);
+static int SlruSelectLRUPage(SlruDesc *ctl, int64 pageno);
-static bool SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename,
+static bool SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-static void SlruInternalDeleteSegment(SlruCtl ctl, int64 segno);
+static void SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno);
static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
@@ -196,7 +198,7 @@ static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
* Initialization of shared memory
*/
-Size
+static Size
SimpleLruShmemSize(int nslots, int nlsns)
{
int nbanks = nslots / SLRU_BANK_SIZE;
@@ -238,120 +240,135 @@ SimpleLruAutotuneBuffers(int divisor, int max)
}
/*
- * Initialize, or attach to, a simple LRU cache in shared memory.
- *
- * ctl: address of local (unshared) control structure.
- * name: name of SLRU. (This is user-visible, pick with care!)
- * nslots: number of page slots to use.
- * nlsns: number of LSN groups per page (set to zero if not relevant).
- * subdir: PGDATA-relative subdirectory that will contain the files.
- * buffer_tranche_id: tranche ID to use for the SLRU's per-buffer LWLocks.
- * bank_tranche_id: tranche ID to use for the bank LWLocks.
- * sync_handler: which set of functions to use to handle sync requests
- * long_segment_names: use short or long segment names
+ * Register a simple LRU cache in shared memory.
*/
void
-SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id, int bank_tranche_id,
- SyncRequestHandler sync_handler, bool long_segment_names)
+SimpleLruRequestWithOpts(const SlruOpts *options)
{
+ SlruOpts *options_copy;
+
+ Assert(options->name != NULL);
+ Assert(options->nslots > 0);
+ Assert(options->PagePrecedes != NULL);
+ Assert(options->errdetail_for_io_error != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(SlruOpts));
+ memcpy(options_copy, options, sizeof(SlruOpts));
+
+ options_copy->base.name = options->name;
+ options_copy->base.size = SimpleLruShmemSize(options_copy->nslots, options_copy->nlsns);
+
+ ShmemRequestInternal(&options_copy->base, SHMEM_KIND_SLRU);
+}
+
+/* Initialize locks and shared memory area */
+void
+shmem_slru_init(void *location, ShmemStructOpts *base_options)
+{
+ SlruOpts *options = (SlruOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) options->desc;
+ char namebuf[NAMEDATALEN];
SlruShared shared;
- bool found;
+ int nslots = options->nslots;
int nbanks = nslots / SLRU_BANK_SIZE;
+ int nlsns = options->nlsns;
+ char *ptr;
+ Size offset;
+
+ shared = (SlruShared) location;
+ desc->shared = shared;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruOpts));
+
+ /* assign new tranche IDs, if not given */
+ if (desc->options.buffer_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s buffer", desc->options.name);
+ desc->options.buffer_tranche_id = LWLockNewTrancheId(namebuf);
+ }
+ if (desc->options.bank_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s bank", desc->options.name);
+ desc->options.bank_tranche_id = LWLockNewTrancheId(namebuf);
+ }
Assert(nslots <= SLRU_MAX_ALLOWED_BUFFERS);
- Assert(ctl->PagePrecedes != NULL);
- Assert(ctl->errdetail_for_io_error != NULL);
+ memset(shared, 0, sizeof(SlruSharedData));
- shared = (SlruShared) ShmemInitStruct(name,
- SimpleLruShmemSize(nslots, nlsns),
- &found);
+ shared->num_slots = nslots;
+ shared->lsn_groups_per_page = nlsns;
- if (!IsUnderPostmaster)
- {
- /* Initialize locks and shared memory area */
- char *ptr;
- Size offset;
-
- Assert(!found);
-
- memset(shared, 0, sizeof(SlruSharedData));
-
- shared->num_slots = nslots;
- shared->lsn_groups_per_page = nlsns;
-
- pg_atomic_init_u64(&shared->latest_page_number, 0);
-
- shared->slru_stats_idx = pgstat_get_slru_index(name);
-
- ptr = (char *) shared;
- offset = MAXALIGN(sizeof(SlruSharedData));
- shared->page_buffer = (char **) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(char *));
- shared->page_status = (SlruPageStatus *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
- shared->page_dirty = (bool *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(bool));
- shared->page_number = (int64 *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int64));
- shared->page_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int));
-
- /* Initialize LWLocks */
- shared->buffer_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(LWLockPadded));
- shared->bank_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
- shared->bank_cur_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(int));
-
- if (nlsns > 0)
- {
- shared->group_lsn = (XLogRecPtr *) (ptr + offset);
- offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
- }
+ pg_atomic_init_u64(&shared->latest_page_number, 0);
- ptr += BUFFERALIGN(offset);
- for (int slotno = 0; slotno < nslots; slotno++)
- {
- LWLockInitialize(&shared->buffer_locks[slotno].lock,
- buffer_tranche_id);
+ shared->slru_stats_idx = pgstat_get_slru_index(desc->options.name);
- shared->page_buffer[slotno] = ptr;
- shared->page_status[slotno] = SLRU_PAGE_EMPTY;
- shared->page_dirty[slotno] = false;
- shared->page_lru_count[slotno] = 0;
- ptr += BLCKSZ;
- }
+ ptr = (char *) shared;
+ offset = MAXALIGN(sizeof(SlruSharedData));
+ shared->page_buffer = (char **) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(char *));
+ shared->page_status = (SlruPageStatus *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
+ shared->page_dirty = (bool *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(bool));
+ shared->page_number = (int64 *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int64));
+ shared->page_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int));
- /* Initialize the slot banks. */
- for (int bankno = 0; bankno < nbanks; bankno++)
- {
- LWLockInitialize(&shared->bank_locks[bankno].lock, bank_tranche_id);
- shared->bank_cur_lru_count[bankno] = 0;
- }
+ /* Initialize LWLocks */
+ shared->buffer_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(LWLockPadded));
+ shared->bank_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
+ shared->bank_cur_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(int));
- /* Should fit to estimated shmem size */
- Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+ if (nlsns > 0)
+ {
+ shared->group_lsn = (XLogRecPtr *) (ptr + offset);
+ offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
}
- else
+
+ ptr += BUFFERALIGN(offset);
+ for (int slotno = 0; slotno < nslots; slotno++)
{
- Assert(found);
- Assert(shared->num_slots == nslots);
+ LWLockInitialize(&shared->buffer_locks[slotno].lock,
+ desc->options.buffer_tranche_id);
+
+ shared->page_buffer[slotno] = ptr;
+ shared->page_status[slotno] = SLRU_PAGE_EMPTY;
+ shared->page_dirty[slotno] = false;
+ shared->page_lru_count[slotno] = 0;
+ ptr += BLCKSZ;
}
- /*
- * Initialize the unshared control struct, including directory path. We
- * assume caller set PagePrecedes.
- */
- ctl->shared = shared;
- ctl->sync_handler = sync_handler;
- ctl->long_segment_names = long_segment_names;
- ctl->nbanks = nbanks;
- strlcpy(ctl->Dir, subdir, sizeof(ctl->Dir));
+ /* Initialize the slot banks. */
+ for (int bankno = 0; bankno < nbanks; bankno++)
+ {
+ LWLockInitialize(&shared->bank_locks[bankno].lock, desc->options.bank_tranche_id);
+ shared->bank_cur_lru_count[bankno] = 0;
+ }
+
+ /* Should fit to estimated shmem size */
+ Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+}
+
+void
+shmem_slru_attach(void *location, ShmemStructOpts *base_options)
+{
+ SlruOpts *options = (SlruOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) options->desc;
+ int nslots = options->nslots;
+ int nbanks = nslots / SLRU_BANK_SIZE;
+
+ desc->shared = (SlruShared) location;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruOpts));
}
+
/*
* Helper function for GUC check_hook to check whether slru buffers are in
* multiples of SLRU_BANK_SIZE.
@@ -377,7 +394,7 @@ check_slru_buffers(const char *name, int *newval)
* Bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
int slotno;
@@ -430,7 +447,7 @@ SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
* This assumes that InvalidXLogRecPtr is bitwise-all-0.
*/
static void
-SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
+SimpleLruZeroLSNs(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
@@ -446,7 +463,7 @@ SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
* SLRU bank lock is acquired and released here.
*/
void
-SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno)
{
int slotno;
LWLock *lock;
@@ -472,7 +489,7 @@ SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SimpleLruWaitIO(SlruCtl ctl, int slotno)
+SimpleLruWaitIO(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
int bankno = SlotGetBankNumber(slotno);
@@ -530,7 +547,7 @@ SimpleLruWaitIO(SlruCtl ctl, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data)
{
SlruShared shared = ctl->shared;
@@ -634,7 +651,7 @@ SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
* It is unspecified whether the lock will be shared or exclusive.
*/
int
-SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
SlruShared shared = ctl->shared;
LWLock *banklock = SimpleLruGetBankLock(ctl, pageno);
@@ -681,7 +698,7 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
+SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 pageno = shared->page_number[slotno];
@@ -761,7 +778,7 @@ SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
* fdata is always passed a NULL here.
*/
void
-SimpleLruWritePage(SlruCtl ctl, int slotno)
+SimpleLruWritePage(SlruDesc *ctl, int slotno)
{
Assert(ctl->shared->page_status[slotno] != SLRU_PAGE_EMPTY);
@@ -775,7 +792,7 @@ SimpleLruWritePage(SlruCtl ctl, int slotno)
* large enough to contain the given page.
*/
bool
-SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
+SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -833,7 +850,7 @@ SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
* read/write operations. We could cache one virtual file pointer ...
*/
static bool
-SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
+SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -905,7 +922,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
* SimpleLruWriteAll.
*/
static bool
-SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
+SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -1037,11 +1054,11 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
pgstat_report_wait_end();
/* Queue up a sync request for the checkpointer. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
if (!RegisterSyncRequest(&tag, SYNC_REQUEST, false))
{
/* No space to enqueue sync request. Do it synchronously. */
@@ -1077,7 +1094,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
* SlruPhysicalWritePage. Call this after cleaning up shared-memory state.
*/
static void
-SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SlruReportIOError(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -1092,14 +1109,14 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open file \"%s\": %m", path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_SEEK_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not seek in file \"%s\" to offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_READ_FAILED:
if (errno)
@@ -1107,12 +1124,12 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("could not read from file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("could not read from file \"%s\" at offset %d: read too few bytes",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_WRITE_FAILED:
if (errno)
@@ -1120,26 +1137,26 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("Could not write to file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("Could not write to file \"%s\" at offset %d: wrote too few bytes.",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_FSYNC_FAILED:
ereport(data_sync_elevel(ERROR),
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_CLOSE_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not close file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
default:
/* can't get here, we trust */
@@ -1199,7 +1216,7 @@ SlruRecentlyUsed(SlruShared shared, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
static int
-SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
+SlruSelectLRUPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
@@ -1291,8 +1308,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_valid_delta ||
(this_delta == best_valid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_valid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_valid_page_number)))
{
bestvalidslot = slotno;
best_valid_delta = this_delta;
@@ -1303,8 +1320,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_invalid_delta ||
(this_delta == best_invalid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_invalid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_invalid_page_number)))
{
bestinvalidslot = slotno;
best_invalid_delta = this_delta;
@@ -1352,7 +1369,7 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
* entries are on disk.
*/
void
-SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
+SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied)
{
SlruShared shared = ctl->shared;
SlruWriteAllData fdata;
@@ -1422,8 +1439,8 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
SlruReportIOError(ctl, pageno, NULL);
/* Ensure that directory entries for new files are on disk. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
- fsync_fname(ctl->Dir, true);
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
+ fsync_fname(ctl->options.Dir, true);
}
/*
@@ -1438,7 +1455,7 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
* after it has accrued freshly-written data.
*/
void
-SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage)
+SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage)
{
SlruShared shared = ctl->shared;
int prevbank;
@@ -1460,12 +1477,12 @@ restart:
* bugs elsewhere in SLRU handling, so we don't care if we read a slightly
* outdated value; therefore we don't add a memory barrier.
*/
- if (ctl->PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
- cutoffPage))
+ if (ctl->options.PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
+ cutoffPage))
{
ereport(LOG,
(errmsg("could not truncate directory \"%s\": apparent wraparound",
- ctl->Dir)));
+ ctl->options.Dir)));
return;
}
@@ -1488,7 +1505,7 @@ restart:
if (shared->page_status[slotno] == SLRU_PAGE_EMPTY)
continue;
- if (!ctl->PagePrecedes(shared->page_number[slotno], cutoffPage))
+ if (!ctl->options.PagePrecedes(shared->page_number[slotno], cutoffPage))
continue;
/*
@@ -1533,16 +1550,16 @@ restart:
* they either can't yet contain anything, or have already been cleaned out.
*/
static void
-SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
+SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno)
{
char path[MAXPGPATH];
/* Forget any fsync requests queued for this segment. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true);
}
@@ -1556,7 +1573,7 @@ SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
* Delete an individual SLRU segment, identified by the segment number.
*/
void
-SlruDeleteSegment(SlruCtl ctl, int64 segno)
+SlruDeleteSegment(SlruDesc *ctl, int64 segno)
{
SlruShared shared = ctl->shared;
int prevbank = SlotGetBankNumber(0);
@@ -1633,19 +1650,19 @@ restart:
* first>=cutoff && last>=cutoff: no; every page of this segment is too young
*/
static bool
-SlruMayDeleteSegment(SlruCtl ctl, int64 segpage, int64 cutoffPage)
+SlruMayDeleteSegment(SlruDesc *ctl, int64 segpage, int64 cutoffPage)
{
int64 seg_last_page = segpage + SLRU_PAGES_PER_SEGMENT - 1;
Assert(segpage % SLRU_PAGES_PER_SEGMENT == 0);
- return (ctl->PagePrecedes(segpage, cutoffPage) &&
- ctl->PagePrecedes(seg_last_page, cutoffPage));
+ return (ctl->options.PagePrecedes(segpage, cutoffPage) &&
+ ctl->options.PagePrecedes(seg_last_page, cutoffPage));
}
#ifdef USE_ASSERT_CHECKING
static void
-SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
+SlruPagePrecedesTestOffset(SlruDesc *ctl, int per_page, uint32 offset)
{
TransactionId lhs,
rhs;
@@ -1654,6 +1671,9 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
TransactionId newestXact,
oldestXact;
+ /* This must be called after the SLRU has been initialized */
+ Assert(ctl->options.PagePrecedes);
+
/*
* Compare an XID pair having undefined order (see RFC 1982), a pair at
* "opposite ends" of the XID space. TransactionIdPrecedes() treats each
@@ -1670,19 +1690,19 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
Assert(!TransactionIdPrecedes(rhs, lhs + 1));
Assert(!TransactionIdFollowsOrEquals(lhs, rhs));
Assert(!TransactionIdFollowsOrEquals(rhs, lhs));
- Assert(!ctl->PagePrecedes(lhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes(lhs / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
|| (1U << 31) % per_page != 0); /* See CommitTsPagePrecedes() */
- Assert(ctl->PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
+ Assert(ctl->options.PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
|| (1U << 31) % per_page != 0);
- Assert(ctl->PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
/*
* GetNewTransactionId() has assigned the last XID it can safely use, and
@@ -1727,7 +1747,7 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
* do not apply to them.)
*/
void
-SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
+SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page)
{
/* Test first, middle and last entries of a page. */
SlruPagePrecedesTestOffset(ctl, per_page, 0);
@@ -1742,7 +1762,7 @@ SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
* one containing the page passed as "data".
*/
bool
-SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1758,7 +1778,7 @@ SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes segments prior to the one passed in as "data".
*/
static bool
-SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1774,7 +1794,7 @@ SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes all segments.
*/
bool
-SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
+SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
SlruInternalDeleteSegment(ctl, segpage / SLRU_PAGES_PER_SEGMENT);
@@ -1788,9 +1808,9 @@ SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
* SLRU segment.
*/
static inline bool
-SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
+SlruCorrectSegmentFilenameLength(SlruDesc *ctl, size_t len)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
return (len == 15); /* see SlruFileName() */
else
@@ -1821,7 +1841,7 @@ SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
* Note that no locking is applied.
*/
bool
-SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
+SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data)
{
bool retval = false;
DIR *cldir;
@@ -1829,8 +1849,8 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
int64 segno;
int64 segpage;
- cldir = AllocateDir(ctl->Dir);
- while ((clde = ReadDir(cldir, ctl->Dir)) != NULL)
+ cldir = AllocateDir(ctl->options.Dir);
+ while ((clde = ReadDir(cldir, ctl->options.Dir)) != NULL)
{
size_t len;
@@ -1843,7 +1863,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
segpage = segno * SLRU_PAGES_PER_SEGMENT;
elog(DEBUG2, "SlruScanDirectory invoking callback on %s/%s",
- ctl->Dir, clde->d_name);
+ ctl->options.Dir, clde->d_name);
retval = callback(ctl, clde->d_name, segpage, data);
if (retval)
break;
@@ -1861,7 +1881,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
* performs the fsync.
*/
int
-SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path)
+SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path)
{
int fd;
int save_errno;
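Note on the slru.c hunks above: SlruFileName() chooses between two segment file name formats based on options.long_segment_names. A standalone sketch of that choice follows, reusing the two format strings from the diff. The function name demo_slru_file_name and its parameters are hypothetical, and the range assertions on segno from the original are left out.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build "dir/SEGNO" into path. Long names are 15 zero-padded hex digits;
 * short names are zero-padded to at least 4 hex digits.
 */
static int
demo_slru_file_name(char *path, size_t pathlen, const char *dir,
					bool long_segment_names, int64_t segno)
{
	if (long_segment_names)
		return snprintf(path, pathlen, "%s/%015" PRIX64,
						dir, (uint64_t) segno);

	return snprintf(path, pathlen, "%s/%04X", dir, (unsigned int) segno);
}
```

With dir "pg_multixact/offsets", short names and segno 0x1A, this yields "pg_multixact/offsets/001A". With long names the last path component is always 15 characters, which is what SlruCorrectSegmentFilenameLength() checks for.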
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index c6ce71fc703..b79e648b899 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -33,6 +33,7 @@
#include "access/transam.h"
#include "miscadmin.h"
#include "pg_trace.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/snapmgr.h"
@@ -66,16 +67,22 @@ TransactionIdToPage(TransactionId xid)
#define TransactionIdToEntry(xid) ((xid) % (TransactionId) SUBTRANS_XACTS_PER_PAGE)
+static void SUBTRANSShmemRequest(void *arg);
+static void SUBTRANSShmemInit(void *arg);
+static bool SubTransPagePrecedes(int64 page1, int64 page2);
+static int subtrans_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks SUBTRANSShmemCallbacks = {
+ .request_fn = SUBTRANSShmemRequest,
+ .init_fn = SUBTRANSShmemInit,
+};
+
/*
* Link to shared-memory data structures for SUBTRANS control
*/
-static SlruCtlData SubTransCtlData;
-
-#define SubTransCtl (&SubTransCtlData)
+static SlruDesc SubTransSlruDesc;
-
-static bool SubTransPagePrecedes(int64 page1, int64 page2);
-static int subtrans_errdetail_for_io_error(const void *opaque_data);
+#define SubTransCtl (&SubTransSlruDesc)
/*
@@ -207,17 +214,13 @@ SUBTRANSShmemBuffers(void)
return Min(Max(16, subtransaction_buffers), SLRU_MAX_ALLOWED_BUFFERS);
}
+
+
/*
- * Initialization of shared memory for SUBTRANS
+ * Register shared memory for SUBTRANS
*/
-Size
-SUBTRANSShmemSize(void)
-{
- return SimpleLruShmemSize(SUBTRANSShmemBuffers(), 0);
-}
-
-void
-SUBTRANSShmemInit(void)
+static void
+SUBTRANSShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (subtransaction_buffers == 0)
@@ -240,11 +243,25 @@ SUBTRANSShmemInit(void)
}
Assert(subtransaction_buffers != 0);
- SubTransCtl->PagePrecedes = SubTransPagePrecedes;
- SubTransCtl->errdetail_for_io_error = subtrans_errdetail_for_io_error;
- SimpleLruInit(SubTransCtl, "subtransaction", SUBTRANSShmemBuffers(), 0,
- "pg_subtrans", LWTRANCHE_SUBTRANS_BUFFER,
- LWTRANCHE_SUBTRANS_SLRU, SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(.desc = &SubTransSlruDesc,
+ .name = "subtransaction",
+ .Dir = "pg_subtrans",
+ .long_segment_names = false,
+
+ .nslots = SUBTRANSShmemBuffers(),
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SubTransPagePrecedes,
+ .errdetail_for_io_error = subtrans_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SUBTRANS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SUBTRANS_SLRU,
+ );
+}
+
+static void
+SUBTRANSShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(SubTransCtl, SUBTRANS_XACTS_PER_PAGE);
}
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index e91a62ff42a..db6a9a6561b 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -179,6 +179,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/dsa.h"
@@ -345,6 +346,15 @@ typedef struct AsyncQueueControl
static AsyncQueueControl *asyncQueueControl;
+static void AsyncShmemRequest(void *arg);
+static void AsyncShmemInit(void *arg);
+
+const ShmemCallbacks AsyncShmemCallbacks = {
+ .request_fn = AsyncShmemRequest,
+ .init_fn = AsyncShmemInit,
+};
+
+
#define QUEUE_HEAD (asyncQueueControl->head)
#define QUEUE_TAIL (asyncQueueControl->tail)
#define QUEUE_STOP_PAGE (asyncQueueControl->stopPage)
@@ -359,9 +369,13 @@ static AsyncQueueControl *asyncQueueControl;
/*
* The SLRU buffer area through which we access the notification queue
*/
-static SlruCtlData NotifyCtlData;
+static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
+static int asyncQueueErrdetailForIoError(const void *opaque_data);
+
+static SlruDesc NotifySlruDesc;
-#define NotifyCtl (&NotifyCtlData)
+
+#define NotifyCtl (&NotifySlruDesc)
#define QUEUE_PAGESIZE BLCKSZ
#define QUEUE_FULL_WARN_INTERVAL 5000 /* warn at most once every 5s */
@@ -570,9 +584,7 @@ bool Trace_notify = false;
int max_notify_queue_pages = 1048576;
/* local function prototypes */
-static int asyncQueueErrdetailForIoError(const void *opaque_data);
static inline int64 asyncQueuePageDiff(int64 p, int64 q);
-static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
static inline void GlobalChannelKeyInit(GlobalChannelKey *key, Oid dboid,
const char *channel);
static dshash_hash globalChannelTableHash(const void *key, size_t size,
@@ -780,78 +792,63 @@ initPendingListenActions(void)
}
/*
- * Report space needed for our shared memory area
+ * Register our shared memory needs
*/
-Size
-AsyncShmemSize(void)
+static void
+AsyncShmemRequest(void *arg)
{
Size size;
- /* This had better match AsyncShmemInit */
size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
size = add_size(size, offsetof(AsyncQueueControl, backend));
- size = add_size(size, SimpleLruShmemSize(notify_buffers, 0));
+ ShmemRequestStruct(.name = "Async Queue Control",
+ .size = size,
+ .ptr = (void **) &asyncQueueControl,
+ );
- return size;
-}
+ SimpleLruRequest(.desc = &NotifySlruDesc,
+ .name = "notify",
+ .Dir = "pg_notify",
-/*
- * Initialize our shared memory area
- */
-void
-AsyncShmemInit(void)
-{
- bool found;
- Size size;
+ /* long segment names are used in order to avoid wraparound */
+ .long_segment_names = true,
- /*
- * Create or attach to the AsyncQueueControl structure.
- */
- size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
- size = add_size(size, offsetof(AsyncQueueControl, backend));
+ .nslots = notify_buffers,
- asyncQueueControl = (AsyncQueueControl *)
- ShmemInitStruct("Async Queue Control", size, &found);
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = asyncQueuePagePrecedes,
+ .errdetail_for_io_error = asyncQueueErrdetailForIoError,
- if (!found)
+ .buffer_tranche_id = LWTRANCHE_NOTIFY_BUFFER,
+ .bank_tranche_id = LWTRANCHE_NOTIFY_SLRU,
+ );
+}
+
+static void
+AsyncShmemInit(void *arg)
+{
+ SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
+ SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
+ QUEUE_STOP_PAGE = 0;
+ QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
+ asyncQueueControl->lastQueueFillWarn = 0;
+ asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
+ asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
+ for (int i = 0; i < MaxBackends; i++)
{
- /* First time through, so initialize it */
- SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
- SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
- QUEUE_STOP_PAGE = 0;
- QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
- asyncQueueControl->lastQueueFillWarn = 0;
- asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
- asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
- for (int i = 0; i < MaxBackends; i++)
- {
- QUEUE_BACKEND_PID(i) = InvalidPid;
- QUEUE_BACKEND_DBOID(i) = InvalidOid;
- QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
- SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
- QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
- QUEUE_BACKEND_IS_ADVANCING(i) = false;
- }
+ QUEUE_BACKEND_PID(i) = InvalidPid;
+ QUEUE_BACKEND_DBOID(i) = InvalidOid;
+ QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
+ SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
+ QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
+ QUEUE_BACKEND_IS_ADVANCING(i) = false;
}
/*
- * Set up SLRU management of the pg_notify data. Note that long segment
- * names are used in order to avoid wraparound.
+ * During start or reboot, clean out the pg_notify directory.
*/
- NotifyCtl->PagePrecedes = asyncQueuePagePrecedes;
- NotifyCtl->errdetail_for_io_error = asyncQueueErrdetailForIoError;
- SimpleLruInit(NotifyCtl, "notify", notify_buffers, 0,
- "pg_notify", LWTRANCHE_NOTIFY_BUFFER, LWTRANCHE_NOTIFY_SLRU,
- SYNC_HANDLER_NONE, true);
-
- if (!found)
- {
- /*
- * During start or reboot, clean out the pg_notify directory.
- */
- (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
- }
+ (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 4f707158303..7a8c69de802 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -101,16 +101,11 @@ CalculateShmemSize(void)
/* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
- size = add_size(size, PredicateLockShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, CLOGShmemSize());
- size = add_size(size, CommitTsShmemSize());
- size = add_size(size, SUBTRANSShmemSize());
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, MultiXactShmemSize());
size = add_size(size, BackendStatusShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
@@ -123,7 +118,6 @@ CalculateShmemSize(void)
size = add_size(size, ApplyLauncherShmemSize());
size = add_size(size, BTreeShmemSize());
size = add_size(size, SyncScanShmemSize());
- size = add_size(size, AsyncShmemSize());
size = add_size(size, StatsShmemSize());
size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, InjectionPointShmemSize());
@@ -270,10 +264,6 @@ CreateOrAttachShmemStructs(void)
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
- CLOGShmemInit();
- CommitTsShmemInit();
- SUBTRANSShmemInit();
- MultiXactShmemInit();
BufferManagerShmemInit();
/*
@@ -281,11 +271,6 @@ CreateOrAttachShmemStructs(void)
*/
LockManagerShmemInit();
- /*
- * Set up predicate lock manager
- */
- PredicateLockShmemInit();
-
/*
* Set up process table
*/
@@ -313,7 +298,6 @@ CreateOrAttachShmemStructs(void)
*/
BTreeShmemInit();
SyncScanShmemInit();
- AsyncShmemInit();
StatsShmemInit();
WaitEventCustomShmemInit();
InjectionPointShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 29ff6065dda..bc186d6ea17 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -134,6 +134,7 @@
#include <unistd.h>
+#include "access/slru.h"
#include "common/int.h"
#include "fmgr.h"
#include "funcapi.h"
@@ -549,6 +550,9 @@ InitShmemIndexEntry(ShmemRequest *request)
case SHMEM_KIND_HASH:
shmem_hash_init(structPtr, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_init(structPtr, request->options);
+ break;
}
}
@@ -602,6 +606,9 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
case SHMEM_KIND_HASH:
shmem_hash_attach(index_entry->location, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_attach(index_entry->location, request->options);
+ break;
}
return true;
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index af03071a71f..9c389b23506 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -152,10 +152,6 @@
/*
* INTERFACE ROUTINES
*
- * housekeeping for setting up shared memory predicate lock structures
- * PredicateLockShmemInit(void)
- * PredicateLockShmemSize(void)
- *
* predicate lock reporting
* GetPredicateLockStatusData(void)
* PageIsPredicateLocked(Relation relation, BlockNumber blkno)
@@ -211,6 +207,8 @@
#include "storage/predicate_internals.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -322,9 +320,12 @@
/*
* The SLRU buffer area through which we access the old xids.
*/
-static SlruCtlData SerialSlruCtlData;
+static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
+static int serial_errdetail_for_io_error(const void *opaque_data);
-#define SerialSlruCtl (&SerialSlruCtlData)
+static SlruDesc SerialSlruDesc;
+
+#define SerialSlruCtl (&SerialSlruDesc)
#define SERIAL_PAGESIZE BLCKSZ
#define SERIAL_ENTRYSIZE sizeof(SerCommitSeqNo)
@@ -384,6 +385,17 @@ int max_predicate_locks_per_page; /* in guc_tables.c */
*/
static PredXactList PredXact;
+static void PredicateLockShmemRequest(void *arg);
+static void PredicateLockShmemInit(void *arg);
+static void PredicateLockShmemAttach(void *arg);
+
+const ShmemCallbacks PredicateLockShmemCallbacks = {
+ .request_fn = PredicateLockShmemRequest,
+ .init_fn = PredicateLockShmemInit,
+ .attach_fn = PredicateLockShmemAttach,
+};
+
+
/*
* This provides a pool of RWConflict data elements to use in conflict lists
* between transactions.
@@ -431,6 +443,8 @@ static bool MyXactDidWrite = false;
*/
static SERIALIZABLEXACT *SavedSerializableXact = InvalidSerializableXact;
+static int64 max_serializable_xacts;
+
/* local functions */
static SERIALIZABLEXACT *CreatePredXact(void);
@@ -442,13 +456,12 @@ static void SetPossibleUnsafeConflict(SERIALIZABLEXACT *roXact, SERIALIZABLEXACT
static void ReleaseRWConflict(RWConflict conflict);
static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
-static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
-static int serial_errdetail_for_io_error(const void *opaque_data);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
static uint32 predicatelock_hash(const void *key, Size keysize);
+
static void SummarizeOldestCommittedSxact(void);
static Snapshot GetSafeSnapshot(Snapshot origSnapshot);
static Snapshot GetSerializableTransactionSnapshotInt(Snapshot snapshot,
@@ -1100,71 +1113,53 @@ CheckPointPredicate(void)
/*------------------------------------------------------------------------*/
/*
- * PredicateLockShmemInit -- Initialize the predicate locking data structures.
- *
- * This is called from CreateSharedMemoryAndSemaphores(), which see for
- * more comments. In the normal postmaster case, the shared hash tables
- * are created here. Backends inherit the pointers
- * to the shared tables via fork(). In the EXEC_BACKEND case, each
- * backend re-executes this code to obtain pointers to the already existing
- * shared hash tables.
+ * PredicateLockShmemRequest -- Register the predicate locking data structures.
*/
-void
-PredicateLockShmemInit(void)
+static void
+PredicateLockShmemRequest(void *arg)
{
- HASHCTL info;
int64 max_predicate_lock_targets;
int64 max_predicate_locks;
- int64 max_serializable_xacts;
int64 max_rw_conflicts;
- Size requestSize;
- bool found;
-
-#ifndef EXEC_BACKEND
- Assert(!IsUnderPostmaster);
-#endif
/*
- * Compute size of predicate lock target hashtable. Note these
- * calculations must agree with PredicateLockShmemSize!
+ * Compute the maximum number of predicate lock targets. The hash tables
+ * and structs requested below are allocated by the shmem machinery; the
+ * remaining initialization happens in PredicateLockShmemInit() and
+ * PredicateLockShmemAttach().
*/
max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
/*
- * Allocate hash table for PREDICATELOCKTARGET structs. This stores
+ * Register hash table for PREDICATELOCKTARGET structs. This stores
* per-predicate-lock-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTARGETTAG);
- info.entrysize = sizeof(PREDICATELOCKTARGET);
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
-
- PredicateLockTargetHash = ShmemInitHash("PREDICATELOCKTARGET hash",
- max_predicate_lock_targets,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_PARTITION | HASH_FIXED_SIZE);
-
- /* Pre-calculate the hash and partition lock of the scratch entry */
- ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
- ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
+ ShmemRequestHash(.name = "PREDICATELOCKTARGET hash",
+ .nelems = max_predicate_lock_targets,
+ .ptr = &PredicateLockTargetHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTARGETTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCKTARGET),
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Allocate hash table for PREDICATELOCK structs. This stores per
* xact-lock-of-a-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTAG);
- info.entrysize = sizeof(PREDICATELOCK);
- info.hash = predicatelock_hash;
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
/* Assume an average of 2 xacts per target */
max_predicate_locks = max_predicate_lock_targets * 2;
- PredicateLockHash = ShmemInitHash("PREDICATELOCK hash",
- max_predicate_locks,
- &info,
- HASH_ELEM | HASH_FUNCTION |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ ShmemRequestHash(.name = "PREDICATELOCK hash",
+ .nelems = max_predicate_locks,
+ .ptr = &PredicateLockHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCK),
+ .hash_info.hash = predicatelock_hash,
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Compute size for serializable transaction hashtable. Note these
@@ -1177,29 +1172,27 @@ PredicateLockShmemInit(void)
max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
/*
- * Allocate a list to hold information on transactions participating in
+ * Register a list to hold information on transactions participating in
* predicate locking.
*/
- requestSize = add_size(PredXactListDataSize,
- (mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT))));
- PredXact = ShmemInitStruct("PredXactList",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(.name = "PredXactList",
+ .size = add_size(PredXactListDataSize,
+ (mul_size((Size) max_serializable_xacts,
+ sizeof(SERIALIZABLEXACT)))),
+ .ptr = (void **) &PredXact,
+ );
/*
- * Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
+ * Register hash table for SERIALIZABLEXID structs. This stores per-xid
* information for serializable transactions which have accessed data.
*/
- info.keysize = sizeof(SERIALIZABLEXIDTAG);
- info.entrysize = sizeof(SERIALIZABLEXID);
-
- SerializableXidHash = ShmemInitHash("SERIALIZABLEXID hash",
- max_serializable_xacts,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_FIXED_SIZE);
+ ShmemRequestHash(.name = "SERIALIZABLEXID hash",
+ .nelems = max_serializable_xacts,
+ .ptr = &SerializableXidHash,
+ .hash_info.keysize = sizeof(SERIALIZABLEXIDTAG),
+ .hash_info.entrysize = sizeof(SERIALIZABLEXID),
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_FIXED_SIZE,
+ );
/*
* Allocate space for tracking rw-conflicts in lists attached to the
@@ -1214,58 +1207,50 @@ PredicateLockShmemInit(void)
*/
max_rw_conflicts = max_serializable_xacts * 5;
- requestSize = RWConflictPoolHeaderDataSize +
- mul_size((Size) max_rw_conflicts,
- RWConflictDataSize);
+ ShmemRequestStruct(.name = "RWConflictPool",
+ .size = RWConflictPoolHeaderDataSize + mul_size((Size) max_rw_conflicts,
+ RWConflictDataSize),
+ .ptr = (void **) &RWConflictPool,
+ );
- RWConflictPool = ShmemInitStruct("RWConflictPool",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
-
- /*
- * Create or attach to the header for the list of finished serializable
- * transactions.
- */
- FinishedSerializableTransactions = (dlist_head *)
- ShmemInitStruct("FinishedSerializableTransactions",
- sizeof(dlist_head),
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(.name = "FinishedSerializableTransactions",
+ .size = sizeof(dlist_head),
+ .ptr = (void **) &FinishedSerializableTransactions,
+ );
/*
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(.desc = &SerialSlruDesc,
+ .name = "serializable",
+ .Dir = "pg_serial",
+ .long_segment_names = false,
+
+ .nslots = serializable_buffers,
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SerialPagePrecedesLogically,
+ .errdetail_for_io_error = serial_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SERIAL_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SERIAL_SLRU,
+ );
#ifdef USE_ASSERT_CHECKING
SerialPagePrecedesLogicallyUnitTests();
#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(.name = "SerialControlData",
+ .size = sizeof(SerialControlData),
+ .ptr = (void **) &serialControl,
+ );
+}
- /*
- * If we just attached to existing shared memory (EXEC_BACKEND), we're all
- * done. Otherwise, during postmaster startup proceed to initialize the
- * shared memory.
- */
- if (IsUnderPostmaster)
- {
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
- return;
- }
+static void
+PredicateLockShmemInit(void *arg)
+{
+ int max_rw_conflicts;
+ bool found;
/*
* Reserve a dummy entry in the hash table; we use it to make sure there's
@@ -1277,7 +1262,6 @@ PredicateLockShmemInit(void)
HASH_ENTER, &found);
Assert(!found);
- /* Initialize PredXact list */
dlist_init(&PredXact->availableList);
dlist_init(&PredXact->activeList);
PredXact->SxactGlobalXmin = InvalidTransactionId;
@@ -1319,6 +1303,9 @@ PredicateLockShmemInit(void)
dlist_init(&RWConflictPool->availableList);
RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
RWConflictPoolHeaderDataSize);
+
+ max_rw_conflicts = max_serializable_xacts * 5;
+
/* Add all elements to available list, clean. */
for (int i = 0; i < max_rw_conflicts; i++)
{
@@ -1335,57 +1322,28 @@ PredicateLockShmemInit(void)
serialControl->headXid = InvalidTransactionId;
serialControl->tailXid = InvalidTransactionId;
LWLockRelease(SerialControlLock);
-}
-
-/*
- * Estimate shared-memory space used for predicate lock table
- */
-Size
-PredicateLockShmemSize(void)
-{
- Size size = 0;
- int64 max_predicate_lock_targets;
- int64 max_predicate_locks;
- int64 max_serializable_xacts;
- int64 max_rw_conflicts;
-
- /* predicate lock target hash table */
- max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
- size = add_size(size, hash_estimate_size(max_predicate_lock_targets,
- sizeof(PREDICATELOCKTARGET)));
-
- /* predicate lock hash table */
- max_predicate_locks = max_predicate_lock_targets * 2;
- size = add_size(size, hash_estimate_size(max_predicate_locks,
- sizeof(PREDICATELOCK)));
-
- /* transaction list */
- max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
- size = add_size(size, PredXactListDataSize);
- size = add_size(size, mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT)));
- /* transaction xid table */
- size = add_size(size, hash_estimate_size(max_serializable_xacts,
- sizeof(SERIALIZABLEXID)));
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- /* rw-conflict pool */
- max_rw_conflicts = max_serializable_xacts * 5;
- size = add_size(size, RWConflictPoolHeaderDataSize);
- size = add_size(size, mul_size((Size) max_rw_conflicts,
- RWConflictDataSize));
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
- /* Head for list of finished serializable transactions. */
- size = add_size(size, sizeof(dlist_head));
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
+}
- /* Shared memory structures for SLRU tracking of old committed xids. */
- size = add_size(size, sizeof(SerialControlData));
- size = add_size(size, SimpleLruShmemSize(serializable_buffers, 0));
+static void
+PredicateLockShmemAttach(void *arg)
+{
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- return size;
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
}
-
/*
* Compute the hash code associated with a PREDICATELOCKTAG.
*
diff --git a/src/backend/utils/activity/pgstat_slru.c b/src/backend/utils/activity/pgstat_slru.c
index 2190f388eae..f4dfe8697d7 100644
--- a/src/backend/utils/activity/pgstat_slru.c
+++ b/src/backend/utils/activity/pgstat_slru.c
@@ -119,6 +119,7 @@ pgstat_get_slru_index(const char *name)
{
int i;
+ Assert(name);
for (i = 0; i < SLRU_NUM_ELEMENTS; i++)
{
if (strcmp(slru_names[i], name) == 0)
diff --git a/src/include/access/clog.h b/src/include/access/clog.h
index a1cfed5f43c..7894998c763 100644
--- a/src/include/access/clog.h
+++ b/src/include/access/clog.h
@@ -40,8 +40,6 @@ extern void TransactionIdSetTreeStatus(TransactionId xid, int nsubxids,
TransactionId *subxids, XidStatus status, XLogRecPtr lsn);
extern XidStatus TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn);
-extern Size CLOGShmemSize(void);
-extern void CLOGShmemInit(void);
extern void BootStrapCLOG(void);
extern void StartupCLOG(void);
extern void TrimCLOG(void);
diff --git a/src/include/access/commit_ts.h b/src/include/access/commit_ts.h
index 49ee21cd5d2..825ccda90ed 100644
--- a/src/include/access/commit_ts.h
+++ b/src/include/access/commit_ts.h
@@ -27,8 +27,6 @@ extern bool TransactionIdGetCommitTsData(TransactionId xid,
extern TransactionId GetLatestCommitTsData(TimestampTz *ts,
ReplOriginId *nodeid);
-extern Size CommitTsShmemSize(void);
-extern void CommitTsShmemInit(void);
extern void BootStrapCommitTs(void);
extern void StartupCommitTs(void);
extern void CommitTsParameterChange(bool newvalue, bool oldvalue);
diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h
index 2ae8b571dcc..6be5299ab68 100644
--- a/src/include/access/multixact.h
+++ b/src/include/access/multixact.h
@@ -121,8 +121,6 @@ extern void AtEOXact_MultiXact(void);
extern void AtPrepare_MultiXact(void);
extern void PostPrepare_MultiXact(FullTransactionId fxid);
-extern Size MultiXactShmemSize(void);
-extern void MultiXactShmemInit(void);
extern void BootStrapMultiXact(void);
extern void StartupMultiXact(void);
extern void TrimMultiXact(void);
diff --git a/src/include/access/slru.h b/src/include/access/slru.h
index f966d0d9fe7..36a7514d7a0 100644
--- a/src/include/access/slru.h
+++ b/src/include/access/slru.h
@@ -16,6 +16,7 @@
#include "access/transam.h"
#include "access/xlogdefs.h"
#include "storage/lwlock.h"
+#include "storage/shmem.h"
#include "storage/sync.h"
/*
@@ -106,23 +107,28 @@ typedef struct SlruSharedData
typedef SlruSharedData *SlruShared;
-/*
- * SlruCtlData is an unshared structure that points to the active information
- * in shared memory.
- */
-typedef struct SlruCtlData
+typedef struct SlruDesc SlruDesc;
+
+typedef struct SlruOpts
{
- SlruShared shared;
+ ShmemStructOpts base;
- /* Number of banks in this SLRU. */
- uint16 nbanks;
+ /*
+ * name of SLRU. (This is user-visible, pick with care!)
+ */
+ const char *name;
/*
- * If true, use long segment file names. Otherwise, use short file names.
- *
- * For details about the file name format, see SlruFileName().
+ * Pointer to a backend-private handle for the SLRU. It is initialized
+ * when the SLRU is initialized or attached to.
*/
- bool long_segment_names;
+ SlruDesc *desc;
+
+ /* number of page slots to use. */
+ int nslots;
+
+ /* number of LSN groups per page (set to zero if not relevant). */
+ int nlsns;
/*
* Which sync handler function to use when handing sync requests over to
@@ -130,6 +136,19 @@ typedef struct SlruCtlData
*/
SyncRequestHandler sync_handler;
+ /*
+ * PGDATA-relative subdirectory that will contain the files.
+ */
+ const char *Dir;
+
+ /*
+ * If true, use long segment file names. Otherwise, use short file names.
+ *
+ * For details about the file name format, see SlruFileName().
+ */
+ bool long_segment_names;
+
+
/*
* Decide whether a page is "older" for truncation and as a hint for
* evicting pages in LRU order. Return true if every entry of the first
@@ -153,13 +172,26 @@ typedef struct SlruCtlData
int (*errdetail_for_io_error) (const void *opaque_data);
/*
- * Dir is set during SimpleLruInit and does not change thereafter. Since
- * it's always the same, it doesn't need to be in shared memory.
+ * Tranche IDs to use for the SLRU's per-buffer and per-bank LWLocks. If
+ * these are left as zeros, new tranches will be assigned dynamically.
*/
- char Dir[64];
-} SlruCtlData;
+ int buffer_tranche_id;
+ int bank_tranche_id;
+} SlruOpts;
-typedef SlruCtlData *SlruCtl;
+/*
+ * SlruDesc is an unshared structure that points to the active information
+ * in shared memory.
+ */
+typedef struct SlruDesc
+{
+ SlruOpts options;
+
+ SlruShared shared;
+
+ /* Number of banks in this SLRU. */
+ uint16 nbanks;
+} SlruDesc;
/*
* Get the SLRU bank lock for given SlruCtl and the pageno.
@@ -168,48 +200,52 @@ typedef SlruCtlData *SlruCtl;
* respective bank.
*/
static inline LWLock *
-SimpleLruGetBankLock(SlruCtl ctl, int64 pageno)
+SimpleLruGetBankLock(SlruDesc *ctl, int64 pageno)
{
int bankno;
+ Assert(ctl->nbanks != 0);
bankno = pageno % ctl->nbanks;
return &(ctl->shared->bank_locks[bankno].lock);
}
-extern Size SimpleLruShmemSize(int nslots, int nlsns);
+extern void SimpleLruRequestWithOpts(const SlruOpts *options);
+
+#define SimpleLruRequest(...) \
+ SimpleLruRequestWithOpts(&(SlruOpts){__VA_ARGS__})
+
extern int SimpleLruAutotuneBuffers(int divisor, int max);
-extern void SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id,
- int bank_tranche_id, SyncRequestHandler sync_handler,
- bool long_segment_names);
-extern int SimpleLruZeroPage(SlruCtl ctl, int64 pageno);
-extern void SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno);
-extern int SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+extern int SimpleLruZeroPage(SlruDesc *ctl, int64 pageno);
+extern void SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno);
+extern int SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data);
-extern int SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno,
+extern int SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-extern void SimpleLruWritePage(SlruCtl ctl, int slotno);
-extern void SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied);
+extern void SimpleLruWritePage(SlruDesc *ctl, int slotno);
+extern void SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied);
#ifdef USE_ASSERT_CHECKING
-extern void SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page);
+extern void SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page);
#else
#define SlruPagePrecedesUnitTests(ctl, per_page) do {} while (0)
#endif
-extern void SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage);
-extern bool SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno);
+extern void SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage);
+extern bool SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno);
-typedef bool (*SlruScanCallback) (SlruCtl ctl, char *filename, int64 segpage,
+typedef bool (*SlruScanCallback) (SlruDesc *ctl, char *filename, int64 segpage,
void *data);
-extern bool SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data);
-extern void SlruDeleteSegment(SlruCtl ctl, int64 segno);
+extern bool SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data);
+extern void SlruDeleteSegment(SlruDesc *ctl, int64 segno);
-extern int SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path);
+extern int SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path);
/* SlruScanDirectory public callbacks */
-extern bool SlruScanDirCbReportPresence(SlruCtl ctl, char *filename,
+extern bool SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-extern bool SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage,
+extern bool SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage,
void *data);
extern bool check_slru_buffers(const char *name, int *newval);
+extern void shmem_slru_init(void *location, ShmemStructOpts *options);
+extern void shmem_slru_attach(void *location, ShmemStructOpts *options);
+
#endif /* SLRU_H */
diff --git a/src/include/access/subtrans.h b/src/include/access/subtrans.h
index 11b7355dbdf..d986cd9e802 100644
--- a/src/include/access/subtrans.h
+++ b/src/include/access/subtrans.h
@@ -15,8 +15,6 @@ extern void SubTransSetParent(TransactionId xid, TransactionId parent);
extern TransactionId SubTransGetParent(TransactionId xid);
extern TransactionId SubTransGetTopmostTransaction(TransactionId xid);
-extern Size SUBTRANSShmemSize(void);
-extern void SUBTRANSShmemInit(void);
extern void BootStrapSUBTRANS(void);
extern void StartupSUBTRANS(TransactionId oldestActiveXID);
extern void CheckPointSUBTRANS(void);
diff --git a/src/include/commands/async.h b/src/include/commands/async.h
index 3baae7cb8dc..202e4aa5e74 100644
--- a/src/include/commands/async.h
+++ b/src/include/commands/async.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT bool Trace_notify;
extern PGDLLIMPORT int max_notify_queue_pages;
extern PGDLLIMPORT volatile sig_atomic_t notifyInterruptPending;
-extern Size AsyncShmemSize(void);
-extern void AsyncShmemInit(void);
-
extern void NotifyMyFrontEnd(const char *channel,
const char *payload,
int32 srcPid);
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index a5ac55b8f7e..443bffb58fd 100644
--- a/src/include/storage/predicate.h
+++ b/src/include/storage/predicate.h
@@ -41,11 +41,6 @@ typedef void *SerializableXactHandle;
/*
* function prototypes
*/
-
-/* housekeeping for shared memory predicate lock structures */
-extern void PredicateLockShmemInit(void);
-extern Size PredicateLockShmemSize(void);
-
extern void CheckPointPredicate(void);
/* predicate lock reporting */
diff --git a/src/include/storage/shmem_internal.h b/src/include/storage/shmem_internal.h
index fe12bf33439..7b259d33ccf 100644
--- a/src/include/storage/shmem_internal.h
+++ b/src/include/storage/shmem_internal.h
@@ -21,6 +21,7 @@ typedef enum
{
SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
SHMEM_KIND_HASH, /* a hash table */
+ SHMEM_KIND_SLRU, /* SLRU buffers and control structures */
} ShmemRequestKind;
/* shmem.c */
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index d62c29f1361..c199f18a27a 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -32,6 +32,13 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+
+/* predicate lock manager */
+PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
@@ -43,3 +50,6 @@ PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
/* interprocess signaling mechanisms */
PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+
+/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
diff --git a/src/test/modules/test_slru/test_slru.c b/src/test/modules/test_slru/test_slru.c
index e4bd2af0bf5..40efffdbf62 100644
--- a/src/test/modules/test_slru/test_slru.c
+++ b/src/test/modules/test_slru/test_slru.c
@@ -40,14 +40,22 @@ PG_FUNCTION_INFO_V1(test_slru_delete_all);
/* Number of SLRU page slots */
#define NUM_TEST_BUFFERS 16
-static SlruCtlData TestSlruCtlData;
-#define TestSlruCtl (&TestSlruCtlData)
+static void test_slru_shmem_request(void *arg);
+static bool test_slru_page_precedes_logically(int64 page1, int64 page2);
+static int test_slru_errdetail_for_io_error(const void *opaque_data);
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static const char *TestSlruDir = "pg_test_slru";
+
+static SlruDesc TestSlruDesc;
+
+static const ShmemCallbacks test_slru_shmem_callbacks = {
+ .request_fn = test_slru_shmem_request
+};
+
+#define TestSlruCtl (&TestSlruDesc)
static bool
-test_slru_scan_cb(SlruCtl ctl, char *filename, int64 segpage, void *data)
+test_slru_scan_cb(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
elog(NOTICE, "Calling test_slru_scan_cb()");
return SlruScanDirCbDeleteAll(ctl, filename, segpage, data);
@@ -190,20 +198,6 @@ test_slru_delete_all(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
-/*
- * Module load callbacks and initialization.
- */
-
-static void
-test_slru_shmem_request(void)
-{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- /* reserve shared memory for the test SLRU */
- RequestAddinShmemSpace(SimpleLruShmemSize(NUM_TEST_BUFFERS, 0));
-}
-
static bool
test_slru_page_precedes_logically(int64 page1, int64 page2)
{
@@ -218,60 +212,46 @@ test_slru_errdetail_for_io_error(const void *opaque_data)
return errdetail("Could not access test_slru entry %u.", xid);
}
-static void
-test_slru_shmem_startup(void)
+void
+_PG_init(void)
{
- /*
- * Short segments names are well tested elsewhere so in this test we are
- * focusing on long names.
- */
- const bool long_segment_names = true;
- const char slru_dir_name[] = "pg_test_slru";
- int test_tranche_id = -1;
- int test_buffer_tranche_id = -1;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
+ if (!process_shared_preload_libraries_in_progress)
+ ereport(ERROR,
+ (errmsg("cannot load \"%s\" after startup", "test_slru"),
+ errdetail("\"%s\" must be loaded with \"shared_preload_libraries\".",
+ "test_slru")));
/*
* Create the SLRU directory if it does not exist yet, from the root of
* the data directory.
*/
- (void) MakePGDirectory(slru_dir_name);
+ (void) MakePGDirectory(TestSlruDir);
- /*
- * Initialize the SLRU facility. In EXEC_BACKEND builds, the
- * shmem_startup_hook is called in the postmaster and in each backend, but
- * we only need to generate the LWLock tranches once. Note that these
- * tranche ID variables are not used by SimpleLruInit() when
- * IsUnderPostmaster is true.
- */
- if (!IsUnderPostmaster)
- {
- test_tranche_id = LWLockNewTrancheId("test_slru_tranche");
- test_buffer_tranche_id = LWLockNewTrancheId("test_buffer_tranche");
- }
-
- TestSlruCtl->PagePrecedes = test_slru_page_precedes_logically;
- TestSlruCtl->errdetail_for_io_error = test_slru_errdetail_for_io_error;
- SimpleLruInit(TestSlruCtl, "TestSLRU",
- NUM_TEST_BUFFERS, 0, slru_dir_name,
- test_buffer_tranche_id, test_tranche_id, SYNC_HANDLER_NONE,
- long_segment_names);
+ RegisterShmemCallbacks(&test_slru_shmem_callbacks);
}
-void
-_PG_init(void)
+static void
+test_slru_shmem_request(void *arg)
{
- if (!process_shared_preload_libraries_in_progress)
- ereport(ERROR,
- (errmsg("cannot load \"%s\" after startup", "test_slru"),
- errdetail("\"%s\" must be loaded with \"shared_preload_libraries\".",
- "test_slru")));
+ SimpleLruRequest(.desc = &TestSlruDesc,
+ .name = "TestSLRU",
+ .Dir = TestSlruDir,
+
+ /*
+ * Short segment names are well tested elsewhere so in this test we are
+ * focusing on long names.
+ */
+ .long_segment_names = true,
+
+ .nslots = NUM_TEST_BUFFERS,
+ .nlsns = 0,
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_slru_shmem_request;
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = test_slru_page_precedes_logically,
+ .errdetail_for_io_error = test_slru_errdetail_for_io_error,
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_slru_shmem_startup;
+ /* let slru.c assign these */
+ .buffer_tranche_id = 0,
+ .bank_tranche_id = 0,
+ );
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 63c0b3a9465..3c35097361d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2901,9 +2901,9 @@ SlotInvalidationCauseMap
SlotNumber
SlotSyncCtxStruct
SlotSyncSkipReason
-SlruCtl
-SlruCtlData
+SlruDesc
SlruErrorCause
+SlruOpts
SlruPageStatus
SlruScanCallback
SlruSegState
--
2.47.3
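
The SimpleLruRequest() and ShmemRequestStruct() calls in the patch above pass their options as designated initializers, and fields that are not named are left at zero. One way to get this call syntax in C99 is a variadic macro that wraps its arguments in a compound literal. The sketch below only illustrates that technique. DemoOpts, DemoRequest and demo_request_with_opts are hypothetical names, and the macros in the patch series may be defined differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical options struct; the field names are illustrative only. */
typedef struct DemoOpts
{
	const char *name;
	size_t		size;
	size_t		alignment;	/* 0 means "use the default" */
	void	  **ptr;		/* where the area's address would be stored */
} DemoOpts;

#define DEMO_MAX_REQUESTS 8

static DemoOpts demo_requests[DEMO_MAX_REQUESTS];
static int	demo_nrequests = 0;

/* Copy one request into the table; returns its index, or -1 if full. */
static int
demo_request_with_opts(const DemoOpts *opts)
{
	assert(opts->name != NULL);
	if (demo_nrequests >= DEMO_MAX_REQUESTS)
		return -1;
	demo_requests[demo_nrequests] = *opts;
	return demo_nrequests++;
}

/*
 * The variadic arguments become the designated initializers of a compound
 * literal.  Fields that are not named are zero-initialized, and a trailing
 * comma before the closing parenthesis is accepted.
 */
#define DemoRequest(...) \
	demo_request_with_opts(&(DemoOpts){__VA_ARGS__})

static char *demo_area;

/* Example caller, written in the same style as the request functions. */
static int
demo_register(void)
{
	return DemoRequest(.name = "DemoArea",
					   .size = 128,
					   .ptr = (void **) &demo_area,
		);
}
```

Because the callee copies the struct, it does not matter that the compound literal only lives until the end of the enclosing block. Leaving a field such as .alignment unnamed is equivalent to passing zero.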
[text/x-patch] v11-0011-Convert-AIO-to-the-new-interface.patch (14.6K, 12-v11-0011-Convert-AIO-to-the-new-interface.patch)
From 7c01faa030caf2cc60b6cd81a7dd6d310b588cb5 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 12:43:16 +0200
Subject: [PATCH v11 11/14] Convert AIO to the new interface
This replaces the "shmem_size" and "shmem_init" callbacks in the IO
methods table with the same ShmemCallbacks struct that we now use in
other subsystems.
---
src/backend/storage/aio/aio_init.c | 112 +++++++++++++---------
src/backend/storage/aio/method_io_uring.c | 39 ++++----
src/backend/storage/aio/method_worker.c | 84 +++++++++-------
src/backend/storage/ipc/ipci.c | 2 -
src/include/storage/aio_internal.h | 16 +---
src/include/storage/aio_subsys.h | 4 -
src/include/storage/subsystemlist.h | 3 +
7 files changed, 143 insertions(+), 117 deletions(-)
diff --git a/src/backend/storage/aio/aio_init.c b/src/backend/storage/aio/aio_init.c
index d3c68d8b04c..18bb4235044 100644
--- a/src/backend/storage/aio/aio_init.c
+++ b/src/backend/storage/aio/aio_init.c
@@ -23,16 +23,24 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
+static void AioShmemRequest(void *arg);
+static void AioShmemInit(void *arg);
+static void AioShmemAttach(void *arg);
-static Size
-AioCtlShmemSize(void)
-{
- /* pgaio_ctl itself */
- return sizeof(PgAioCtl);
-}
+const ShmemCallbacks AioShmemCallbacks = {
+ .request_fn = AioShmemRequest,
+ .init_fn = AioShmemInit,
+ .attach_fn = AioShmemAttach,
+};
+
+static PgAioBackend *AioBackendShmemPtr;
+static PgAioHandle *AioHandleShmemPtr;
+static struct iovec *AioHandleIOVShmemPtr;
+static uint64 *AioHandleDataShmemPtr;
static uint32
AioProcs(void)
@@ -109,12 +117,15 @@ AioChooseMaxConcurrency(void)
return Min(max_proportional_pins, 64);
}
-Size
-AioShmemSize(void)
+/*
+ * Register shared memory area for AIO subsystem.
+ */
+static void
+AioShmemRequest(void *arg)
{
- Size sz = 0;
-
/*
+ * Resolve io_max_concurrency if not already done
+ *
* We prefer to report this value's source as PGC_S_DYNAMIC_DEFAULT.
* However, if the DBA explicitly set io_max_concurrency = -1 in the
* config file, then PGC_S_DYNAMIC_DEFAULT will fail to override that and
@@ -132,48 +143,52 @@ AioShmemSize(void)
PGC_S_OVERRIDE);
}
- sz = add_size(sz, AioCtlShmemSize());
- sz = add_size(sz, AioBackendShmemSize());
- sz = add_size(sz, AioHandleShmemSize());
- sz = add_size(sz, AioHandleIOVShmemSize());
- sz = add_size(sz, AioHandleDataShmemSize());
-
- /* Reserve space for method specific resources. */
- if (pgaio_method_ops->shmem_size)
- sz = add_size(sz, pgaio_method_ops->shmem_size());
-
- return sz;
+ ShmemRequestStruct(.name = "AioCtl",
+ .size = sizeof(PgAioCtl),
+ .ptr = (void **) &pgaio_ctl,
+ );
+
+ ShmemRequestStruct(.name = "AioBackend",
+ .size = AioBackendShmemSize(),
+ .ptr = (void **) &AioBackendShmemPtr,
+ );
+
+ ShmemRequestStruct(.name = "AioHandle",
+ .size = AioHandleShmemSize(),
+ .ptr = (void **) &AioHandleShmemPtr,
+ );
+
+ ShmemRequestStruct(.name = "AioHandleIOV",
+ .size = AioHandleIOVShmemSize(),
+ .ptr = (void **) &AioHandleIOVShmemPtr,
+ );
+
+ ShmemRequestStruct(.name = "AioHandleData",
+ .size = AioHandleDataShmemSize(),
+ .ptr = (void **) &AioHandleDataShmemPtr,
+ );
+
+ if (pgaio_method_ops->shmem_callbacks.request_fn)
+ pgaio_method_ops->shmem_callbacks.request_fn(pgaio_method_ops->shmem_callbacks.request_fn_arg);
}
-void
-AioShmemInit(void)
+/*
+ * Initialize AIO shared memory during postmaster startup.
+ */
+static void
+AioShmemInit(void *arg)
{
- bool found;
uint32 io_handle_off = 0;
uint32 iovec_off = 0;
uint32 per_backend_iovecs = io_max_concurrency * io_max_combine_limit;
- pgaio_ctl = (PgAioCtl *)
- ShmemInitStruct("AioCtl", AioCtlShmemSize(), &found);
-
- if (found)
- goto out;
-
- memset(pgaio_ctl, 0, AioCtlShmemSize());
-
pgaio_ctl->io_handle_count = AioProcs() * io_max_concurrency;
pgaio_ctl->iovec_count = AioProcs() * per_backend_iovecs;
- pgaio_ctl->backend_state = (PgAioBackend *)
- ShmemInitStruct("AioBackend", AioBackendShmemSize(), &found);
-
- pgaio_ctl->io_handles = (PgAioHandle *)
- ShmemInitStruct("AioHandle", AioHandleShmemSize(), &found);
-
- pgaio_ctl->iovecs = (struct iovec *)
- ShmemInitStruct("AioHandleIOV", AioHandleIOVShmemSize(), &found);
- pgaio_ctl->handle_data = (uint64 *)
- ShmemInitStruct("AioHandleData", AioHandleDataShmemSize(), &found);
+ pgaio_ctl->backend_state = AioBackendShmemPtr;
+ pgaio_ctl->io_handles = AioHandleShmemPtr;
+ pgaio_ctl->iovecs = AioHandleIOVShmemPtr;
+ pgaio_ctl->handle_data = AioHandleDataShmemPtr;
for (int procno = 0; procno < AioProcs(); procno++)
{
@@ -208,10 +223,15 @@ AioShmemInit(void)
}
}
-out:
- /* Initialize IO method specific resources. */
- if (pgaio_method_ops->shmem_init)
- pgaio_method_ops->shmem_init(!found);
+ if (pgaio_method_ops->shmem_callbacks.init_fn)
+ pgaio_method_ops->shmem_callbacks.init_fn(pgaio_method_ops->shmem_callbacks.init_fn_arg);
+}
+
+static void
+AioShmemAttach(void *arg)
+{
+ if (pgaio_method_ops->shmem_callbacks.attach_fn)
+ pgaio_method_ops->shmem_callbacks.attach_fn(pgaio_method_ops->shmem_callbacks.attach_fn_arg);
}
void
diff --git a/src/backend/storage/aio/method_io_uring.c b/src/backend/storage/aio/method_io_uring.c
index 39984df31b4..f8d0b221600 100644
--- a/src/backend/storage/aio/method_io_uring.c
+++ b/src/backend/storage/aio/method_io_uring.c
@@ -49,8 +49,8 @@
/* Entry points for IoMethodOps. */
-static size_t pgaio_uring_shmem_size(void);
-static void pgaio_uring_shmem_init(bool first_time);
+static void pgaio_uring_shmem_request(void *arg);
+static void pgaio_uring_shmem_init(void *arg);
static void pgaio_uring_init_backend(void);
static int pgaio_uring_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
static void pgaio_uring_wait_one(PgAioHandle *ioh, uint64 ref_generation);
@@ -59,7 +59,6 @@ static void pgaio_uring_check_one(PgAioHandle *ioh, uint64 ref_generation);
/* helper functions */
static void pgaio_uring_sq_from_io(PgAioHandle *ioh, struct io_uring_sqe *sqe);
-
const IoMethodOps pgaio_uring_ops = {
/*
* While io_uring mostly is OK with FDs getting closed while the IO is in
@@ -70,8 +69,8 @@ const IoMethodOps pgaio_uring_ops = {
*/
.wait_on_fd_before_close = true,
- .shmem_size = pgaio_uring_shmem_size,
- .shmem_init = pgaio_uring_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_uring_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_uring_shmem_init,
.init_backend = pgaio_uring_init_backend,
.submit = pgaio_uring_submit,
@@ -267,23 +266,31 @@ pgaio_uring_shmem_size(void)
{
size_t sz;
+ sz = pgaio_uring_context_shmem_size();
+ sz = add_size(sz, pgaio_uring_ring_shmem_size());
+
+ return sz;
+}
+
+static void
+pgaio_uring_shmem_request(void *arg)
+{
/*
* Kernel and liburing support for various features influences how much
* shmem we need, perform the necessary checks.
*/
pgaio_uring_check_capabilities();
- sz = pgaio_uring_context_shmem_size();
- sz = add_size(sz, pgaio_uring_ring_shmem_size());
-
- return sz;
+ ShmemRequestStruct(.name = "AioUringContext",
+ .size = pgaio_uring_shmem_size(),
+ .ptr = (void **) &pgaio_uring_contexts,
+ );
}
static void
-pgaio_uring_shmem_init(bool first_time)
+pgaio_uring_shmem_init(void *arg)
{
int TotalProcs = pgaio_uring_procs();
- bool found;
char *shmem;
size_t ring_mem_remain = 0;
char *ring_mem_next = 0;
@@ -291,13 +298,11 @@ pgaio_uring_shmem_init(bool first_time)
/*
* We allocate memory for all PgAioUringContext instances and, if
* supported, the memory required for each of the io_uring instances, in
- * one ShmemInitStruct().
+ * one combined allocation.
+ *
+ * pgaio_uring_contexts is already set to the base of the allocation.
*/
- shmem = ShmemInitStruct("AioUringContext", pgaio_uring_shmem_size(), &found);
- if (found)
- return;
-
- pgaio_uring_contexts = (PgAioUringContext *) shmem;
+ shmem = (char *) pgaio_uring_contexts;
shmem += pgaio_uring_context_shmem_size();
/* if supported, handle memory alignment / sizing for io_uring memory */
diff --git a/src/backend/storage/aio/method_worker.c b/src/backend/storage/aio/method_worker.c
index efe38e9f113..df94a434856 100644
--- a/src/backend/storage/aio/method_worker.c
+++ b/src/backend/storage/aio/method_worker.c
@@ -41,6 +41,7 @@
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
#include "tcop/tcopprot.h"
#include "utils/injection_point.h"
#include "utils/memdebug.h"
@@ -73,16 +74,20 @@ typedef struct PgAioWorkerControl
} PgAioWorkerControl;
-static size_t pgaio_worker_shmem_size(void);
-static void pgaio_worker_shmem_init(bool first_time);
+static void pgaio_worker_shmem_request(void *arg);
+static void pgaio_worker_shmem_init(void *arg);
+static void pgaio_worker_shmem_attach(void *arg);
+
+static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static bool pgaio_worker_needs_synchronous_execution(PgAioHandle *ioh);
static int pgaio_worker_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
const IoMethodOps pgaio_worker_ops = {
- .shmem_size = pgaio_worker_shmem_size,
- .shmem_init = pgaio_worker_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_worker_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_worker_shmem_init,
+ .shmem_callbacks.attach_fn = pgaio_worker_shmem_attach,
.needs_synchronous_execution = pgaio_worker_needs_synchronous_execution,
.submit = pgaio_worker_submit,
@@ -95,7 +100,6 @@ int io_workers = 3;
static int io_worker_queue_size = 64;
static int MyIoWorkerId;
-static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static PgAioWorkerControl *io_worker_control;
@@ -116,50 +120,60 @@ pgaio_worker_control_shmem_size(void)
sizeof(PgAioWorkerSlot) * MAX_IO_WORKERS;
}
-static size_t
-pgaio_worker_shmem_size(void)
+/*
+ * Set secondary AIO worker pointer from the combined allocation.
+ */
+static void
+pgaio_worker_set_secondary_ptr(void)
{
- size_t sz;
int queue_size;
+ Size queue_sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = add_size(sz, pgaio_worker_control_shmem_size());
-
- return sz;
+ io_worker_control = (PgAioWorkerControl *)
+ ((char *) io_worker_submission_queue + MAXALIGN(queue_sz));
}
static void
-pgaio_worker_shmem_init(bool first_time)
+pgaio_worker_shmem_init(void *arg)
{
- bool found;
int queue_size;
- io_worker_submission_queue =
- ShmemInitStruct("AioWorkerSubmissionQueue",
- pgaio_worker_queue_shmem_size(&queue_size),
- &found);
- if (!found)
- {
- io_worker_submission_queue->size = queue_size;
- io_worker_submission_queue->head = 0;
- io_worker_submission_queue->tail = 0;
- }
+ pgaio_worker_queue_shmem_size(&queue_size);
+ io_worker_submission_queue->size = queue_size;
+ io_worker_submission_queue->head = 0;
+ io_worker_submission_queue->tail = 0;
- io_worker_control =
- ShmemInitStruct("AioWorkerControl",
- pgaio_worker_control_shmem_size(),
- &found);
- if (!found)
+ pgaio_worker_set_secondary_ptr();
+
+ io_worker_control->idle_worker_mask = 0;
+ for (int i = 0; i < MAX_IO_WORKERS; ++i)
{
- io_worker_control->idle_worker_mask = 0;
- for (int i = 0; i < MAX_IO_WORKERS; ++i)
- {
- io_worker_control->workers[i].latch = NULL;
- io_worker_control->workers[i].in_use = false;
- }
+ io_worker_control->workers[i].latch = NULL;
+ io_worker_control->workers[i].in_use = false;
}
}
+static void
+pgaio_worker_shmem_attach(void *arg)
+{
+ pgaio_worker_set_secondary_ptr();
+}
+
+static void
+pgaio_worker_shmem_request(void *arg)
+{
+ size_t size;
+ int queue_size;
+
+ size = MAXALIGN(pgaio_worker_queue_shmem_size(&queue_size)) +
+ pgaio_worker_control_shmem_size();
+
+ ShmemRequestStruct(.name = "AioWorkerSubmissionQueue",
+ .size = size,
+ .ptr = (void **) &io_worker_submission_queue,
+ );
+}
+
static int
pgaio_worker_choose_idle(void)
{
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 7a8c69de802..a510c928daa 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -122,7 +122,6 @@ CalculateShmemSize(void)
size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, InjectionPointShmemSize());
size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, AioShmemSize());
size = add_size(size, WaitLSNShmemSize());
size = add_size(size, LogicalDecodingCtlShmemSize());
size = add_size(size, DataChecksumsShmemSize());
@@ -301,7 +300,6 @@ CreateOrAttachShmemStructs(void)
StatsShmemInit();
WaitEventCustomShmemInit();
InjectionPointShmemInit();
- AioShmemInit();
WaitLSNShmemInit();
LogicalDecodingCtlShmemInit();
}
diff --git a/src/include/storage/aio_internal.h b/src/include/storage/aio_internal.h
index 33e1e2dc048..9ca4087aa7f 100644
--- a/src/include/storage/aio_internal.h
+++ b/src/include/storage/aio_internal.h
@@ -20,6 +20,8 @@
#include "port/pg_iovec.h"
#include "storage/aio.h"
#include "storage/condition_variable.h"
+#include "storage/ipc.h"
+#include "storage/shmem.h"
/*
@@ -267,20 +269,8 @@ typedef struct IoMethodOps
*/
bool wait_on_fd_before_close;
-
/* global initialization */
-
- /*
- * Amount of additional shared memory to reserve for the io_method. Called
- * just like a normal ipci.c style *Size() function. Optional.
- */
- size_t (*shmem_size) (void);
-
- /*
- * Initialize shared memory. First time is true if AIO's shared memory was
- * just initialized, false otherwise. Optional.
- */
- void (*shmem_init) (bool first_time);
+ ShmemCallbacks shmem_callbacks;
/*
* Per-backend initialization. Optional.
diff --git a/src/include/storage/aio_subsys.h b/src/include/storage/aio_subsys.h
index 276cb3e31c4..dd54869351f 100644
--- a/src/include/storage/aio_subsys.h
+++ b/src/include/storage/aio_subsys.h
@@ -20,12 +20,8 @@
/* aio_init.c */
-extern Size AioShmemSize(void);
-extern void AioShmemInit(void);
-
extern void pgaio_init_backend(void);
-
/* aio.c */
extern void pgaio_error_cleanup(void);
extern void AtEOXact_Aio(bool is_commit);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index c199f18a27a..b438794d46d 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -53,3 +53,6 @@ PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
/* other modules that need some shared memory space */
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
+
+/* AIO subsystem. This delegates to the method-specific callbacks */
+PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
--
2.47.3
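
In method_worker.c above, the submission queue and the control struct share one allocation. The framework stores only the base pointer, so both the init and the attach callback derive the second pointer from it. The following standalone sketch shows that layout under simplified assumptions. The Demo* types and functions are hypothetical, DEMO_MAXALIGN stands in for MAXALIGN with an assumed 8-byte maximum alignment, and the base address is passed in explicitly rather than being set through a .ptr option.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round LEN up to a multiple of 8; a stand-in for PostgreSQL's MAXALIGN. */
#define DEMO_MAXALIGN(LEN) (((size_t) (LEN) + 7) & ~(size_t) 7)

typedef struct DemoQueue
{
	uint32_t	size;
	uint32_t	head;
	uint32_t	tail;
	uint32_t	slots[];	/* flexible array member */
} DemoQueue;

typedef struct DemoControl
{
	uint64_t	idle_mask;
} DemoControl;

static DemoQueue *demo_queue;		/* primary pointer: base of the area */
static DemoControl *demo_control;	/* secondary pointer: derived from it */

static size_t
demo_queue_size(uint32_t nslots)
{
	return offsetof(DemoQueue, slots) + sizeof(uint32_t) * nslots;
}

/* Size of the combined area: padded queue followed by the control struct. */
static size_t
demo_total_size(uint32_t nslots)
{
	return DEMO_MAXALIGN(demo_queue_size(nslots)) + sizeof(DemoControl);
}

/* Recompute the secondary pointer from the primary one. */
static void
demo_set_secondary_ptr(uint32_t nslots)
{
	demo_control = (DemoControl *)
		((char *) demo_queue + DEMO_MAXALIGN(demo_queue_size(nslots)));
}

/* First-time setup: runs once, in the process that creates the area. */
static void
demo_init(void *area, uint32_t nslots)
{
	assert(area != NULL);
	demo_queue = area;
	demo_queue->size = nslots;
	demo_queue->head = 0;
	demo_queue->tail = 0;
	demo_set_secondary_ptr(nslots);
	demo_control->idle_mask = 0;
}

/* Attach: the contents already exist, only local pointers are set. */
static void
demo_attach(void *area, uint32_t nslots)
{
	assert(area != NULL);
	demo_queue = area;
	demo_set_secondary_ptr(nslots);
}
```

The attach path must not reinitialize the contents. It repeats only the pointer arithmetic, which is why that step is factored into its own helper.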
[text/x-patch] v11-0012-Add-option-for-aligning-shmem-allocations.patch (4.0K, 13-v11-0012-Add-option-for-aligning-shmem-allocations.patch)
From 0ef99d0958c12bffa69f5529cce82faf7215d2f3 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 23:44:15 +0200
Subject: [PATCH v11 12/14] Add option for aligning shmem allocations
The buffer blocks (in the next commit) are IO-aligned. This might come
in handy in other places too, so make it an explicit feature of
ShmemRequestStruct.
---
src/backend/storage/ipc/shmem.c | 26 ++++++++++++++++----------
src/include/storage/shmem.h | 6 ++++++
2 files changed, 22 insertions(+), 10 deletions(-)
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index bc186d6ea17..973811e545e 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -239,7 +239,7 @@ typedef struct ShmemAllocatorData
#define ShmemIndexLock (&ShmemAllocator->index_lock)
-static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void *ShmemAllocRaw(Size size, Size alignment, Size *allocated_size);
/* shared memory global variables */
@@ -400,7 +400,8 @@ ShmemGetRequestedSize(void)
{
size = add_size(size, request->options->size);
/* calculate alignment padding like ShmemAllocRaw() does */
- size = CACHELINEALIGN(size);
+ size = TYPEALIGN(Max(request->options->alignment, PG_CACHE_LINE_SIZE),
+ size);
}
return size;
@@ -525,7 +526,9 @@ InitShmemIndexEntry(ShmemRequest *request)
* We inserted the entry to the shared memory index. Allocate requested
* amount of shared memory for it, and initialize the index entry.
*/
- structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ structPtr = ShmemAllocRaw(request->options->size,
+ request->options->alignment,
+ &allocated_size);
if (structPtr == NULL)
{
/* out of memory; remove the failed ShmemIndex entry */
@@ -754,7 +757,7 @@ ShmemAlloc(Size size)
void *newSpace;
Size allocated_size;
- newSpace = ShmemAllocRaw(size, &allocated_size);
+ newSpace = ShmemAllocRaw(size, 0, &allocated_size);
if (!newSpace)
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
@@ -773,7 +776,7 @@ ShmemAllocNoError(Size size)
{
Size allocated_size;
- return ShmemAllocRaw(size, &allocated_size);
+ return ShmemAllocRaw(size, 0, &allocated_size);
}
/*
@@ -783,8 +786,9 @@ ShmemAllocNoError(Size size)
* be equal to the number requested plus any padding we choose to add.
*/
static void *
-ShmemAllocRaw(Size size, Size *allocated_size)
+ShmemAllocRaw(Size size, Size alignment, Size *allocated_size)
{
+ Size rawStart;
Size newStart;
Size newFree;
void *newSpace;
@@ -800,14 +804,15 @@ ShmemAllocRaw(Size size, Size *allocated_size)
* structures out to a power-of-two size - but without this, even that
* won't be sufficient.
*/
- size = CACHELINEALIGN(size);
- *allocated_size = size;
+ if (alignment < PG_CACHE_LINE_SIZE)
+ alignment = PG_CACHE_LINE_SIZE;
Assert(ShmemSegHdr != NULL);
SpinLockAcquire(&ShmemAllocator->shmem_lock);
- newStart = ShmemAllocator->free_offset;
+ rawStart = ShmemAllocator->free_offset;
+ newStart = TYPEALIGN(alignment, rawStart);
newFree = newStart + size;
if (newFree <= ShmemSegHdr->totalsize)
@@ -821,8 +826,9 @@ ShmemAllocRaw(Size size, Size *allocated_size)
SpinLockRelease(&ShmemAllocator->shmem_lock);
/* note this assert is okay with newSpace == NULL */
- Assert(newSpace == (void *) CACHELINEALIGN(newSpace));
+ Assert(newSpace == (void *) TYPEALIGN(alignment, newSpace));
+ *allocated_size = newFree - rawStart;
return newSpace;
}
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 147a6915f7e..91218db6d6e 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -51,6 +51,12 @@ typedef struct ShmemStructOpts
*/
ssize_t size;
+ /*
+ * Alignment of the starting address. If not set, defaults to cacheline
+ * boundary. Must be a power of two.
+ */
+ size_t alignment;
+
/*
* When the shmem area is initialized or attached to, pointer to it is
* stored in *ptr. It usually points to a global variable, used to access
--
2.47.3
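
The change to ShmemAllocRaw() above aligns the start offset instead of rounding the size up, and it reports the consumed space including the padding in front of the allocation. A minimal standalone sketch of that arithmetic follows. DemoArena, demo_alloc and the DEMO_* macros are hypothetical, a 64-byte cache line is assumed, locking is omitted, and offsets are returned instead of pointers.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CACHE_LINE_SIZE 64

/* Round LEN up to a multiple of ALIGNVAL, which must be a power of two. */
#define DEMO_TYPEALIGN(ALIGNVAL, LEN) \
	(((size_t) (LEN) + ((ALIGNVAL) - 1)) & ~((size_t) ((ALIGNVAL) - 1)))

typedef struct DemoArena
{
	size_t		free_offset;	/* first unused byte */
	size_t		total_size;		/* size of the whole segment */
} DemoArena;

/*
 * Reserve SIZE bytes starting at an offset aligned to ALIGNMENT.
 *
 * Returns 1 and fills *start and *allocated_size on success, or 0 if the
 * request does not fit.  *allocated_size includes the alignment padding
 * that was skipped in front of the allocation.
 */
static int
demo_alloc(DemoArena *arena, size_t size, size_t alignment,
		   size_t *start, size_t *allocated_size)
{
	size_t		raw_start;
	size_t		new_start;
	size_t		new_free;

	/* never align to less than a cache line; 0 selects this default */
	if (alignment < DEMO_CACHE_LINE_SIZE)
		alignment = DEMO_CACHE_LINE_SIZE;
	assert((alignment & (alignment - 1)) == 0);

	raw_start = arena->free_offset;
	new_start = DEMO_TYPEALIGN(alignment, raw_start);
	new_free = new_start + size;

	if (new_free > arena->total_size)
		return 0;

	arena->free_offset = new_free;
	*start = new_start;
	*allocated_size = new_free - raw_start;
	return 1;
}
```

With this scheme the end of an allocation is no longer padded. The next request pays for whatever padding its own alignment requires.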
[text/x-patch] v11-0013-Convert-buffer-manager-to-the-new-API.patch (15.5K, 14-v11-0013-Convert-buffer-manager-to-the-new-API.patch)
From 3c345ff27247f0ce577fdc60d7b909e51fa0eb34 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:44:02 +0300
Subject: [PATCH v11 13/14] Convert buffer manager to the new API
---
src/backend/storage/buffer/buf_init.c | 149 ++++++++++---------------
src/backend/storage/buffer/buf_table.c | 54 +++++----
src/backend/storage/buffer/freelist.c | 93 +++++----------
src/backend/storage/ipc/ipci.c | 3 -
src/include/storage/buf_internals.h | 5 -
src/include/storage/bufmgr.h | 4 -
src/include/storage/subsystemlist.h | 3 +
7 files changed, 124 insertions(+), 187 deletions(-)
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index c0c223b2e32..1407c930c56 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -18,6 +18,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proclist.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
BufferDescPadded *BufferDescriptors;
char *BufferBlocks;
@@ -25,6 +27,15 @@ ConditionVariableMinimallyPadded *BufferIOCVArray;
WritebackContext BackendWritebackContext;
CkptSortItem *CkptBufferIds;
+static void BufferManagerShmemRequest(void *arg);
+static void BufferManagerShmemInit(void *arg);
+static void BufferManagerShmemAttach(void *arg);
+
+const ShmemCallbacks BufferManagerShmemCallbacks = {
+ .request_fn = BufferManagerShmemRequest,
+ .init_fn = BufferManagerShmemInit,
+ .attach_fn = BufferManagerShmemAttach,
+};
/*
* Data Structures:
@@ -60,37 +71,31 @@ CkptSortItem *CkptBufferIds;
/*
- * Initialize shared buffer pool
- *
- * This is called once during shared-memory initialization (either in the
- * postmaster, or in a standalone backend).
+ * Register shared memory area for the buffer pool.
*/
-void
-BufferManagerShmemInit(void)
+static void
+BufferManagerShmemRequest(void *arg)
{
- bool foundBufs,
- foundDescs,
- foundIOCV,
- foundBufCkpt;
-
+ ShmemRequestStruct(.name = "Buffer Descriptors",
+ .size = NBuffers * sizeof(BufferDescPadded),
/* Align descriptors to a cacheline boundary. */
- BufferDescriptors = (BufferDescPadded *)
- ShmemInitStruct("Buffer Descriptors",
- NBuffers * sizeof(BufferDescPadded),
- &foundDescs);
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferDescriptors,
+ );
+ ShmemRequestStruct(.name = "Buffer Blocks",
+ .size = NBuffers * (Size) BLCKSZ,
/* Align buffer pool on IO page size boundary. */
- BufferBlocks = (char *)
- TYPEALIGN(PG_IO_ALIGN_SIZE,
- ShmemInitStruct("Buffer Blocks",
- NBuffers * (Size) BLCKSZ + PG_IO_ALIGN_SIZE,
- &foundBufs));
-
- /* Align condition variables to cacheline boundary. */
- BufferIOCVArray = (ConditionVariableMinimallyPadded *)
- ShmemInitStruct("Buffer IO Condition Variables",
- NBuffers * sizeof(ConditionVariableMinimallyPadded),
- &foundIOCV);
+ .alignment = PG_IO_ALIGN_SIZE,
+ .ptr = (void **) &BufferBlocks,
+ );
+
+ ShmemRequestStruct(.name = "Buffer IO Condition Variables",
+ .size = NBuffers * sizeof(ConditionVariableMinimallyPadded),
+ /* Align condition variables to a cacheline boundary. */
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferIOCVArray,
+ );
/*
* The array used to sort to-be-checkpointed buffer ids is located in
@@ -99,80 +104,50 @@ BufferManagerShmemInit(void)
* the checkpointer is restarted, memory allocation failures would be
* painful.
*/
- CkptBufferIds = (CkptSortItem *)
- ShmemInitStruct("Checkpoint BufferIds",
- NBuffers * sizeof(CkptSortItem), &foundBufCkpt);
+ ShmemRequestStruct(.name = "Checkpoint BufferIds",
+ .size = NBuffers * sizeof(CkptSortItem),
+ .ptr = (void **) &CkptBufferIds,
+ );
+}
- if (foundDescs || foundBufs || foundIOCV || foundBufCkpt)
- {
- /* should find all of these, or none of them */
- Assert(foundDescs && foundBufs && foundIOCV && foundBufCkpt);
- /* note: this path is only taken in EXEC_BACKEND case */
- }
- else
+/*
+ * Initialize shared buffer pool
+ *
+ * This is called once during shared-memory initialization (either in the
+ * postmaster, or in a standalone backend).
+ */
+static void
+BufferManagerShmemInit(void *arg)
+{
+ /*
+ * Initialize all the buffer headers.
+ */
+ for (int i = 0; i < NBuffers; i++)
{
- int i;
+ BufferDesc *buf = GetBufferDescriptor(i);
- /*
- * Initialize all the buffer headers.
- */
- for (i = 0; i < NBuffers; i++)
- {
- BufferDesc *buf = GetBufferDescriptor(i);
+ ClearBufferTag(&buf->tag);
- ClearBufferTag(&buf->tag);
+ pg_atomic_init_u64(&buf->state, 0);
+ buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
- pg_atomic_init_u64(&buf->state, 0);
- buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
+ buf->buf_id = i;
- buf->buf_id = i;
+ pgaio_wref_clear(&buf->io_wref);
- pgaio_wref_clear(&buf->io_wref);
-
- proclist_init(&buf->lock_waiters);
- ConditionVariableInit(BufferDescriptorGetIOCV(buf));
- }
+ proclist_init(&buf->lock_waiters);
+ ConditionVariableInit(BufferDescriptorGetIOCV(buf));
}
- /* Init other shared buffer-management stuff */
- StrategyInitialize(!foundDescs);
-
/* Initialize per-backend file flush context */
WritebackContextInit(&BackendWritebackContext,
&backend_flush_after);
}
-/*
- * BufferManagerShmemSize
- *
- * compute the size of shared memory for the buffer pool including
- * data pages, buffer descriptors, hash tables, etc.
- */
-Size
-BufferManagerShmemSize(void)
+static void
+BufferManagerShmemAttach(void *arg)
{
- Size size = 0;
-
- /* size of buffer descriptors */
- size = add_size(size, mul_size(NBuffers, sizeof(BufferDescPadded)));
- /* to allow aligning buffer descriptors */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of data pages, plus alignment padding */
- size = add_size(size, PG_IO_ALIGN_SIZE);
- size = add_size(size, mul_size(NBuffers, BLCKSZ));
-
- /* size of stuff controlled by freelist.c */
- size = add_size(size, StrategyShmemSize());
-
- /* size of I/O condition variables */
- size = add_size(size, mul_size(NBuffers,
- sizeof(ConditionVariableMinimallyPadded)));
- /* to allow aligning the above */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of checkpoint sort array in bufmgr.c */
- size = add_size(size, mul_size(NBuffers, sizeof(CkptSortItem)));
-
- return size;
+ /* Initialize per-backend file flush context */
+ WritebackContextInit(&BackendWritebackContext,
+ &backend_flush_after);
}
diff --git a/src/backend/storage/buffer/buf_table.c b/src/backend/storage/buffer/buf_table.c
index d04ef74b850..347bf267d73 100644
--- a/src/backend/storage/buffer/buf_table.c
+++ b/src/backend/storage/buffer/buf_table.c
@@ -22,6 +22,7 @@
#include "postgres.h"
#include "storage/buf_internals.h"
+#include "storage/subsystems.h"
/* entry for buffer lookup hashtable */
typedef struct
@@ -32,37 +33,42 @@ typedef struct
static HTAB *SharedBufHash;
+static void BufTableShmemRequest(void *arg);
-/*
- * Estimate space needed for mapping hashtable
- * size is the desired hash table size (possibly more than NBuffers)
- */
-Size
-BufTableShmemSize(int size)
-{
- return hash_estimate_size(size, sizeof(BufferLookupEnt));
-}
+const ShmemCallbacks BufTableShmemCallbacks = {
+ .request_fn = BufTableShmemRequest,
+ /* no special initialization needed, the hash table will start empty */
+};
/*
- * Initialize shmem hash table for mapping buffers
+ * Register shmem hash table for mapping buffers.
* size is the desired hash table size (possibly more than NBuffers)
*/
void
-InitBufTable(int size)
+BufTableShmemRequest(void *arg)
{
- HASHCTL info;
-
- /* assume no locking is needed yet */
-
- /* BufferTag maps to Buffer */
- info.keysize = sizeof(BufferTag);
- info.entrysize = sizeof(BufferLookupEnt);
- info.num_partitions = NUM_BUFFER_PARTITIONS;
-
- SharedBufHash = ShmemInitHash("Shared Buffer Lookup Table",
- size,
- &info,
- HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE);
+ int size;
+
+ /*
+ * Request the shared buffer lookup hashtable.
+ *
+ * Since we can't tolerate running out of lookup table entries, we must be
+ * sure to specify an adequate table size here. The maximum steady-state
+ * usage is of course NBuffers entries, but BufferAlloc() tries to insert
+ * a new entry before deleting the old. In principle this could be
+ * happening in each partition concurrently, so we could need as many as
+ * NBuffers + NUM_BUFFER_PARTITIONS entries.
+ */
+ size = NBuffers + NUM_BUFFER_PARTITIONS;
+
+ ShmemRequestHash(.name = "Shared Buffer Lookup Table",
+ .nelems = size,
+ .ptr = &SharedBufHash,
+ .hash_info.keysize = sizeof(BufferTag),
+ .hash_info.entrysize = sizeof(BufferLookupEnt),
+ .hash_info.num_partitions = NUM_BUFFER_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
}
/*
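The sizing argument in the BufTableShmemRequest comment above comes down to one addition. The sketch below spells it out; the helper name and the figures in the usage note are illustrative assumptions, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch: worst-case number of lookup
 * table entries.  Steady state needs one entry per buffer; because a new
 * entry is inserted before the old one is deleted, each partition may
 * briefly hold one extra entry.
 */
static size_t
lookup_table_entries(size_t nbuffers, size_t npartitions)
{
	return nbuffers + npartitions;
}
```

For example, with 16384 buffers and 128 partitions (values picked for illustration only), the table must be able to hold 16512 entries.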
diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index b7687836188..fdb5bad7910 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -20,6 +20,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#define INT_ACCESS_ONCE(var) ((int)(*((volatile int *)&(var))))
@@ -56,6 +58,14 @@ typedef struct
/* Pointers to shared state */
static BufferStrategyControl *StrategyControl = NULL;
+static void StrategyCtlShmemRequest(void *arg);
+static void StrategyCtlShmemInit(void *arg);
+
+const ShmemCallbacks StrategyCtlShmemCallbacks = {
+ .request_fn = StrategyCtlShmemRequest,
+ .init_fn = StrategyCtlShmemInit,
+};
+
/*
* Private (non-shared) state for managing a ring of shared buffers to re-use.
* This is currently the only kind of BufferAccessStrategy object, but someday
@@ -369,80 +379,35 @@ StrategyNotifyBgWriter(int bgwprocno)
/*
- * StrategyShmemSize
- *
- * estimate the size of shared memory used by the freelist-related structures.
- *
- * Note: for somewhat historical reasons, the buffer lookup hashtable size
- * is also determined here.
+ * StrategyCtlShmemRequest -- request shared memory for the buffer
+ * cache replacement strategy.
*/
-Size
-StrategyShmemSize(void)
+static void
+StrategyCtlShmemRequest(void *arg)
{
- Size size = 0;
-
- /* size of lookup hash table ... see comment in StrategyInitialize */
- size = add_size(size, BufTableShmemSize(NBuffers + NUM_BUFFER_PARTITIONS));
-
- /* size of the shared replacement strategy control block */
- size = add_size(size, MAXALIGN(sizeof(BufferStrategyControl)));
-
- return size;
+ ShmemRequestStruct(.name = "Buffer Strategy Status",
+ .size = sizeof(BufferStrategyControl),
+ .ptr = (void **) &StrategyControl
+ );
}
/*
- * StrategyInitialize -- initialize the buffer cache replacement
- * strategy.
- *
- * Assumes: All of the buffers are already built into a linked list.
- * Only called by postmaster and only during initialization.
+ * StrategyCtlShmemInit -- initialize the buffer cache replacement strategy.
*/
-void
-StrategyInitialize(bool init)
+static void
+StrategyCtlShmemInit(void *arg)
{
- bool found;
+ SpinLockInit(&StrategyControl->buffer_strategy_lock);
- /*
- * Initialize the shared buffer lookup hashtable.
- *
- * Since we can't tolerate running out of lookup table entries, we must be
- * sure to specify an adequate table size here. The maximum steady-state
- * usage is of course NBuffers entries, but BufferAlloc() tries to insert
- * a new entry before deleting the old. In principle this could be
- * happening in each partition concurrently, so we could need as many as
- * NBuffers + NUM_BUFFER_PARTITIONS entries.
- */
- InitBufTable(NBuffers + NUM_BUFFER_PARTITIONS);
-
- /*
- * Get or create the shared strategy control block
- */
- StrategyControl = (BufferStrategyControl *)
- ShmemInitStruct("Buffer Strategy Status",
- sizeof(BufferStrategyControl),
- &found);
-
- if (!found)
- {
- /*
- * Only done once, usually in postmaster
- */
- Assert(init);
-
- SpinLockInit(&StrategyControl->buffer_strategy_lock);
+ /* Initialize the clock-sweep pointer */
+ pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
- /* Initialize the clock-sweep pointer */
- pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
+ /* Clear statistics */
+ StrategyControl->completePasses = 0;
+ pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
- /* Clear statistics */
- StrategyControl->completePasses = 0;
- pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
-
- /* No pending notification */
- StrategyControl->bgwprocno = -1;
- }
- else
- Assert(!init);
+ /* No pending notification */
+ StrategyControl->bgwprocno = -1;
}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index a510c928daa..f64c1d59fa3 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -39,7 +39,6 @@
#include "replication/walreceiver.h"
#include "replication/walsender.h"
#include "storage/aio_subsys.h"
-#include "storage/bufmgr.h"
#include "storage/dsm.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
@@ -99,7 +98,6 @@ CalculateShmemSize(void)
size = add_size(size, ShmemGetRequestedSize());
/* legacy subsystems */
- size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
size = add_size(size, XLOGShmemSize());
@@ -263,7 +261,6 @@ CreateOrAttachShmemStructs(void)
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
- BufferManagerShmemInit();
/*
* Set up lock manager
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index ad1b7b2216a..89615a254a3 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -587,12 +587,7 @@ extern bool StrategyRejectBuffer(BufferAccessStrategy strategy,
extern int StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc);
extern void StrategyNotifyBgWriter(int bgwprocno);
-extern Size StrategyShmemSize(void);
-extern void StrategyInitialize(bool init);
-
/* buf_table.c */
-extern Size BufTableShmemSize(int size);
-extern void InitBufTable(int size);
extern uint32 BufTableHashCode(BufferTag *tagPtr);
extern int BufTableLookup(BufferTag *tagPtr, uint32 hashcode);
extern int BufTableInsert(BufferTag *tagPtr, uint32 hashcode, int buf_id);
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index aa61a39d9e6..6837b35fc6d 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -371,10 +371,6 @@ extern void MarkDirtyAllUnpinnedBuffers(int32 *buffers_dirtied,
int32 *buffers_already_dirty,
int32 *buffers_skipped);
-/* in buf_init.c */
-extern void BufferManagerShmemInit(void);
-extern Size BufferManagerShmemSize(void);
-
/* in localbuf.c */
extern void AtProcExit_LocalBuffers(void);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index b438794d46d..d8e11756a61 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -36,6 +36,9 @@ PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BufferManagerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StrategyCtlShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BufTableShmemCallbacks)
/* predicate lock manager */
PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
--
2.47.3
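For readers skimming the series, here is a minimal, self-contained sketch of the request/init/attach pattern that the converted subsystems follow. Everything in it is a mock: MockShmemCallbacks, mock_request_struct() and mock_startup() are hypothetical stand-ins, not the ShmemCallbacks or ShmemRequestStruct implementation from the patch. It also assumes that init_fn runs only in the process that creates the memory and attach_fn only in processes that attach later, which the hunks suggest but do not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ShmemCallbacks; the real struct may differ. */
typedef struct MockShmemCallbacks
{
	void		(*request_fn) (void *arg);	/* declare what is needed */
	void		(*init_fn) (void *arg);		/* one-time init by the creator */
	void		(*attach_fn) (void *arg);	/* per-process setup on attach */
} MockShmemCallbacks;

typedef struct MockRequest
{
	const char *name;
	size_t		size;
	void	  **ptr;		/* where to store the address */
} MockRequest;

#define MAX_REQUESTS 8
static MockRequest requests[MAX_REQUESTS];
static void *segments[MAX_REQUESTS];	/* plays the role of shared memory */
static int	nrequests = 0;

/* Mock of a registration call: only records the request. */
static void
mock_request_struct(const char *name, size_t size, void **ptr)
{
	assert(nrequests < MAX_REQUESTS);
	requests[nrequests++] = (MockRequest) {name, size, ptr};
}

/* An example subsystem using the pattern. */
typedef struct Counter
{
	int			value;
} Counter;

static Counter *counter_state = NULL;
static int	counter_attach_calls = 0;

static void
counter_request(void *arg)
{
	(void) arg;
	mock_request_struct("Counter", sizeof(Counter), (void **) &counter_state);
}

static void
counter_init(void *arg)
{
	(void) arg;
	counter_state->value = 42;	/* pointer is already set when init runs */
}

static void
counter_attach(void *arg)
{
	(void) arg;
	counter_attach_calls++;
}

static const MockShmemCallbacks counter_callbacks = {
	.request_fn = counter_request,
	.init_fn = counter_init,
	.attach_fn = counter_attach,
};

static const MockShmemCallbacks *const subsystems[] = {&counter_callbacks};
#define NSUBSYSTEMS (sizeof(subsystems) / sizeof(subsystems[0]))

/* Collect requests, hand out memory, then run init or attach callbacks. */
static void
mock_startup(bool creating)
{
	nrequests = 0;
	for (size_t i = 0; i < NSUBSYSTEMS; i++)
		subsystems[i]->request_fn(NULL);

	for (int r = 0; r < nrequests; r++)
	{
		if (creating)
		{
			segments[r] = calloc(1, requests[r].size);
			assert(segments[r] != NULL);
		}
		/* memcpy avoids writing a Counter * through a void ** lvalue */
		memcpy(requests[r].ptr, &segments[r], sizeof(void *));
	}

	for (size_t i = 0; i < NSUBSYSTEMS; i++)
	{
		void		(*fn) (void *) = creating ?
			subsystems[i]->init_fn : subsystems[i]->attach_fn;

		if (fn != NULL)
			fn(NULL);
	}
}

/* Simulate a creator followed by one attaching process. */
static int
demo_lifecycle(void)
{
	mock_startup(true);
	counter_state = NULL;	/* pretend this is a freshly started process */
	mock_startup(false);
	return counter_state->value;
}
```

Calling demo_lifecycle() from a main() runs both phases in one process. The point of the split is that the subsystem no longer needs a `found` flag or an IsUnderPostmaster test: the framework decides which callback to run. The mock never frees its memory, which is acceptable only for a sketch.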
[text/x-patch] v11-0014-Convert-all-remaining-subsystems-to-use-the-new-.patch (110.5K, 15-v11-0014-Convert-all-remaining-subsystems-to-use-the-new-.patch)
From f501ac52d619c9380e1f3e6fe5c7a18e6f4f14c2 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 19:05:26 +0200
Subject: [PATCH v11 14/14] Convert all remaining subsystems to use the new API
---
src/backend/access/common/syncscan.c | 76 ++++----
src/backend/access/nbtree/nbtutils.c | 54 +++---
src/backend/access/transam/twophase.c | 75 ++++----
src/backend/access/transam/xlog.c | 82 +++++----
src/backend/access/transam/xlogprefetcher.c | 51 +++---
src/backend/access/transam/xlogrecovery.c | 35 ++--
src/backend/access/transam/xlogwait.c | 50 ++---
src/backend/postmaster/autovacuum.c | 79 ++++----
src/backend/postmaster/bgworker.c | 105 +++++------
src/backend/postmaster/checkpointer.c | 56 +++---
src/backend/postmaster/datachecksum_state.c | 41 ++---
src/backend/postmaster/pgarch.c | 43 +++--
src/backend/postmaster/walsummarizer.c | 60 +++---
src/backend/replication/logical/launcher.c | 56 +++---
src/backend/replication/logical/logicalctl.c | 29 ++-
src/backend/replication/logical/origin.c | 59 +++---
src/backend/replication/logical/slotsync.c | 41 +++--
src/backend/replication/slot.c | 64 +++----
src/backend/replication/walreceiverfuncs.c | 51 +++---
src/backend/replication/walsender.c | 59 +++---
src/backend/storage/ipc/ipci.c | 124 +------------
src/backend/storage/lmgr/lock.c | 113 +++++-------
src/backend/utils/activity/backend_status.c | 173 +++++++-----------
src/backend/utils/activity/pgstat_shmem.c | 158 ++++++++--------
src/backend/utils/activity/wait_event.c | 83 ++++-----
src/backend/utils/misc/injection_point.c | 57 +++---
src/include/access/nbtree.h | 2 -
src/include/access/syncscan.h | 2 -
src/include/access/twophase.h | 3 -
src/include/access/xlog.h | 2 -
src/include/access/xlogprefetcher.h | 3 -
src/include/access/xlogrecovery.h | 3 -
src/include/access/xlogwait.h | 2 -
src/include/pgstat.h | 4 -
src/include/postmaster/autovacuum.h | 4 -
src/include/postmaster/bgworker_internals.h | 2 -
src/include/postmaster/bgwriter.h | 3 -
src/include/postmaster/datachecksum_state.h | 4 -
src/include/postmaster/pgarch.h | 2 -
src/include/postmaster/walsummarizer.h | 2 -
src/include/replication/logicalctl.h | 2 -
src/include/replication/logicallauncher.h | 3 -
src/include/replication/origin.h | 4 -
src/include/replication/slot.h | 4 -
src/include/replication/slotsync.h | 2 -
src/include/replication/walreceiver.h | 2 -
src/include/replication/walsender.h | 2 -
src/include/storage/lock.h | 2 -
src/include/storage/subsystemlist.h | 27 +++
src/include/utils/backend_status.h | 8 -
src/include/utils/injection_point.h | 3 -
src/include/utils/wait_event.h | 2 -
.../injection_points/injection_points.c | 59 ++----
src/test/modules/test_aio/test_aio.c | 107 +++++------
54 files changed, 933 insertions(+), 1206 deletions(-)
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index 6fcfcb0e560..0f9eb167bed 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -50,6 +50,7 @@
#include "miscadmin.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/rel.h"
@@ -111,6 +112,14 @@ typedef struct ss_scan_locations_t
#define SizeOfScanLocations(N) \
(offsetof(ss_scan_locations_t, items) + (N) * sizeof(ss_lru_item_t))
+static void SyncScanShmemRequest(void *arg);
+static void SyncScanShmemInit(void *arg);
+
+const ShmemCallbacks SyncScanShmemCallbacks = {
+ .request_fn = SyncScanShmemRequest,
+ .init_fn = SyncScanShmemInit,
+};
+
/* Pointer to struct in shared memory */
static ss_scan_locations_t *scan_locations;
@@ -120,58 +129,47 @@ static BlockNumber ss_search(RelFileLocator relfilelocator,
/*
- * SyncScanShmemSize --- report amount of shared memory space needed
+ * SyncScanShmemRequest --- register this module's shared memory
*/
-Size
-SyncScanShmemSize(void)
+static void
+SyncScanShmemRequest(void *arg)
{
- return SizeOfScanLocations(SYNC_SCAN_NELEM);
+ ShmemRequestStruct(.name = "Sync Scan Locations List",
+ .size = SizeOfScanLocations(SYNC_SCAN_NELEM),
+ .ptr = (void **) &scan_locations,
+ );
}
/*
* SyncScanShmemInit --- initialize this module's shared memory
*/
-void
-SyncScanShmemInit(void)
+static void
+SyncScanShmemInit(void *arg)
{
int i;
- bool found;
- scan_locations = (ss_scan_locations_t *)
- ShmemInitStruct("Sync Scan Locations List",
- SizeOfScanLocations(SYNC_SCAN_NELEM),
- &found);
+ scan_locations->head = &scan_locations->items[0];
+ scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
- if (!IsUnderPostmaster)
+ for (i = 0; i < SYNC_SCAN_NELEM; i++)
{
- /* Initialize shared memory area */
- Assert(!found);
-
- scan_locations->head = &scan_locations->items[0];
- scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
-
- for (i = 0; i < SYNC_SCAN_NELEM; i++)
- {
- ss_lru_item_t *item = &scan_locations->items[i];
-
- /*
- * Initialize all slots with invalid values. As scans are started,
- * these invalid entries will fall off the LRU list and get
- * replaced with real entries.
- */
- item->location.relfilelocator.spcOid = InvalidOid;
- item->location.relfilelocator.dbOid = InvalidOid;
- item->location.relfilelocator.relNumber = InvalidRelFileNumber;
- item->location.location = InvalidBlockNumber;
-
- item->prev = (i > 0) ?
- (&scan_locations->items[i - 1]) : NULL;
- item->next = (i < SYNC_SCAN_NELEM - 1) ?
- (&scan_locations->items[i + 1]) : NULL;
- }
+ ss_lru_item_t *item = &scan_locations->items[i];
+
+ /*
+ * Initialize all slots with invalid values. As scans are started,
+ * these invalid entries will fall off the LRU list and get replaced
+ * with real entries.
+ */
+ item->location.relfilelocator.spcOid = InvalidOid;
+ item->location.relfilelocator.dbOid = InvalidOid;
+ item->location.relfilelocator.relNumber = InvalidRelFileNumber;
+ item->location.location = InvalidBlockNumber;
+
+ item->prev = (i > 0) ?
+ (&scan_locations->items[i - 1]) : NULL;
+ item->next = (i < SYNC_SCAN_NELEM - 1) ?
+ (&scan_locations->items[i + 1]) : NULL;
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c
index 732bc750c9e..014faa1622f 100644
--- a/src/backend/access/nbtree/nbtutils.c
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -25,6 +25,7 @@
#include "lib/qunique.h"
#include "miscadmin.h"
#include "storage/lwlock.h"
+#include "storage/subsystems.h"
#include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -417,6 +418,13 @@ typedef struct BTVacInfo
static BTVacInfo *btvacinfo;
+static void BTreeShmemRequest(void *arg);
+static void BTreeShmemInit(void *arg);
+
+const ShmemCallbacks BTreeShmemCallbacks = {
+ .request_fn = BTreeShmemRequest,
+ .init_fn = BTreeShmemInit,
+};
/*
* _bt_vacuum_cycleid --- get the active vacuum cycle ID for an index,
@@ -553,47 +561,37 @@ _bt_end_vacuum_callback(int code, Datum arg)
}
/*
- * BTreeShmemSize --- report amount of shared memory space needed
+ * BTreeShmemRequest --- register this module's shared memory
*/
-Size
-BTreeShmemSize(void)
+static void
+BTreeShmemRequest(void *arg)
{
Size size;
size = offsetof(BTVacInfo, vacuums);
size = add_size(size, mul_size(MaxBackends, sizeof(BTOneVacInfo)));
- return size;
+
+ ShmemRequestStruct(.name = "BTree Vacuum State",
+ .size = size,
+ .ptr = (void **) &btvacinfo,
+ );
}
/*
* BTreeShmemInit --- initialize this module's shared memory
*/
-void
-BTreeShmemInit(void)
+static void
+BTreeShmemInit(void *arg)
{
- bool found;
-
- btvacinfo = (BTVacInfo *) ShmemInitStruct("BTree Vacuum State",
- BTreeShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- /* Initialize shared memory area */
- Assert(!found);
-
- /*
- * It doesn't really matter what the cycle counter starts at, but
- * having it always start the same doesn't seem good. Seed with
- * low-order bits of time() instead.
- */
- btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
+ /*
+ * It doesn't really matter what the cycle counter starts at, but having
+ * it always start the same doesn't seem good. Seed with low-order bits
+ * of time() instead.
+ */
+ btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
- btvacinfo->num_vacuums = 0;
- btvacinfo->max_vacuums = MaxBackends;
- }
- else
- Assert(found);
+ btvacinfo->num_vacuums = 0;
+ btvacinfo->max_vacuums = MaxBackends;
}
bytea *
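Several of the request functions above size a struct that ends in a flexible array member as offsetof(...) plus an element count times the element size, using add_size() and mul_size() for the arithmetic. The following stand-alone sketch shows the same calculation with explicit overflow checks. checked_add(), checked_mul(), Table and Item are hypothetical names; unlike this sketch, which returns false on overflow, the PostgreSQL helpers report an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Item
{
	int			id;
	double		weight;
} Item;

/* Header followed by a variable number of items. */
typedef struct Table
{
	int			nitems;
	Item		items[];	/* flexible array member */
} Table;

/* Store a + b in *out, or return false if the sum would overflow. */
static bool
checked_add(size_t a, size_t b, size_t *out)
{
	if (a > SIZE_MAX - b)
		return false;
	*out = a + b;
	return true;
}

/* Store a * b in *out, or return false if the product would overflow. */
static bool
checked_mul(size_t a, size_t b, size_t *out)
{
	if (b != 0 && a > SIZE_MAX / b)
		return false;
	*out = a * b;
	return true;
}

/* Bytes needed for a Table holding n items. */
static bool
table_size(size_t n, size_t *out)
{
	size_t		array_bytes;

	if (!checked_mul(n, sizeof(Item), &array_bytes))
		return false;
	return checked_add(offsetof(Table, items), array_bytes, out);
}
```

Starting from offsetof() rather than sizeof(Table) avoids counting any padding after the header twice, which is why the patch's size calculations begin the same way.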
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index ab1cbd67bac..836928180a9 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -102,6 +102,7 @@
#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
#include "utils/memutils.h"
@@ -187,8 +188,16 @@ typedef struct TwoPhaseStateData
GlobalTransaction prepXacts[FLEXIBLE_ARRAY_MEMBER];
} TwoPhaseStateData;
+static void TwoPhaseShmemRequest(void *arg);
+static void TwoPhaseShmemInit(void *arg);
+
static TwoPhaseStateData *TwoPhaseState;
+const ShmemCallbacks TwoPhaseShmemCallbacks = {
+ .request_fn = TwoPhaseShmemRequest,
+ .init_fn = TwoPhaseShmemInit,
+};
+
/*
* Global transaction entry currently locked by us, if any. Note that any
* access to the entry pointed to by this variable must be protected by
@@ -234,10 +243,10 @@ static void RemoveTwoPhaseFile(FullTransactionId fxid, bool giveWarning);
static void RecreateTwoPhaseFile(FullTransactionId fxid, void *content, int len);
/*
- * Initialization of shared memory
+ * Register shared memory for two-phase state.
*/
-Size
-TwoPhaseShmemSize(void)
+static void
+TwoPhaseShmemRequest(void *arg)
{
Size size;
@@ -248,46 +257,40 @@ TwoPhaseShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_prepared_xacts,
sizeof(GlobalTransactionData)));
-
- return size;
+ ShmemRequestStruct(.name = "Prepared Transaction Table",
+ .size = size,
+ .ptr = (void **) &TwoPhaseState,
+ );
}
-void
-TwoPhaseShmemInit(void)
+/*
+ * Initialize shared memory for two-phase state.
+ */
+static void
+TwoPhaseShmemInit(void *arg)
{
- bool found;
-
- TwoPhaseState = ShmemInitStruct("Prepared Transaction Table",
- TwoPhaseShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- GlobalTransaction gxacts;
- int i;
+ GlobalTransaction gxacts;
+ int i;
- Assert(!found);
- TwoPhaseState->freeGXacts = NULL;
- TwoPhaseState->numPrepXacts = 0;
+ TwoPhaseState->freeGXacts = NULL;
+ TwoPhaseState->numPrepXacts = 0;
- /*
- * Initialize the linked list of free GlobalTransactionData structs
- */
- gxacts = (GlobalTransaction)
- ((char *) TwoPhaseState +
- MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
- sizeof(GlobalTransaction) * max_prepared_xacts));
- for (i = 0; i < max_prepared_xacts; i++)
- {
- /* insert into linked list */
- gxacts[i].next = TwoPhaseState->freeGXacts;
- TwoPhaseState->freeGXacts = &gxacts[i];
+ /*
+ * Initialize the linked list of free GlobalTransactionData structs
+ */
+ gxacts = (GlobalTransaction)
+ ((char *) TwoPhaseState +
+ MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
+ sizeof(GlobalTransaction) * max_prepared_xacts));
+ for (i = 0; i < max_prepared_xacts; i++)
+ {
+ /* insert into linked list */
+ gxacts[i].next = TwoPhaseState->freeGXacts;
+ TwoPhaseState->freeGXacts = &gxacts[i];
- /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
- gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
- }
+ /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
+ gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9e8999bbb61..bbc565509b0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -96,6 +96,7 @@
#include "storage/procsignal.h"
#include "storage/reinit.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/guc_tables.h"
@@ -571,6 +572,16 @@ typedef enum
WALINSERT_SPECIAL_CHECKPOINT
} WalInsertClass;
+static void XLOGShmemRequest(void *arg);
+static void XLOGShmemInit(void *arg);
+static void XLOGShmemAttach(void *arg);
+
+const ShmemCallbacks XLOGShmemCallbacks = {
+ .request_fn = XLOGShmemRequest,
+ .init_fn = XLOGShmemInit,
+ .attach_fn = XLOGShmemAttach,
+};
+
static XLogCtlData *XLogCtl = NULL;
/* a private copy of XLogCtl->Insert.WALInsertLocks, for convenience */
@@ -579,6 +590,7 @@ static WALInsertLockPadded *WALInsertLocks = NULL;
/*
* We maintain an image of pg_control in shared memory.
*/
+static ControlFileData *LocalControlFile = NULL;
static ControlFileData *ControlFile = NULL;
/*
@@ -5257,7 +5269,8 @@ void
LocalProcessControlFile(bool reset)
{
Assert(reset || ControlFile == NULL);
- ControlFile = palloc_object(ControlFileData);
+ LocalControlFile = palloc_object(ControlFileData);
+ ControlFile = LocalControlFile;
ReadControlFile();
SetLocalDataChecksumState(ControlFile->data_checksum_version);
}
@@ -5274,10 +5287,10 @@ GetActiveWalLevelOnStandby(void)
}
/*
- * Initialization of shared memory for XLOG
+ * Register shared memory for XLOG.
*/
-Size
-XLOGShmemSize(void)
+static void
+XLOGShmemRequest(void *arg)
{
Size size;
@@ -5317,23 +5330,24 @@ XLOGShmemSize(void)
/* and the buffers themselves */
size = add_size(size, mul_size(XLOG_BLCKSZ, XLOGbuffers));
- /*
- * Note: we don't count ControlFileData, it comes out of the "slop factor"
- * added by CreateSharedMemoryAndSemaphores. This lets us use this
- * routine again below to compute the actual allocation size.
- */
-
- return size;
+ ShmemRequestStruct(.name = "XLOG Ctl",
+ .size = size,
+ .ptr = (void **) &XLogCtl,
+ );
+ ShmemRequestStruct(.name = "Control File",
+ .size = sizeof(ControlFileData),
+ .ptr = (void **) &ControlFile,
+ );
}
-void
-XLOGShmemInit(void)
+/*
+ * XLOGShmemInit - initialize the XLogCtl shared memory area.
+ */
+static void
+XLOGShmemInit(void *arg)
{
- bool foundCFile,
- foundXLog;
char *allocptr;
int i;
- ControlFileData *localControlFile;
#ifdef WAL_DEBUG
@@ -5351,36 +5365,17 @@ XLOGShmemInit(void)
}
#endif
-
- XLogCtl = (XLogCtlData *)
- ShmemInitStruct("XLOG Ctl", XLOGShmemSize(), &foundXLog);
-
- localControlFile = ControlFile;
- ControlFile = (ControlFileData *)
- ShmemInitStruct("Control File", sizeof(ControlFileData), &foundCFile);
-
- if (foundCFile || foundXLog)
- {
- /* both should be present or neither */
- Assert(foundCFile && foundXLog);
-
- /* Initialize local copy of WALInsertLocks */
- WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
-
- if (localControlFile)
- pfree(localControlFile);
- return;
- }
memset(XLogCtl, 0, sizeof(XLogCtlData));
/*
* Already have read control file locally, unless in bootstrap mode. Move
* contents into shared memory.
*/
- if (localControlFile)
+ if (LocalControlFile)
{
- memcpy(ControlFile, localControlFile, sizeof(ControlFileData));
- pfree(localControlFile);
+ memcpy(ControlFile, LocalControlFile, sizeof(ControlFileData));
+ pfree(LocalControlFile);
+ LocalControlFile = NULL;
}
/*
@@ -5442,6 +5437,15 @@ XLOGShmemInit(void)
pg_atomic_init_u64(&XLogCtl->unloggedLSN, InvalidXLogRecPtr);
}
+/*
+ * XLOGShmemAttach - set up WALInsertLocks pointer after attaching.
+ */
+static void
+XLOGShmemAttach(void *arg)
+{
+ WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
+}
+
/*
* This func must be called ONCE on system install. It creates pg_control
* and the initial XLOG segment.
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index c235eca7c51..83a3f97a57c 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -39,6 +39,7 @@
#include "storage/fd.h"
#include "storage/shmem.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/hsearch.h"
@@ -200,6 +201,14 @@ static LsnReadQueueNextStatus XLogPrefetcherNextBlock(uintptr_t pgsr_private,
static XLogPrefetchStats *SharedStats;
+static void XLogPrefetchShmemRequest(void *arg);
+static void XLogPrefetchShmemInit(void *arg);
+
+const ShmemCallbacks XLogPrefetchShmemCallbacks = {
+ .request_fn = XLogPrefetchShmemRequest,
+ .init_fn = XLogPrefetchShmemInit,
+};
+
static inline LsnReadQueue *
lrq_alloc(uint32 max_distance,
uint32 max_inflight,
@@ -292,10 +301,25 @@ lrq_complete_lsn(LsnReadQueue *lrq, XLogRecPtr lsn)
lrq_prefetch(lrq);
}
-size_t
-XLogPrefetchShmemSize(void)
+static void
+XLogPrefetchShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "XLogPrefetchStats",
+ .size = sizeof(XLogPrefetchStats),
+ .ptr = (void **) &SharedStats,
+ );
+}
+
+static void
+XLogPrefetchShmemInit(void *arg)
{
- return sizeof(XLogPrefetchStats);
+ pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
+ pg_atomic_init_u64(&SharedStats->prefetch, 0);
+ pg_atomic_init_u64(&SharedStats->hit, 0);
+ pg_atomic_init_u64(&SharedStats->skip_init, 0);
+ pg_atomic_init_u64(&SharedStats->skip_new, 0);
+ pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
+ pg_atomic_init_u64(&SharedStats->skip_rep, 0);
}
/*
@@ -313,27 +337,6 @@ XLogPrefetchResetStats(void)
pg_atomic_write_u64(&SharedStats->skip_rep, 0);
}
-void
-XLogPrefetchShmemInit(void)
-{
- bool found;
-
- SharedStats = (XLogPrefetchStats *)
- ShmemInitStruct("XLogPrefetchStats",
- sizeof(XLogPrefetchStats),
- &found);
-
- if (!found)
- {
- pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
- pg_atomic_init_u64(&SharedStats->prefetch, 0);
- pg_atomic_init_u64(&SharedStats->hit, 0);
- pg_atomic_init_u64(&SharedStats->skip_init, 0);
- pg_atomic_init_u64(&SharedStats->skip_new, 0);
- pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
- pg_atomic_init_u64(&SharedStats->skip_rep, 0);
- }
-}
/*
* Called when any GUC is changed that affects prefetching.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index fd1c36d061d..c236e2b7969 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -58,6 +58,7 @@
#include "storage/pmsignal.h"
#include "storage/procarray.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/datetime.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
@@ -307,6 +308,14 @@ static char *primary_image_masked = NULL;
XLogRecoveryCtlData *XLogRecoveryCtl = NULL;
+static void XLogRecoveryShmemRequest(void *arg);
+static void XLogRecoveryShmemInit(void *arg);
+
+const ShmemCallbacks XLogRecoveryShmemCallbacks = {
+ .request_fn = XLogRecoveryShmemRequest,
+ .init_fn = XLogRecoveryShmemInit,
+};
+
/*
* abortedRecPtr is the start pointer of a broken record at end of WAL when
* recovery completes; missingContrecPtr is the location of the first
@@ -385,28 +394,20 @@ static void SetCurrentChunkStartTime(TimestampTz xtime);
static void SetLatestXTime(TimestampTz xtime);
/*
- * Initialization of shared memory for WAL recovery
+ * Register shared memory for WAL recovery
*/
-Size
-XLogRecoveryShmemSize(void)
+static void
+XLogRecoveryShmemRequest(void *arg)
{
- Size size;
-
- /* XLogRecoveryCtl */
- size = sizeof(XLogRecoveryCtlData);
-
- return size;
+ ShmemRequestStruct(.name = "XLOG Recovery Ctl",
+ .size = sizeof(XLogRecoveryCtlData),
+ .ptr = (void **) &XLogRecoveryCtl,
+ );
}
-void
-XLogRecoveryShmemInit(void)
+static void
+XLogRecoveryShmemInit(void *arg)
{
- bool found;
-
- XLogRecoveryCtl = (XLogRecoveryCtlData *)
- ShmemInitStruct("XLOG Recovery Ctl", XLogRecoveryShmemSize(), &found);
- if (found)
- return;
memset(XLogRecoveryCtl, 0, sizeof(XLogRecoveryCtlData));
SpinLockInit(&XLogRecoveryCtl->info_lck);
diff --git a/src/backend/access/transam/xlogwait.c b/src/backend/access/transam/xlogwait.c
index bf4630677b4..2e31c0d67d7 100644
--- a/src/backend/access/transam/xlogwait.c
+++ b/src/backend/access/transam/xlogwait.c
@@ -57,6 +57,7 @@
#include "storage/latch.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/pg_lsn.h"
#include "utils/snapmgr.h"
@@ -68,6 +69,14 @@ static int waitlsn_cmp(const pairingheap_node *a, const pairingheap_node *b,
struct WaitLSNState *waitLSNState = NULL;
+static void WaitLSNShmemRequest(void *arg);
+static void WaitLSNShmemInit(void *arg);
+
+const ShmemCallbacks WaitLSNShmemCallbacks = {
+ .request_fn = WaitLSNShmemRequest,
+ .init_fn = WaitLSNShmemInit,
+};
+
/*
* Wait event for each WaitLSNType, used with WaitLatch() to report
* the wait in pg_stat_activity.
@@ -109,41 +118,34 @@ GetCurrentLSNForWaitType(WaitLSNType lsnType)
pg_unreachable();
}
-/* Report the amount of shared memory space needed for WaitLSNState. */
-Size
-WaitLSNShmemSize(void)
+/* Register the shared memory space needed for WaitLSNState. */
+static void
+WaitLSNShmemRequest(void *arg)
{
Size size;
size = offsetof(WaitLSNState, procInfos);
size = add_size(size, mul_size(MaxBackends + NUM_AUXILIARY_PROCS, sizeof(WaitLSNProcInfo)));
- return size;
+ ShmemRequestStruct(.name = "WaitLSNState",
+ .size = size,
+ .ptr = (void **) &waitLSNState,
+ );
}
/* Initialize the WaitLSNState in the shared memory. */
-void
-WaitLSNShmemInit(void)
+static void
+WaitLSNShmemInit(void *arg)
{
- bool found;
-
- waitLSNState = (WaitLSNState *) ShmemInitStruct("WaitLSNState",
- WaitLSNShmemSize(),
- &found);
- if (!found)
+ /* Initialize heaps and tracking */
+ for (int i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
{
- int i;
-
- /* Initialize heaps and tracking */
- for (i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
- {
- pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
- pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
- }
-
- /* Initialize process info array */
- memset(&waitLSNState->procInfos, 0,
- (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
+ pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
+ pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
}
+
+ /* Initialize process info array */
+ memset(&waitLSNState->procInfos, 0,
+ (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 8400e6722cc..250c43b85e5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -98,6 +98,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/fmgroids.h"
#include "utils/fmgrprotos.h"
@@ -309,6 +310,14 @@ typedef struct
static AutoVacuumShmemStruct *AutoVacuumShmem;
+static void AutoVacuumShmemRequest(void *arg);
+static void AutoVacuumShmemInit(void *arg);
+
+const ShmemCallbacks AutoVacuumShmemCallbacks = {
+ .request_fn = AutoVacuumShmemRequest,
+ .init_fn = AutoVacuumShmemInit,
+};
+
/*
* the database list (of avl_dbase elements) in the launcher, and the context
* that contains it
@@ -3545,11 +3554,11 @@ autovac_init(void)
}
/*
- * AutoVacuumShmemSize
- * Compute space needed for autovacuum-related shared memory
+ * AutoVacuumShmemRequest
+ * Register shared memory space needed for autovacuum
*/
-Size
-AutoVacuumShmemSize(void)
+static void
+AutoVacuumShmemRequest(void *arg)
{
Size size;
@@ -3560,53 +3569,41 @@ AutoVacuumShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
- return size;
+
+ ShmemRequestStruct(.name = "AutoVacuum Data",
+ .size = size,
+ .ptr = (void **) &AutoVacuumShmem,
+ );
}
/*
* AutoVacuumShmemInit
- * Allocate and initialize autovacuum-related shared memory
+ * Initialize autovacuum-related shared memory
*/
-void
-AutoVacuumShmemInit(void)
+static void
+AutoVacuumShmemInit(void *arg)
{
- bool found;
-
- AutoVacuumShmem = (AutoVacuumShmemStruct *)
- ShmemInitStruct("AutoVacuum Data",
- AutoVacuumShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- WorkerInfo worker;
- int i;
+ WorkerInfo worker;
- Assert(!found);
-
- AutoVacuumShmem->av_launcherpid = 0;
- dclist_init(&AutoVacuumShmem->av_freeWorkers);
- dlist_init(&AutoVacuumShmem->av_runningWorkers);
- AutoVacuumShmem->av_startingWorker = NULL;
- memset(AutoVacuumShmem->av_workItems, 0,
- sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
-
- worker = (WorkerInfo) ((char *) AutoVacuumShmem +
- MAXALIGN(sizeof(AutoVacuumShmemStruct)));
-
- /* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_worker_slots; i++)
- {
- dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
- pg_atomic_init_flag(&worker[i].wi_dobalance);
- }
+ AutoVacuumShmem->av_launcherpid = 0;
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
+ dlist_init(&AutoVacuumShmem->av_runningWorkers);
+ AutoVacuumShmem->av_startingWorker = NULL;
+ memset(AutoVacuumShmem->av_workItems, 0,
+ sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
- pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+ worker = (WorkerInfo) ((char *) AutoVacuumShmem +
+ MAXALIGN(sizeof(AutoVacuumShmemStruct)));
+ /* initialize the WorkerInfo free list */
+ for (int i = 0; i < autovacuum_worker_slots; i++)
+ {
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
+ pg_atomic_init_flag(&worker[i].wi_dobalance);
}
- else
- Assert(found);
+
+ pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
}
/*
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 536aff7ca05..0992b9b6353 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -30,6 +30,7 @@
#include "storage/procarray.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/ascii.h"
#include "utils/memutils.h"
@@ -110,6 +111,14 @@ struct BackgroundWorkerHandle
static BackgroundWorkerArray *BackgroundWorkerData;
+static void BackgroundWorkerShmemRequest(void *arg);
+static void BackgroundWorkerShmemInit(void *arg);
+
+const ShmemCallbacks BackgroundWorkerShmemCallbacks = {
+ .request_fn = BackgroundWorkerShmemRequest,
+ .init_fn = BackgroundWorkerShmemInit,
+};
+
/*
* List of internal background worker entry points. We need this for
* reasons explained in LookupBackgroundWorkerFunction(), below.
@@ -160,10 +169,10 @@ static bgworker_main_type LookupBackgroundWorkerFunction(const char *libraryname
/*
- * Calculate shared memory needed.
+ * Register shared memory needed for background workers.
*/
-Size
-BackgroundWorkerShmemSize(void)
+static void
+BackgroundWorkerShmemRequest(void *arg)
{
Size size;
@@ -171,66 +180,58 @@ BackgroundWorkerShmemSize(void)
size = offsetof(BackgroundWorkerArray, slot);
size = add_size(size, mul_size(max_worker_processes,
sizeof(BackgroundWorkerSlot)));
-
- return size;
+ ShmemRequestStruct(.name = "Background Worker Data",
+ .size = size,
+ .ptr = (void **) &BackgroundWorkerData,
+ );
}
/*
- * Initialize shared memory.
+ * Initialize shared memory for background workers.
*/
-void
-BackgroundWorkerShmemInit(void)
+static void
+BackgroundWorkerShmemInit(void *arg)
{
- bool found;
-
- BackgroundWorkerData = ShmemInitStruct("Background Worker Data",
- BackgroundWorkerShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- dlist_iter iter;
- int slotno = 0;
+ dlist_iter iter;
+ int slotno = 0;
- BackgroundWorkerData->total_slots = max_worker_processes;
- BackgroundWorkerData->parallel_register_count = 0;
- BackgroundWorkerData->parallel_terminate_count = 0;
+ BackgroundWorkerData->total_slots = max_worker_processes;
+ BackgroundWorkerData->parallel_register_count = 0;
+ BackgroundWorkerData->parallel_terminate_count = 0;
- /*
- * Copy contents of worker list into shared memory. Record the shared
- * memory slot assigned to each worker. This ensures a 1-to-1
- * correspondence between the postmaster's private list and the array
- * in shared memory.
- */
- dlist_foreach(iter, &BackgroundWorkerList)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- RegisteredBgWorker *rw;
+ /*
+ * Copy contents of worker list into shared memory. Record the shared
+ * memory slot assigned to each worker. This ensures a 1-to-1
+ * correspondence between the postmaster's private list and the array in
+ * shared memory.
+ */
+ dlist_foreach(iter, &BackgroundWorkerList)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ RegisteredBgWorker *rw;
- rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
- Assert(slotno < max_worker_processes);
- slot->in_use = true;
- slot->terminate = false;
- slot->pid = InvalidPid;
- slot->generation = 0;
- rw->rw_shmem_slot = slotno;
- rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
- memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
- ++slotno;
- }
+ rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
+ Assert(slotno < max_worker_processes);
+ slot->in_use = true;
+ slot->terminate = false;
+ slot->pid = InvalidPid;
+ slot->generation = 0;
+ rw->rw_shmem_slot = slotno;
+ rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
+ memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
+ ++slotno;
+ }
- /*
- * Mark any remaining slots as not in use.
- */
- while (slotno < max_worker_processes)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ /*
+ * Mark any remaining slots as not in use.
+ */
+ while (slotno < max_worker_processes)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- slot->in_use = false;
- ++slotno;
- }
+ slot->in_use = false;
+ ++slotno;
}
- else
- Assert(found);
}
/*
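Aside on the size computation in BackgroundWorkerShmemRequest() above: the request is sized as the struct header up to the flexible array member, plus max_worker_processes array elements. add_size() and mul_size() are the overflow-checked helpers PostgreSQL uses for this arithmetic. Below is a minimal standalone sketch of the same calculation. DemoSlot, DemoArray, checked_add(), checked_mul() and demo_array_size() are hypothetical names invented for illustration, and they return false on overflow instead of raising an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the slot and array types in the patch. */
typedef struct DemoSlot
{
	bool		in_use;
	int			pid;
} DemoSlot;

typedef struct DemoArray
{
	int			total_slots;
	DemoSlot	slot[];			/* flexible array member */
} DemoArray;

/* Add two sizes; report failure instead of wrapping around. */
static bool
checked_add(size_t a, size_t b, size_t *result)
{
	assert(result != NULL);
	if (a > SIZE_MAX - b)
		return false;
	*result = a + b;
	return true;
}

/* Multiply two sizes; report failure instead of wrapping around. */
static bool
checked_mul(size_t a, size_t b, size_t *result)
{
	assert(result != NULL);
	if (b != 0 && a > SIZE_MAX / b)
		return false;
	*result = a * b;
	return true;
}

/* Header up to the flexible array, plus nslots array elements. */
static bool
demo_array_size(size_t nslots, size_t *result)
{
	size_t		elems;

	if (!checked_mul(nslots, sizeof(DemoSlot), &elems))
		return false;
	return checked_add(offsetof(DemoArray, slot), elems, result);
}
```

Using offsetof() rather than sizeof() for the header avoids counting trailing padding that the first array element may legitimately occupy.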
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 3c982c6ffac..6b424ee610f 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -63,6 +63,7 @@
#include "storage/shmem.h"
#include "storage/smgr.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/guc.h"
#include "utils/memutils.h"
@@ -143,6 +144,14 @@ typedef struct
static CheckpointerShmemStruct *CheckpointerShmem;
+static void CheckpointerShmemRequest(void *arg);
+static void CheckpointerShmemInit(void *arg);
+
+const ShmemCallbacks CheckpointerShmemCallbacks = {
+ .request_fn = CheckpointerShmemRequest,
+ .init_fn = CheckpointerShmemInit,
+};
+
/* interval for calling AbsorbSyncRequests in CheckpointWriteDelay */
#define WRITES_PER_ABSORB 1000
@@ -950,11 +959,11 @@ ReqShutdownXLOG(SIGNAL_ARGS)
*/
/*
- * CheckpointerShmemSize
- * Compute space needed for checkpointer-related shared memory
+ * CheckpointerShmemRequest
+ * Register shared memory space needed for checkpointer
*/
-Size
-CheckpointerShmemSize(void)
+static void
+CheckpointerShmemRequest(void *arg)
{
Size size;
@@ -967,39 +976,24 @@ CheckpointerShmemSize(void)
size = add_size(size, mul_size(Min(NBuffers,
MAX_CHECKPOINT_REQUESTS),
sizeof(CheckpointerRequest)));
-
- return size;
+ ShmemRequestStruct(.name = "Checkpointer Data",
+ .size = size,
+ .ptr = (void **) &CheckpointerShmem,
+ );
}
/*
* CheckpointerShmemInit
- * Allocate and initialize checkpointer-related shared memory
+ * Initialize checkpointer-related shared memory
*/
-void
-CheckpointerShmemInit(void)
+static void
+CheckpointerShmemInit(void *arg)
{
- Size size = CheckpointerShmemSize();
- bool found;
-
- CheckpointerShmem = (CheckpointerShmemStruct *)
- ShmemInitStruct("Checkpointer Data",
- size,
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. Note that we zero the whole
- * requests array; this is so that CompactCheckpointerRequestQueue can
- * assume that any pad bytes in the request structs are zeroes.
- */
- MemSet(CheckpointerShmem, 0, size);
- SpinLockInit(&CheckpointerShmem->ckpt_lck);
- CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
- CheckpointerShmem->head = CheckpointerShmem->tail = 0;
- ConditionVariableInit(&CheckpointerShmem->start_cv);
- ConditionVariableInit(&CheckpointerShmem->done_cv);
- }
+ SpinLockInit(&CheckpointerShmem->ckpt_lck);
+ CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
+ CheckpointerShmem->head = CheckpointerShmem->tail = 0;
+ ConditionVariableInit(&CheckpointerShmem->start_cv);
+ ConditionVariableInit(&CheckpointerShmem->done_cv);
}
/*
diff --git a/src/backend/postmaster/datachecksum_state.c b/src/backend/postmaster/datachecksum_state.c
index 76004bcedc6..eb7b01d0993 100644
--- a/src/backend/postmaster/datachecksum_state.c
+++ b/src/backend/postmaster/datachecksum_state.c
@@ -211,6 +211,7 @@
#include "storage/lwlock.h"
#include "storage/procarray.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -346,6 +347,7 @@ static volatile sig_atomic_t launcher_running = false;
static DataChecksumsWorkerOperation operation;
/* Prototypes */
+static void DataChecksumsShmemRequest(void *arg);
static bool DatabaseExists(Oid dboid);
static List *BuildDatabaseList(void);
static List *BuildRelationList(bool temp_relations, bool include_shared);
@@ -356,6 +358,10 @@ static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferA
static void launcher_cancel_handler(SIGNAL_ARGS);
static void WaitForAllTransactionsToFinish(void);
+const ShmemCallbacks DataChecksumsShmemCallbacks = {
+ .request_fn = DataChecksumsShmemRequest,
+};
+
/*****************************************************************************
* Functionality for manipulating the data checksum state in the cluster
*/
@@ -1236,35 +1242,16 @@ ProcessAllDatabases(void)
}
/*
- * DataChecksumStateSize
- * Compute required space for datachecksumsworker-related shared memory
- */
-Size
-DataChecksumsShmemSize(void)
-{
- Size size;
-
- size = sizeof(DataChecksumsStateStruct);
- size = MAXALIGN(size);
-
- return size;
-}
-
-/*
- * DataChecksumStateInit
- * Allocate and initialize datachecksumsworker-related shared memory
+ * DataChecksumsShmemRequest
+ * Request datachecksumsworker-related shared memory
*/
-void
-DataChecksumsShmemInit(void)
+static void
+DataChecksumsShmemRequest(void *arg)
{
- bool found;
-
- DataChecksumState = (DataChecksumsStateStruct *)
- ShmemInitStruct("DataChecksumsWorker Data",
- DataChecksumsShmemSize(),
- &found);
- if (!found)
- MemSet(DataChecksumState, 0, DataChecksumsShmemSize());
+ ShmemRequestStruct(.name = "DataChecksumsWorker Data",
+ .size = sizeof(DataChecksumsStateStruct),
+ .ptr = (void **) &DataChecksumState,
+ );
}
/*
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index fa4bdfe9ab9..0a1a1149d78 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -48,6 +48,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
@@ -154,33 +155,31 @@ static int ready_file_comparator(Datum a, Datum b, void *arg);
static void LoadArchiveLibrary(void);
static void pgarch_call_module_shutdown_cb(int code, Datum arg);
-/* Report shared memory space needed by PgArchShmemInit */
-Size
-PgArchShmemSize(void)
-{
- Size size = 0;
+static void PgArchShmemRequest(void *arg);
+static void PgArchShmemInit(void *arg);
- size = add_size(size, sizeof(PgArchData));
+const ShmemCallbacks PgArchShmemCallbacks = {
+ .request_fn = PgArchShmemRequest,
+ .init_fn = PgArchShmemInit,
+};
- return size;
+/* Register shared memory space needed by the archiver */
+static void
+PgArchShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "Archiver Data",
+ .size = sizeof(PgArchData),
+ .ptr = (void **) &PgArch,
+ );
}
-/* Allocate and initialize archiver-related shared memory */
-void
-PgArchShmemInit(void)
+/* Initialize archiver-related shared memory */
+static void
+PgArchShmemInit(void *arg)
{
- bool found;
-
- PgArch = (PgArchData *)
- ShmemInitStruct("Archiver Data", PgArchShmemSize(), &found);
-
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(PgArch, 0, PgArchShmemSize());
- PgArch->pgprocno = INVALID_PROC_NUMBER;
- pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
- }
+ MemSet(PgArch, 0, sizeof(PgArchData));
+ PgArch->pgprocno = INVALID_PROC_NUMBER;
+ pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
}
/*
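Aside on the pattern these hunks share: each subsystem's old Size/Init pair becomes a request callback that only declares a name, a size and the address of the pointer to fill in, plus an optional init callback that runs once the memory exists. The implementation of ShmemRequestStruct() and ShmemCallbacks is not part of this section, so the following is only a toy model of that two-phase flow. Every name in it (DemoCallbacks, demo_request_struct(), demo_setup(), DemoState) is hypothetical. It also assumes the region is handed over zero-filled, which the removed MemSet() calls suggest but which is not confirmed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; this is not the PostgreSQL API. */
typedef struct DemoCallbacks
{
	void		(*request_fn) (void *arg);
	void		(*init_fn) (void *arg);	/* may be NULL */
} DemoCallbacks;

typedef struct DemoRequest
{
	const char *name;
	size_t		size;
	void	  **ptr;			/* where to store the allocated address */
} DemoRequest;

#define DEMO_MAX_REQUESTS 8
static DemoRequest demo_requests[DEMO_MAX_REQUESTS];
static int	demo_nrequests = 0;

/* Phase 1 helper: record what is wanted, allocate nothing yet. */
static void
demo_request_struct(const char *name, size_t size, void **ptr)
{
	assert(demo_nrequests < DEMO_MAX_REQUESTS);
	demo_requests[demo_nrequests++] = (DemoRequest) {name, size, ptr};
}

/* An example subsystem written in the style of the patch. */
typedef struct DemoState
{
	int			pid;
	int			counter;
} DemoState;

static DemoState *DemoShared = NULL;

static void
demo_state_request(void *arg)
{
	(void) arg;
	demo_request_struct("Demo State", sizeof(DemoState),
						(void **) &DemoShared);
}

static void
demo_state_init(void *arg)
{
	(void) arg;
	/* Only fields with non-zero initial values need explicit setup. */
	DemoShared->pid = -1;
}

static const DemoCallbacks demo_state_callbacks = {
	.request_fn = demo_state_request,
	.init_fn = demo_state_init,
};

/* Driver: collect requests, allocate zeroed memory, then initialize. */
static int
demo_setup(const DemoCallbacks *cb)
{
	int			first = demo_nrequests;

	cb->request_fn(NULL);
	for (int i = first; i < demo_nrequests; i++)
	{
		void	   *mem = calloc(1, demo_requests[i].size);

		if (mem == NULL)
			return -1;
		/* memcpy avoids storing through a void ** that aliases a typed pointer */
		memcpy(demo_requests[i].ptr, &mem, sizeof(mem));
	}
	if (cb->init_fn != NULL)
		cb->init_fn(NULL);
	return 0;
}
```

In this model a subsystem whose state is valid when all-zero needs no init callback at all, which matches the request-only callbacks for DataChecksumsShmemCallbacks above.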
diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c
index a37b3018abf..20960f5b633 100644
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -47,6 +47,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -109,6 +110,14 @@ typedef struct
/* Pointer to shared memory state. */
static WalSummarizerData *WalSummarizerCtl;
+static void WalSummarizerShmemRequest(void *arg);
+static void WalSummarizerShmemInit(void *arg);
+
+const ShmemCallbacks WalSummarizerShmemCallbacks = {
+ .request_fn = WalSummarizerShmemRequest,
+ .init_fn = WalSummarizerShmemInit,
+};
+
/*
* When we reach end of WAL and need to read more, we sleep for a number of
* milliseconds that is an integer multiple of MS_PER_SLEEP_QUANTUM. This is
@@ -168,43 +177,34 @@ static void summarizer_wait_for_wal(void);
static void MaybeRemoveOldWalSummaries(void);
/*
- * Amount of shared memory required for this module.
+ * Register shared memory space needed by this module.
*/
-Size
-WalSummarizerShmemSize(void)
+static void
+WalSummarizerShmemRequest(void *arg)
{
- return sizeof(WalSummarizerData);
+ ShmemRequestStruct(.name = "Wal Summarizer Ctl",
+ .size = sizeof(WalSummarizerData),
+ .ptr = (void **) &WalSummarizerCtl,
+ );
}
/*
- * Create or attach to shared memory segment for this module.
+ * Initialize shared memory for this module.
*/
-void
-WalSummarizerShmemInit(void)
+static void
+WalSummarizerShmemInit(void *arg)
{
- bool found;
-
- WalSummarizerCtl = (WalSummarizerData *)
- ShmemInitStruct("Wal Summarizer Ctl", WalSummarizerShmemSize(),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize.
- *
- * We're just filling in dummy values here -- the real initialization
- * will happen when GetOldestUnsummarizedLSN() is called for the first
- * time.
- */
- WalSummarizerCtl->initialized = false;
- WalSummarizerCtl->summarized_tli = 0;
- WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
- WalSummarizerCtl->lsn_is_exact = false;
- WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
- WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
- ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
- }
+ /*
+ * We're just filling in dummy values here -- the real initialization will
+ * happen when GetOldestUnsummarizedLSN() is called for the first time.
+ */
+ WalSummarizerCtl->initialized = false;
+ WalSummarizerCtl->summarized_tli = 0;
+ WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
+ WalSummarizerCtl->lsn_is_exact = false;
+ WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
+ WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
+ ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
}
/*
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 09964198550..9e75a3e04ee 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -38,6 +38,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -71,6 +72,14 @@ typedef struct LogicalRepCtxStruct
static LogicalRepCtxStruct *LogicalRepCtx;
+static void ApplyLauncherShmemRequest(void *arg);
+static void ApplyLauncherShmemInit(void *arg);
+
+const ShmemCallbacks ApplyLauncherShmemCallbacks = {
+ .request_fn = ApplyLauncherShmemRequest,
+ .init_fn = ApplyLauncherShmemInit,
+};
+
/* an entry in the last-start-times shared hash table */
typedef struct LauncherLastStartTimesEntry
{
@@ -972,11 +981,11 @@ logicalrep_pa_worker_count(Oid subid)
}
/*
- * ApplyLauncherShmemSize
- * Compute space needed for replication launcher shared memory
+ * ApplyLauncherShmemRequest
+ * Register shared memory space needed for replication launcher
*/
-Size
-ApplyLauncherShmemSize(void)
+static void
+ApplyLauncherShmemRequest(void *arg)
{
Size size;
@@ -987,7 +996,10 @@ ApplyLauncherShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_logical_replication_workers,
sizeof(LogicalRepWorker)));
- return size;
+ ShmemRequestStruct(.name = "Logical Replication Launcher Data",
+ .size = size,
+ .ptr = (void **) &LogicalRepCtx,
+ );
}
/*
@@ -1028,35 +1040,23 @@ ApplyLauncherRegister(void)
/*
* ApplyLauncherShmemInit
- * Allocate and initialize replication launcher shared memory
+ * Initialize replication launcher shared memory
*/
-void
-ApplyLauncherShmemInit(void)
+static void
+ApplyLauncherShmemInit(void *arg)
{
- bool found;
+ int slot;
- LogicalRepCtx = (LogicalRepCtxStruct *)
- ShmemInitStruct("Logical Replication Launcher Data",
- ApplyLauncherShmemSize(),
- &found);
+ LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
+ LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
- if (!found)
+ /* Initialize memory and spin locks for each worker slot. */
+ for (slot = 0; slot < max_logical_replication_workers; slot++)
{
- int slot;
-
- memset(LogicalRepCtx, 0, ApplyLauncherShmemSize());
-
- LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
- LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
+ LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
- /* Initialize memory and spin locks for each worker slot. */
- for (slot = 0; slot < max_logical_replication_workers; slot++)
- {
- LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
-
- memset(worker, 0, sizeof(LogicalRepWorker));
- SpinLockInit(&worker->relmutex);
- }
+ memset(worker, 0, sizeof(LogicalRepWorker));
+ SpinLockInit(&worker->relmutex);
}
}
diff --git a/src/backend/replication/logical/logicalctl.c b/src/backend/replication/logical/logicalctl.c
index 4e292951201..72f68ec58ef 100644
--- a/src/backend/replication/logical/logicalctl.c
+++ b/src/backend/replication/logical/logicalctl.c
@@ -72,6 +72,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
/*
@@ -98,6 +99,12 @@ typedef struct LogicalDecodingCtlData
static LogicalDecodingCtlData *LogicalDecodingCtl = NULL;
+static void LogicalDecodingCtlShmemRequest(void *arg);
+
+const ShmemCallbacks LogicalDecodingCtlShmemCallbacks = {
+ .request_fn = LogicalDecodingCtlShmemRequest,
+};
+
/*
* A process-local cache of LogicalDecodingCtl->xlog_logical_info. This is
* initialized at process startup, and updated when processing the process
@@ -120,23 +127,13 @@ static void update_xlog_logical_info(void);
static void abort_logical_decoding_activation(int code, Datum arg);
static void write_logical_decoding_status_update_record(bool status);
-Size
-LogicalDecodingCtlShmemSize(void)
-{
- return sizeof(LogicalDecodingCtlData);
-}
-
-void
-LogicalDecodingCtlShmemInit(void)
+static void
+LogicalDecodingCtlShmemRequest(void *arg)
{
- bool found;
-
- LogicalDecodingCtl = ShmemInitStruct("Logical decoding control",
- LogicalDecodingCtlShmemSize(),
- &found);
-
- if (!found)
- MemSet(LogicalDecodingCtl, 0, LogicalDecodingCtlShmemSize());
+ ShmemRequestStruct(.name = "Logical decoding control",
+ .size = sizeof(LogicalDecodingCtlData),
+ .ptr = (void **) &LogicalDecodingCtl,
+ );
}
/*
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 661d68ad653..372d77c475e 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -88,6 +88,7 @@
#include "storage/fd.h"
#include "storage/ipc.h"
#include "storage/lmgr.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/guc.h"
@@ -176,6 +177,16 @@ ReplOriginXactState replorigin_xact_state = {
*/
static ReplicationState *replication_states;
+static void ReplicationOriginShmemRequest(void *arg);
+static void ReplicationOriginShmemInit(void *arg);
+static void ReplicationOriginShmemAttach(void *arg);
+
+const ShmemCallbacks ReplicationOriginShmemCallbacks = {
+ .request_fn = ReplicationOriginShmemRequest,
+ .init_fn = ReplicationOriginShmemInit,
+ .attach_fn = ReplicationOriginShmemAttach,
+};
+
/*
* Actual shared memory block (replication_states[] is now part of this).
*/
@@ -539,50 +550,48 @@ replorigin_by_oid(ReplOriginId roident, bool missing_ok, char **roname)
* ---------------------------------------------------------------------------
*/
-Size
-ReplicationOriginShmemSize(void)
+static void
+ReplicationOriginShmemRequest(void *arg)
{
Size size = 0;
if (max_active_replication_origins == 0)
- return size;
+ return;
size = add_size(size, offsetof(ReplicationStateCtl, states));
-
size = add_size(size,
mul_size(max_active_replication_origins, sizeof(ReplicationState)));
- return size;
+ ShmemRequestStruct(.name = "ReplicationOriginState",
+ .size = size,
+ .ptr = (void **) &replication_states_ctl,
+ );
}
-void
-ReplicationOriginShmemInit(void)
+static void
+ReplicationOriginShmemInit(void *arg)
{
- bool found;
-
if (max_active_replication_origins == 0)
return;
- replication_states_ctl = (ReplicationStateCtl *)
- ShmemInitStruct("ReplicationOriginState",
- ReplicationOriginShmemSize(),
- &found);
replication_states = replication_states_ctl->states;
- if (!found)
- {
- int i;
+ replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
- MemSet(replication_states_ctl, 0, ReplicationOriginShmemSize());
+ for (int i = 0; i < max_active_replication_origins; i++)
+ {
+ LWLockInitialize(&replication_states[i].lock,
+ replication_states_ctl->tranche_id);
+ ConditionVariableInit(&replication_states[i].origin_cv);
+ }
+}
- replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
+static void
+ReplicationOriginShmemAttach(void *arg)
+{
+ if (max_active_replication_origins == 0)
+ return;
- for (i = 0; i < max_active_replication_origins; i++)
- {
- LWLockInitialize(&replication_states[i].lock,
- replication_states_ctl->tranche_id);
- ConditionVariableInit(&replication_states[i].origin_cv);
- }
- }
+ replication_states = replication_states_ctl->states;
}
/* ---------------------------------------------------------------------------
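Aside on ReplicationOriginShmemAttach() above: replication_states is a process-local pointer derived from replication_states_ctl, so a process that attaches to existing shared memory, rather than creating it, has to recompute it. The sketch below models that in a single process by clearing the derived pointer and calling the attach callback. DemoCtl, DemoEntry, demo_origin_init(), demo_origin_attach() and demo_run() are hypothetical names. When the real attach callback is invoked is not shown in this section, so treat the driver as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical shared struct: a header followed by an array of entries. */
typedef struct DemoEntry
{
	int			id;
} DemoEntry;

typedef struct DemoCtl
{
	int			tranche_id;
	DemoEntry	states[];		/* flexible array member */
} DemoCtl;

static DemoCtl *demo_ctl = NULL;		/* pointer to the shared struct */
static DemoEntry *demo_states = NULL;	/* derived, process-local pointer */
static int	demo_nentries = 0;

/* Runs once, in the process that creates the shared struct. */
static void
demo_origin_init(void *arg)
{
	(void) arg;
	demo_states = demo_ctl->states;
	demo_ctl->tranche_id = 1;
	for (int i = 0; i < demo_nentries; i++)
		demo_states[i].id = i;
}

/* Runs in a process that attaches later: only local state is rebuilt. */
static void
demo_origin_attach(void *arg)
{
	(void) arg;
	demo_states = demo_ctl->states;
}

/* Driver that imitates "create, then attach from a fresh process". */
static int
demo_run(int nentries)
{
	assert(nentries >= 0);
	demo_nentries = nentries;
	demo_ctl = calloc(1, offsetof(DemoCtl, states) +
					  (size_t) nentries * sizeof(DemoEntry));
	if (demo_ctl == NULL)
		return -1;

	demo_origin_init(NULL);

	/* A new process would start without the derived pointer. */
	demo_states = NULL;
	demo_origin_attach(NULL);

	return (demo_states == demo_ctl->states) ? 0 : -1;
}
```

The attach callback deliberately does not touch the shared contents, since those were already set up by the creating process.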
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index e75db69e3f6..d615ff8a81c 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -73,6 +73,7 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -118,6 +119,14 @@ typedef struct SlotSyncCtxStruct
static SlotSyncCtxStruct *SlotSyncCtx = NULL;
+static void SlotSyncShmemRequest(void *arg);
+static void SlotSyncShmemInit(void *arg);
+
+const ShmemCallbacks SlotSyncShmemCallbacks = {
+ .request_fn = SlotSyncShmemRequest,
+ .init_fn = SlotSyncShmemInit,
+};
+
/* GUC variable */
bool sync_replication_slots = false;
@@ -1828,32 +1837,26 @@ IsSyncingReplicationSlots(void)
}
/*
- * Amount of shared memory required for slot synchronization.
+ * Register shared memory space needed for slot synchronization.
*/
-Size
-SlotSyncShmemSize(void)
+static void
+SlotSyncShmemRequest(void *arg)
{
- return sizeof(SlotSyncCtxStruct);
+ ShmemRequestStruct(.name = "Slot Sync Data",
+ .size = sizeof(SlotSyncCtxStruct),
+ .ptr = (void **) &SlotSyncCtx,
+ );
}
/*
- * Allocate and initialize the shared memory of slot synchronization.
+ * Initialize shared memory for slot synchronization.
*/
-void
-SlotSyncShmemInit(void)
+static void
+SlotSyncShmemInit(void *arg)
{
- Size size = SlotSyncShmemSize();
- bool found;
-
- SlotSyncCtx = (SlotSyncCtxStruct *)
- ShmemInitStruct("Slot Sync Data", size, &found);
-
- if (!found)
- {
- memset(SlotSyncCtx, 0, size);
- SlotSyncCtx->pid = InvalidPid;
- SpinLockInit(&SlotSyncCtx->mutex);
- }
+ memset(SlotSyncCtx, 0, sizeof(SlotSyncCtxStruct));
+ SlotSyncCtx->pid = InvalidPid;
+ SpinLockInit(&SlotSyncCtx->mutex);
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index a9092fc2382..21a213a0ebf 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
@@ -145,6 +146,14 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
+static void ReplicationSlotsShmemRequest(void *arg);
+static void ReplicationSlotsShmemInit(void *arg);
+
+const ShmemCallbacks ReplicationSlotsShmemCallbacks = {
+ .request_fn = ReplicationSlotsShmemRequest,
+ .init_fn = ReplicationSlotsShmemInit,
+};
+
/* My backend's replication slot in the shared memory array */
ReplicationSlot *MyReplicationSlot = NULL;
@@ -183,56 +192,41 @@ static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
/*
- * Report shared-memory space needed by ReplicationSlotsShmemInit.
+ * Register shared memory space needed for replication slots.
*/
-Size
-ReplicationSlotsShmemSize(void)
+static void
+ReplicationSlotsShmemRequest(void *arg)
{
- Size size = 0;
+ Size size;
if (max_replication_slots == 0)
- return size;
+ return;
size = offsetof(ReplicationSlotCtlData, replication_slots);
size = add_size(size,
mul_size(max_replication_slots, sizeof(ReplicationSlot)));
-
- return size;
+ ShmemRequestStruct(.name = "ReplicationSlot Ctl",
+ .size = size,
+ .ptr = (void **) &ReplicationSlotCtl,
+ );
}
/*
- * Allocate and initialize shared memory for replication slots.
+ * Initialize shared memory for replication slots.
*/
-void
-ReplicationSlotsShmemInit(void)
+static void
+ReplicationSlotsShmemInit(void *arg)
{
- bool found;
-
- if (max_replication_slots == 0)
- return;
-
- ReplicationSlotCtl = (ReplicationSlotCtlData *)
- ShmemInitStruct("ReplicationSlot Ctl", ReplicationSlotsShmemSize(),
- &found);
-
- if (!found)
+ for (int i = 0; i < max_replication_slots; i++)
{
- int i;
+ ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
- /* First time through, so initialize */
- MemSet(ReplicationSlotCtl, 0, ReplicationSlotsShmemSize());
-
- for (i = 0; i < max_replication_slots; i++)
- {
- ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
-
- /* everything else is zeroed by the memset above */
- slot->active_proc = INVALID_PROC_NUMBER;
- SpinLockInit(&slot->mutex);
- LWLockInitialize(&slot->io_in_progress_lock,
- LWTRANCHE_REPLICATION_SLOT_IO);
- ConditionVariableInit(&slot->active_cv);
- }
+ /* everything else is already zeroed when the struct is allocated */
+ slot->active_proc = INVALID_PROC_NUMBER;
+ SpinLockInit(&slot->mutex);
+ LWLockInitialize(&slot->io_in_progress_lock,
+ LWTRANCHE_REPLICATION_SLOT_IO);
+ ConditionVariableInit(&slot->active_cv);
}
}
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 45b9d4f09f2..4e03e721872 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -29,47 +29,46 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
WalRcvData *WalRcv = NULL;
+static void WalRcvShmemRequest(void *arg);
+static void WalRcvShmemInit(void *arg);
+
+const ShmemCallbacks WalRcvShmemCallbacks = {
+ .request_fn = WalRcvShmemRequest,
+ .init_fn = WalRcvShmemInit,
+};
+
/*
* How long to wait for walreceiver to start up after requesting
* postmaster to launch it. In seconds.
*/
#define WALRCV_STARTUP_TIMEOUT 10
-/* Report shared memory space needed by WalRcvShmemInit */
-Size
-WalRcvShmemSize(void)
+/* Register shared memory space needed by walreceiver */
+static void
+WalRcvShmemRequest(void *arg)
{
- Size size = 0;
-
- size = add_size(size, sizeof(WalRcvData));
-
- return size;
+ ShmemRequestStruct(.name = "Wal Receiver Ctl",
+ .size = sizeof(WalRcvData),
+ .ptr = (void **) &WalRcv,
+ );
}
-/* Allocate and initialize walreceiver-related shared memory */
-void
-WalRcvShmemInit(void)
+/* Initialize walreceiver-related shared memory */
+static void
+WalRcvShmemInit(void *arg)
{
- bool found;
-
- WalRcv = (WalRcvData *)
- ShmemInitStruct("Wal Receiver Ctl", WalRcvShmemSize(), &found);
-
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(WalRcv, 0, WalRcvShmemSize());
- WalRcv->walRcvState = WALRCV_STOPPED;
- ConditionVariableInit(&WalRcv->walRcvStoppedCV);
- SpinLockInit(&WalRcv->mutex);
- pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
- WalRcv->procno = INVALID_PROC_NUMBER;
- }
+ MemSet(WalRcv, 0, sizeof(WalRcvData));
+ WalRcv->walRcvState = WALRCV_STOPPED;
+ ConditionVariableInit(&WalRcv->walRcvStoppedCV);
+ SpinLockInit(&WalRcv->mutex);
+ pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
+ WalRcv->procno = INVALID_PROC_NUMBER;
}
/* Is walreceiver running (or starting up)? */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2bb3f34dc6d..ec39942bfc1 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -86,6 +86,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/dest.h"
#include "tcop/tcopprot.h"
#include "utils/acl.h"
@@ -117,6 +118,14 @@
/* Array of WalSnds in shared memory */
WalSndCtlData *WalSndCtl = NULL;
+static void WalSndShmemRequest(void *arg);
+static void WalSndShmemInit(void *arg);
+
+const ShmemCallbacks WalSndShmemCallbacks = {
+ .request_fn = WalSndShmemRequest,
+ .init_fn = WalSndShmemInit,
+};
+
/* My slot in the shared memory array */
WalSnd *MyWalSnd = NULL;
@@ -3765,47 +3774,37 @@ WalSndSignals(void)
pqsignal(SIGCHLD, SIG_DFL);
}
-/* Report shared-memory space needed by WalSndShmemInit */
-Size
-WalSndShmemSize(void)
+/* Register shared-memory space needed by walsender */
+static void
+WalSndShmemRequest(void *arg)
{
- Size size = 0;
+ Size size;
size = offsetof(WalSndCtlData, walsnds);
size = add_size(size, mul_size(max_wal_senders, sizeof(WalSnd)));
-
- return size;
+ ShmemRequestStruct(.name = "Wal Sender Ctl",
+ .size = size,
+ .ptr = (void **) &WalSndCtl,
+ );
}
-/* Allocate and initialize walsender-related shared memory */
-void
-WalSndShmemInit(void)
+/* Initialize walsender-related shared memory */
+static void
+WalSndShmemInit(void *arg)
{
- bool found;
- int i;
+ for (int i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
+ dlist_init(&(WalSndCtl->SyncRepQueue[i]));
- WalSndCtl = (WalSndCtlData *)
- ShmemInitStruct("Wal Sender Ctl", WalSndShmemSize(), &found);
-
- if (!found)
+ for (int i = 0; i < max_wal_senders; i++)
{
- /* First time through, so initialize */
- MemSet(WalSndCtl, 0, WalSndShmemSize());
-
- for (i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
- dlist_init(&(WalSndCtl->SyncRepQueue[i]));
-
- for (i = 0; i < max_wal_senders; i++)
- {
- WalSnd *walsnd = &WalSndCtl->walsnds[i];
-
- SpinLockInit(&walsnd->mutex);
- }
+ WalSnd *walsnd = &WalSndCtl->walsnds[i];
- ConditionVariableInit(&WalSndCtl->wal_flush_cv);
- ConditionVariableInit(&WalSndCtl->wal_replay_cv);
- ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
+ SpinLockInit(&walsnd->mutex);
}
+
+ ConditionVariableInit(&WalSndCtl->wal_flush_cv);
+ ConditionVariableInit(&WalSndCtl->wal_replay_cv);
+ ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index f64c1d59fa3..bf6b81e621b 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -14,41 +14,16 @@
*/
#include "postgres.h"
-#include "access/clog.h"
-#include "access/commit_ts.h"
-#include "access/multixact.h"
-#include "access/nbtree.h"
-#include "access/subtrans.h"
-#include "access/syncscan.h"
-#include "access/twophase.h"
-#include "access/xlogprefetcher.h"
-#include "access/xlogrecovery.h"
-#include "access/xlogwait.h"
-#include "commands/async.h"
#include "miscadmin.h"
#include "pgstat.h"
-#include "postmaster/autovacuum.h"
-#include "postmaster/bgworker_internals.h"
-#include "postmaster/bgwriter.h"
-#include "postmaster/datachecksum_state.h"
-#include "postmaster/walsummarizer.h"
-#include "replication/logicallauncher.h"
-#include "replication/origin.h"
-#include "replication/slot.h"
-#include "replication/slotsync.h"
-#include "replication/walreceiver.h"
-#include "replication/walsender.h"
-#include "storage/aio_subsys.h"
#include "storage/dsm.h"
#include "storage/ipc.h"
+#include "storage/lock.h"
#include "storage/pg_shmem.h"
-#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/shmem_internal.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
-#include "utils/injection_point.h"
-#include "utils/wait_event.h"
/* GUCs */
int shared_memory_type = DEFAULT_SHARED_MEMORY_TYPE;
@@ -57,8 +32,6 @@ shmem_startup_hook_type shmem_startup_hook = NULL;
static Size total_addin_request = 0;
-static void CreateOrAttachShmemStructs(void);
-
/*
* RequestAddinShmemSpace
* Request that extra shmem space be allocated for use by
@@ -97,33 +70,6 @@ CalculateShmemSize(void)
size = 100000;
size = add_size(size, ShmemGetRequestedSize());
- /* legacy subsystems */
- size = add_size(size, LockManagerShmemSize());
- size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, XLOGShmemSize());
- size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, TwoPhaseShmemSize());
- size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, CheckpointerShmemSize());
- size = add_size(size, AutoVacuumShmemSize());
- size = add_size(size, ReplicationSlotsShmemSize());
- size = add_size(size, ReplicationOriginShmemSize());
- size = add_size(size, WalSndShmemSize());
- size = add_size(size, WalRcvShmemSize());
- size = add_size(size, WalSummarizerShmemSize());
- size = add_size(size, PgArchShmemSize());
- size = add_size(size, ApplyLauncherShmemSize());
- size = add_size(size, BTreeShmemSize());
- size = add_size(size, SyncScanShmemSize());
- size = add_size(size, StatsShmemSize());
- size = add_size(size, WaitEventCustomShmemSize());
- size = add_size(size, InjectionPointShmemSize());
- size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, WaitLSNShmemSize());
- size = add_size(size, LogicalDecodingCtlShmemSize());
- size = add_size(size, DataChecksumsShmemSize());
-
/* include additional requested shmem from preload libraries */
size = add_size(size, total_addin_request);
@@ -157,7 +103,6 @@ AttachSharedMemoryStructs(void)
/* Establish pointers to all shared memory areas in this backend */
ShmemAttachRequested();
- CreateOrAttachShmemStructs();
/*
* Now give loadable modules a chance to set up their shmem allocations
@@ -204,9 +149,6 @@ CreateSharedMemoryAndSemaphores(void)
/* Initialize all shmem areas */
ShmemInitRequested();
- /* Initialize legacy subsystems */
- CreateOrAttachShmemStructs();
-
/* Initialize dynamic shared memory facilities. */
dsm_postmaster_startup(shim);
@@ -237,70 +179,6 @@ RegisterBuiltinShmemCallbacks(void)
#undef PG_SHMEM_SUBSYSTEM
}
-/*
- * Initialize various subsystems, setting up their data structures in
- * shared memory.
- *
- * This is called by the postmaster or by a standalone backend.
- * It is also called by a backend forked from the postmaster in the
- * EXEC_BACKEND case. In the latter case, the shared memory segment
- * already exists and has been physically attached to, but we have to
- * initialize pointers in local memory that reference the shared structures,
- * because we didn't inherit the correct pointer values from the postmaster
- * as we do in the fork() scenario. The easiest way to do that is to run
- * through the same code as before. (Note that the called routines mostly
- * check IsUnderPostmaster, rather than EXEC_BACKEND, to detect this case.
- * This is a bit code-wasteful and could be cleaned up.)
- */
-static void
-CreateOrAttachShmemStructs(void)
-{
- /*
- * Set up xlog, clog, and buffers
- */
- XLOGShmemInit();
- XLogPrefetchShmemInit();
- XLogRecoveryShmemInit();
-
- /*
- * Set up lock manager
- */
- LockManagerShmemInit();
-
- /*
- * Set up process table
- */
- BackendStatusShmemInit();
- TwoPhaseShmemInit();
- BackgroundWorkerShmemInit();
-
- /*
- * Set up interprocess signaling mechanisms
- */
- CheckpointerShmemInit();
- AutoVacuumShmemInit();
- ReplicationSlotsShmemInit();
- ReplicationOriginShmemInit();
- WalSndShmemInit();
- WalRcvShmemInit();
- WalSummarizerShmemInit();
- PgArchShmemInit();
- ApplyLauncherShmemInit();
- SlotSyncShmemInit();
- DataChecksumsShmemInit();
-
- /*
- * Set up other modules that need some shared memory space
- */
- BTreeShmemInit();
- SyncScanShmemInit();
- StatsShmemInit();
- WaitEventCustomShmemInit();
- InjectionPointShmemInit();
- WaitLSNShmemInit();
- LogicalDecodingCtlShmemInit();
-}
-
/*
* InitializeShmemGUCs
*
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 798c453ab38..68d5a0389df 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -43,8 +43,10 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/resowner.h"
@@ -312,6 +314,14 @@ typedef struct
static volatile FastPathStrongRelationLockData *FastPathStrongRelationLocks;
+static void LockManagerShmemRequest(void *arg);
+static void LockManagerShmemInit(void *arg);
+
+const ShmemCallbacks LockManagerShmemCallbacks = {
+ .request_fn = LockManagerShmemRequest,
+ .init_fn = LockManagerShmemInit,
+};
+
/*
* Pointers to hash tables containing lock state
@@ -409,6 +419,7 @@ PROCLOCK_PRINT(const char *where, const PROCLOCK *proclockP)
static uint32 proclock_hash(const void *key, Size keysize);
+
static void RemoveLocalLock(LOCALLOCK *locallock);
static PROCLOCK *SetupLockInTable(LockMethod lockMethodTable, PGPROC *proc,
const LOCKTAG *locktag, uint32 hashcode, LOCKMODE lockmode);
@@ -432,21 +443,15 @@ static void GetSingleProcBlockerStatusData(PGPROC *blocked_proc,
/*
- * Initialize the lock manager's shmem data structures.
+ * Register the lock manager's shmem data structures.
*
- * This is called from CreateSharedMemoryAndSemaphores(), which see for more
- * comments. In the normal postmaster case, the shared hash tables are
- * created here, and backends inherit pointers to them via fork(). In the
- * EXEC_BACKEND case, each backend re-executes this code to obtain pointers to
- * the already existing shared hash tables. In either case, each backend must
- * also call InitLockManagerAccess() to create the locallock hash table.
+ * In addition, each backend must call InitLockManagerAccess() to create the
+ * locallock hash table.
*/
-void
-LockManagerShmemInit(void)
+static void
+LockManagerShmemRequest(void *arg)
{
- HASHCTL info;
int64 max_table_size;
- bool found;
/*
* Compute sizes for lock hashtables. Note that these calculations must
@@ -455,45 +460,48 @@ LockManagerShmemInit(void)
max_table_size = NLOCKENTS();
/*
- * Allocate hash table for LOCK structs. This stores per-locked-object
+ * Hash table for LOCK structs. This stores per-locked-object
* information.
*/
- info.keysize = sizeof(LOCKTAG);
- info.entrysize = sizeof(LOCK);
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodLockHash = ShmemInitHash("LOCK hash",
- max_table_size,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ ShmemRequestHash(.name = "LOCK hash",
+ .nelems = max_table_size,
+ .ptr = &LockMethodLockHash,
+ .hash_info.keysize = sizeof(LOCKTAG),
+ .hash_info.entrysize = sizeof(LOCK),
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION,
+ );
/* Assume an average of 2 holders per lock */
max_table_size *= 2;
- /*
- * Allocate hash table for PROCLOCK structs. This stores
- * per-lock-per-holder information.
- */
- info.keysize = sizeof(PROCLOCKTAG);
- info.entrysize = sizeof(PROCLOCK);
- info.hash = proclock_hash;
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodProcLockHash = ShmemInitHash("PROCLOCK hash",
- max_table_size,
- &info,
- HASH_ELEM | HASH_FUNCTION |
- HASH_FIXED_SIZE | HASH_PARTITION);
+ ShmemRequestHash(.name = "PROCLOCK hash",
+ .nelems = max_table_size,
+ .ptr = &LockMethodProcLockHash,
+ .hash_info.keysize = sizeof(PROCLOCKTAG),
+ .hash_info.entrysize = sizeof(PROCLOCK),
+ .hash_info.hash = proclock_hash,
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION,
+ );
+
+ ShmemRequestStruct(.name = "Fast Path Strong Relation Lock Data",
+ .size = sizeof(FastPathStrongRelationLockData),
+ .ptr = (void **) (void *) &FastPathStrongRelationLocks,
+ );
/*
- * Allocate fast-path structures.
+ * FIXME: we used to do this in the size calculation:
+ *
+ * // Since NLOCKENTS is only an estimate, add 10% safety margin.
+ * size = add_size(size, size / 10);
*/
- FastPathStrongRelationLocks =
- ShmemInitStruct("Fast Path Strong Relation Lock Data",
- sizeof(FastPathStrongRelationLockData), &found);
- if (!found)
- SpinLockInit(&FastPathStrongRelationLocks->mutex);
+}
+
+static void
+LockManagerShmemInit(void *arg)
+{
+ SpinLockInit(&FastPathStrongRelationLocks->mutex);
}
/*
@@ -3758,29 +3766,6 @@ PostPrepare_Locks(FullTransactionId fxid)
}
-/*
- * Estimate shared-memory space used for lock tables
- */
-Size
-LockManagerShmemSize(void)
-{
- Size size = 0;
- long max_table_size;
-
- /* lock hash table */
- max_table_size = NLOCKENTS();
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(LOCK)));
-
- /* proclock hash table */
- max_table_size *= 2;
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(PROCLOCK)));
-
- /* fast-path structures */
- size = add_size(size, sizeof(FastPathStrongRelationLockData));
-
- return size;
-}
-
/*
* GetLockStatusData - Return a summary of the lock manager's internal
* status, for use in a user-level reporting function.
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index cd087129469..4cb9c80a2c5 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -18,7 +18,9 @@
#include "pgstat.h"
#include "storage/ipc.h"
#include "storage/proc.h" /* for MyProc */
#include "storage/procarray.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/ascii.h"
#include "utils/guc.h" /* for application_name */
#include "utils/memutils.h"
@@ -73,133 +75,97 @@ static void pgstat_beshutdown_hook(int code, Datum arg);
static void pgstat_read_current_status(void);
static void pgstat_setup_backend_status_context(void);
+static void BackendStatusShmemRequest(void *arg);
+static void BackendStatusShmemInit(void *arg);
+static void BackendStatusShmemAttach(void *arg);
+
+const ShmemCallbacks BackendStatusShmemCallbacks = {
+ .request_fn = BackendStatusShmemRequest,
+ .init_fn = BackendStatusShmemInit,
+ .attach_fn = BackendStatusShmemAttach,
+};
/*
- * Report shared-memory space needed by BackendStatusShmemInit.
+ * Register shared memory needs for backend status reporting.
*/
-Size
-BackendStatusShmemSize(void)
+static void
+BackendStatusShmemRequest(void *arg)
{
- Size size;
-
- /* BackendStatusArray: */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- /* BackendAppnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendClientHostnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendActivityBuffer: */
- size = add_size(size,
- mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
+ ShmemRequestStruct(.name = "Backend Status Array",
+ .size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendStatusArray,
+ );
+
+ ShmemRequestStruct(.name = "Backend Application Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendAppnameBuffer,
+ );
+
+ ShmemRequestStruct(.name = "Backend Client Host Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendClientHostnameBuffer,
+ );
+
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+ ShmemRequestStruct(.name = "Backend Activity Buffer",
+ .size = BackendActivityBufferSize,
+ .ptr = (void **) &BackendActivityBuffer,
+ );
+
#ifdef USE_SSL
- /* BackendSslStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots));
+ ShmemRequestStruct(.name = "Backend SSL Status Buffer",
+ .size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendSslStatusBuffer,
+ );
#endif
+
#ifdef ENABLE_GSS
- /* BackendGssStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots));
+ ShmemRequestStruct(.name = "Backend GSS Status Buffer",
+ .size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendGssStatusBuffer,
+ );
#endif
- return size;
}
/*
* Initialize the shared status array and several string buffers
* during postmaster startup.
*/
-void
-BackendStatusShmemInit(void)
+static void
+BackendStatusShmemInit(void *arg)
{
- Size size;
- bool found;
int i;
char *buffer;
- /* Create or attach to the shared array */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- BackendStatusArray = (PgBackendStatus *)
- ShmemInitStruct("Backend Status Array", size, &found);
-
- if (!found)
+ /* Initialize st_appname pointers. */
+ buffer = BackendAppnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- /*
- * We're the first - initialize.
- */
- MemSet(BackendStatusArray, 0, size);
- }
-
- /* Create or attach to the shared appname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendAppnameBuffer = (char *)
- ShmemInitStruct("Backend Application Name Buffer", size, &found);
-
- if (!found)
- {
- MemSet(BackendAppnameBuffer, 0, size);
-
- /* Initialize st_appname pointers. */
- buffer = BackendAppnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_appname = buffer;
- buffer += NAMEDATALEN;
- }
+ BackendStatusArray[i].st_appname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared client hostname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendClientHostnameBuffer = (char *)
- ShmemInitStruct("Backend Client Host Name Buffer", size, &found);
-
- if (!found)
+ /* Initialize st_clienthostname pointers. */
+ buffer = BackendClientHostnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendClientHostnameBuffer, 0, size);
-
- /* Initialize st_clienthostname pointers. */
- buffer = BackendClientHostnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_clienthostname = buffer;
- buffer += NAMEDATALEN;
- }
+ BackendStatusArray[i].st_clienthostname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared activity buffer */
- BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
- NumBackendStatSlots);
- BackendActivityBuffer = (char *)
- ShmemInitStruct("Backend Activity Buffer",
- BackendActivityBufferSize,
- &found);
-
- if (!found)
+ /* Initialize st_activity pointers. */
+ buffer = BackendActivityBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendActivityBuffer, 0, BackendActivityBufferSize);
-
- /* Initialize st_activity pointers. */
- buffer = BackendActivityBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_activity_raw = buffer;
- buffer += pgstat_track_activity_query_size;
- }
+ BackendStatusArray[i].st_activity_raw = buffer;
+ buffer += pgstat_track_activity_query_size;
}
#ifdef USE_SSL
- /* Create or attach to the shared SSL status buffer */
- size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots);
- BackendSslStatusBuffer = (PgBackendSSLStatus *)
- ShmemInitStruct("Backend SSL Status Buffer", size, &found);
-
- if (!found)
{
PgBackendSSLStatus *ptr;
- MemSet(BackendSslStatusBuffer, 0, size);
-
/* Initialize st_sslstatus pointers. */
ptr = BackendSslStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -211,17 +177,9 @@ BackendStatusShmemInit(void)
#endif
#ifdef ENABLE_GSS
- /* Create or attach to the shared GSSAPI status buffer */
- size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots);
- BackendGssStatusBuffer = (PgBackendGSSStatus *)
- ShmemInitStruct("Backend GSS Status Buffer", size, &found);
-
- if (!found)
{
PgBackendGSSStatus *ptr;
- MemSet(BackendGssStatusBuffer, 0, size);
-
/* Initialize st_gssstatus pointers. */
ptr = BackendGssStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -233,6 +191,13 @@ BackendStatusShmemInit(void)
#endif
}
+static void
+BackendStatusShmemAttach(void *arg)
+{
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+}
+
/*
* Initialize pgstats backend activity state, and set up our on-proc-exit
* hook. Called from InitPostgres and AuxiliaryProcessMain. MyProcNumber must
diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c
index 33fbdca9609..955faf5ebc7 100644
--- a/src/backend/utils/activity/pgstat_shmem.c
+++ b/src/backend/utils/activity/pgstat_shmem.c
@@ -14,6 +14,7 @@
#include "pgstat.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
@@ -57,6 +58,13 @@ static void pgstat_release_matching_entry_refs(bool discard_pending, ReleaseMatc
static void pgstat_setup_memcxt(void);
+static void StatsShmemRequest(void *arg);
+static void StatsShmemInit(void *arg);
+
+const ShmemCallbacks StatsShmemCallbacks = {
+ .request_fn = StatsShmemRequest,
+ .init_fn = StatsShmemInit,
+};
/* parameter for the shared hash */
static const dshash_parameters dsh_params = {
@@ -123,7 +131,7 @@ pgstat_dsa_init_size(void)
/*
* Compute shared memory space needed for cumulative statistics
*/
-Size
+static Size
StatsShmemSize(void)
{
Size sz;
@@ -149,102 +157,98 @@ StatsShmemSize(void)
return sz;
}
+/*
+ * Register shared memory area for cumulative statistics
+ */
+static void
+StatsShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "Shared Memory Stats",
+ .size = StatsShmemSize(),
+ .ptr = (void **) &pgStatLocal.shmem,
+ );
+}
+
/*
* Initialize cumulative statistics system during startup
*/
-void
-StatsShmemInit(void)
+static void
+StatsShmemInit(void *arg)
{
- bool found;
- Size sz;
+ dsa_area *dsa;
+ dshash_table *dsh;
+ PgStat_ShmemControl *ctl = pgStatLocal.shmem;
+ char *p = (char *) ctl;
- sz = StatsShmemSize();
- pgStatLocal.shmem = (PgStat_ShmemControl *)
- ShmemInitStruct("Shared Memory Stats", sz, &found);
+ /* the allocation of pgStatLocal.shmem itself */
+ p += MAXALIGN(sizeof(PgStat_ShmemControl));
- if (!IsUnderPostmaster)
- {
- dsa_area *dsa;
- dshash_table *dsh;
- PgStat_ShmemControl *ctl = pgStatLocal.shmem;
- char *p = (char *) ctl;
+ /*
+ * Create a small dsa allocation in plain shared memory. This is required
+ * because postmaster cannot use dsm segments. It also provides a small
+ * efficiency win.
+ */
+ ctl->raw_dsa_area = p;
+ dsa = dsa_create_in_place(ctl->raw_dsa_area,
+ pgstat_dsa_init_size(),
+ LWTRANCHE_PGSTATS_DSA, NULL);
+ dsa_pin(dsa);
- Assert(!found);
+ /*
+ * To ensure dshash is created in "plain" shared memory, temporarily limit
+ * size of dsa to the initial size of the dsa.
+ */
+ dsa_set_size_limit(dsa, pgstat_dsa_init_size());
- /* the allocation of pgStatLocal.shmem itself */
- p += MAXALIGN(sizeof(PgStat_ShmemControl));
+ /*
+ * With the limit in place, create the dshash table. XXX: It'd be nice if
+ * there were dshash_create_in_place().
+ */
+ dsh = dshash_create(dsa, &dsh_params, NULL);
+ ctl->hash_handle = dshash_get_hash_table_handle(dsh);
- /*
- * Create a small dsa allocation in plain shared memory. This is
- * required because postmaster cannot use dsm segments. It also
- * provides a small efficiency win.
- */
- ctl->raw_dsa_area = p;
- dsa = dsa_create_in_place(ctl->raw_dsa_area,
- pgstat_dsa_init_size(),
- LWTRANCHE_PGSTATS_DSA, NULL);
- dsa_pin(dsa);
+ /* lift limit set above */
+ dsa_set_size_limit(dsa, -1);
- /*
- * To ensure dshash is created in "plain" shared memory, temporarily
- * limit size of dsa to the initial size of the dsa.
- */
- dsa_set_size_limit(dsa, pgstat_dsa_init_size());
+ /*
+ * Postmaster will never access these again, thus free the local
+ * dsa/dshash references.
+ */
+ dshash_detach(dsh);
+ dsa_detach(dsa);
- /*
- * With the limit in place, create the dshash table. XXX: It'd be nice
- * if there were dshash_create_in_place().
- */
- dsh = dshash_create(dsa, &dsh_params, NULL);
- ctl->hash_handle = dshash_get_hash_table_handle(dsh);
+ pg_atomic_init_u64(&ctl->gc_request_count, 1);
- /* lift limit set above */
- dsa_set_size_limit(dsa, -1);
+ /* Do the per-kind initialization */
+ for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ {
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+ char *ptr;
- /*
- * Postmaster will never access these again, thus free the local
- * dsa/dshash references.
- */
- dshash_detach(dsh);
- dsa_detach(dsa);
+ if (!kind_info)
+ continue;
- pg_atomic_init_u64(&ctl->gc_request_count, 1);
+ /* initialize entry count tracking */
+ if (kind_info->track_entry_count)
+ pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
- /* Do the per-kind initialization */
- for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ /* initialize fixed-numbered stats */
+ if (kind_info->fixed_amount)
{
- const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
- char *ptr;
-
- if (!kind_info)
- continue;
-
- /* initialize entry count tracking */
- if (kind_info->track_entry_count)
- pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
-
- /* initialize fixed-numbered stats */
- if (kind_info->fixed_amount)
+ if (pgstat_is_kind_builtin(kind))
+ ptr = ((char *) ctl) + kind_info->shared_ctl_off;
+ else
{
- if (pgstat_is_kind_builtin(kind))
- ptr = ((char *) ctl) + kind_info->shared_ctl_off;
- else
- {
- int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
-
- Assert(kind_info->shared_size != 0);
- ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
- ptr = ctl->custom_data[idx];
- }
-
- kind_info->init_shmem_cb(ptr);
+ int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
+
+ Assert(kind_info->shared_size != 0);
+ ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
+ ptr = ctl->custom_data[idx];
}
+
+ kind_info->init_shmem_cb(ptr);
}
}
- else
- {
- Assert(found);
- }
}
void
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index 2b76967776c..95635c7f56c 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -25,6 +25,7 @@
#include "storage/lmgr.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/wait_event.h"
@@ -95,59 +96,47 @@ static WaitEventCustomCounterData *WaitEventCustomCounter;
static uint32 WaitEventCustomNew(uint32 classId, const char *wait_event_name);
static const char *GetWaitEventCustomIdentifier(uint32 wait_event_info);
+static void WaitEventCustomShmemRequest(void *arg);
+static void WaitEventCustomShmemInit(void *arg);
+
+const ShmemCallbacks WaitEventCustomShmemCallbacks = {
+ .request_fn = WaitEventCustomShmemRequest,
+ .init_fn = WaitEventCustomShmemInit,
+};
+
/*
- * Return the space for dynamic shared hash tables and dynamic allocation counter.
+ * Register shmem space for dynamic shared hash and dynamic allocation counter.
*/
-Size
-WaitEventCustomShmemSize(void)
+static void
+WaitEventCustomShmemRequest(void *arg)
{
- Size sz;
-
- sz = MAXALIGN(sizeof(WaitEventCustomCounterData));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_SIZE,
- sizeof(WaitEventCustomEntryByInfo)));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_SIZE,
- sizeof(WaitEventCustomEntryByName)));
- return sz;
+ ShmemRequestStruct(.name = "WaitEventCustomCounterData",
+ .size = sizeof(WaitEventCustomCounterData),
+ .ptr = (void **) &WaitEventCustomCounter,
+ );
+ ShmemRequestHash(.name = "WaitEventCustom hash by wait event information",
+ .ptr = &WaitEventCustomHashByInfo,
+ .nelems = WAIT_EVENT_CUSTOM_HASH_SIZE,
+ .hash_info.keysize = sizeof(uint32),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByInfo),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ );
+ ShmemRequestHash(.name = "WaitEventCustom hash by name",
+ .ptr = &WaitEventCustomHashByName,
+ .nelems = WAIT_EVENT_CUSTOM_HASH_SIZE,
+ /* key is a NUL-terminated string */
+ .hash_info.keysize = sizeof(char[NAMEDATALEN]),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByName),
+ .hash_flags = HASH_ELEM | HASH_STRINGS,
+ );
}
-/*
- * Allocate shmem space for dynamic shared hash and dynamic allocation counter.
- */
-void
-WaitEventCustomShmemInit(void)
+static void
+WaitEventCustomShmemInit(void *arg)
{
- bool found;
- HASHCTL info;
-
- WaitEventCustomCounter = (WaitEventCustomCounterData *)
- ShmemInitStruct("WaitEventCustomCounterData",
- sizeof(WaitEventCustomCounterData), &found);
-
- if (!found)
- {
- /* initialize the allocation counter and its spinlock. */
- WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
- SpinLockInit(&WaitEventCustomCounter->mutex);
- }
-
- /* initialize or attach the hash tables to store custom wait events */
- info.keysize = sizeof(uint32);
- info.entrysize = sizeof(WaitEventCustomEntryByInfo);
- WaitEventCustomHashByInfo =
- ShmemInitHash("WaitEventCustom hash by wait event information",
- WAIT_EVENT_CUSTOM_HASH_SIZE,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- /* key is a NULL-terminated string */
- info.keysize = sizeof(char[NAMEDATALEN]);
- info.entrysize = sizeof(WaitEventCustomEntryByName);
- WaitEventCustomHashByName =
- ShmemInitHash("WaitEventCustom hash by name",
- WAIT_EVENT_CUSTOM_HASH_SIZE,
- &info,
- HASH_ELEM | HASH_STRINGS);
+ /* initialize the allocation counter and its spinlock. */
+ WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
+ SpinLockInit(&WaitEventCustomCounter->mutex);
}
/*
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index c06b0e9b800..a7c99e097ea 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -17,6 +17,7 @@
*/
#include "postgres.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
#ifdef USE_INJECTION_POINTS
@@ -109,6 +110,11 @@ typedef struct InjectionPointCacheEntry
static HTAB *InjectionPointCache = NULL;
+#ifdef USE_INJECTION_POINTS
+static void InjectionPointShmemRequest(void *arg);
+static void InjectionPointShmemInit(void *arg);
+#endif
+
/*
* injection_point_cache_add
*
@@ -226,45 +232,34 @@ injection_point_cache_get(const char *name)
}
#endif /* USE_INJECTION_POINTS */
-/*
- * Return the space for dynamic shared hash table.
- */
-Size
-InjectionPointShmemSize(void)
-{
+const ShmemCallbacks InjectionPointShmemCallbacks = {
#ifdef USE_INJECTION_POINTS
- Size sz = 0;
-
- sz = add_size(sz, sizeof(InjectionPointsCtl));
- return sz;
-#else
- return 0;
+ .request_fn = InjectionPointShmemRequest,
+ .init_fn = InjectionPointShmemInit,
#endif
-}
+};
/*
- * Allocate shmem space for dynamic shared hash.
+ * Reserve space for the dynamic shared hash table
*/
-void
-InjectionPointShmemInit(void)
-{
#ifdef USE_INJECTION_POINTS
- bool found;
+static void
+InjectionPointShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "InjectionPoint hash",
+ .size = sizeof(InjectionPointsCtl),
+ .ptr = (void **) &ActiveInjectionPoints,
+ );
+}
- ActiveInjectionPoints = ShmemInitStruct("InjectionPoint hash",
- sizeof(InjectionPointsCtl),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
- for (int i = 0; i < MAX_INJECTION_POINTS; i++)
- pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
- }
- else
- Assert(found);
-#endif
+static void
+InjectionPointShmemInit(void *arg)
+{
+ pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
+ for (int i = 0; i < MAX_INJECTION_POINTS; i++)
+ pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
}
+#endif
/*
* Attach a new injection point.
diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index da7503c57b6..3097e9bb1af 100644
--- a/src/include/access/nbtree.h
+++ b/src/include/access/nbtree.h
@@ -1300,8 +1300,6 @@ extern BTCycleId _bt_vacuum_cycleid(Relation rel);
extern BTCycleId _bt_start_vacuum(Relation rel);
extern void _bt_end_vacuum(Relation rel);
extern void _bt_end_vacuum_callback(int code, Datum arg);
-extern Size BTreeShmemSize(void);
-extern void BTreeShmemInit(void);
extern bytea *btoptions(Datum reloptions, bool validate);
extern bool btproperty(Oid index_oid, int attno,
IndexAMProperty prop, const char *propname,
diff --git a/src/include/access/syncscan.h b/src/include/access/syncscan.h
index 24cf33294e5..32f8332aaee 100644
--- a/src/include/access/syncscan.h
+++ b/src/include/access/syncscan.h
@@ -24,7 +24,5 @@ extern PGDLLIMPORT bool trace_syncscan;
extern void ss_report_location(Relation rel, BlockNumber location);
extern BlockNumber ss_get_location(Relation rel, BlockNumber relnblocks);
-extern void SyncScanShmemInit(void);
-extern Size SyncScanShmemSize(void);
#endif
diff --git a/src/include/access/twophase.h b/src/include/access/twophase.h
index 761d56a5f3d..1d2ff42c9b7 100644
--- a/src/include/access/twophase.h
+++ b/src/include/access/twophase.h
@@ -33,9 +33,6 @@ typedef struct GlobalTransactionData *GlobalTransaction;
/* GUC variable */
extern PGDLLIMPORT int max_prepared_xacts;
-extern Size TwoPhaseShmemSize(void);
-extern void TwoPhaseShmemInit(void);
-
extern void AtAbort_Twophase(void);
extern void PostPrepare_Twophase(void);
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 4af38e74ce4..437b4f32349 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -259,8 +259,6 @@ extern void InitLocalDataChecksumState(void);
extern void SetLocalDataChecksumState(uint32 data_checksum_version);
extern bool GetDefaultCharSignedness(void);
extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
-extern Size XLOGShmemSize(void);
-extern void XLOGShmemInit(void);
extern void BootStrapXLOG(uint32 data_checksum_version);
extern void InitializeWalConsistencyChecking(void);
extern void LocalProcessControlFile(bool reset);
diff --git a/src/include/access/xlogprefetcher.h b/src/include/access/xlogprefetcher.h
index 7ec40c4b78b..56a81676d92 100644
--- a/src/include/access/xlogprefetcher.h
+++ b/src/include/access/xlogprefetcher.h
@@ -34,9 +34,6 @@ typedef struct XLogPrefetcher XLogPrefetcher;
extern void XLogPrefetchReconfigure(void);
-extern size_t XLogPrefetchShmemSize(void);
-extern void XLogPrefetchShmemInit(void);
-
extern void XLogPrefetchResetStats(void);
extern XLogPrefetcher *XLogPrefetcherAllocate(XLogReaderState *reader);
diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h
index 2842106b285..ba7750dca0b 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -153,9 +153,6 @@ extern PGDLLIMPORT bool reachedConsistency;
/* Are we currently in standby mode? */
extern PGDLLIMPORT bool StandbyMode;
-extern Size XLogRecoveryShmemSize(void);
-extern void XLogRecoveryShmemInit(void);
-
extern void InitWalRecovery(ControlFileData *ControlFile,
bool *wasShutdown_ptr, bool *haveBackupLabel_ptr,
bool *haveTblspcMap_ptr);
diff --git a/src/include/access/xlogwait.h b/src/include/access/xlogwait.h
index d12531d32b8..07157f220ea 100644
--- a/src/include/access/xlogwait.h
+++ b/src/include/access/xlogwait.h
@@ -100,8 +100,6 @@ typedef struct WaitLSNState
extern PGDLLIMPORT WaitLSNState *waitLSNState;
-extern Size WaitLSNShmemSize(void);
-extern void WaitLSNShmemInit(void);
extern XLogRecPtr GetCurrentLSNForWaitType(WaitLSNType lsnType);
extern void WaitLSNWakeup(WaitLSNType lsnType, XLogRecPtr currentLSN);
extern void WaitLSNCleanup(void);
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 8e3549c3752..2786a7c5ffb 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -541,10 +541,6 @@ typedef struct PgStat_BackendPending
* Functions in pgstat.c
*/
-/* functions called from postmaster */
-extern Size StatsShmemSize(void);
-extern void StatsShmemInit(void);
-
/* Functions called during server startup / shutdown */
extern void pgstat_restore_stats(void);
extern void pgstat_discard_stats(void);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index b21d111d4d5..8954f6b28ee 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,8 +66,4 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
-/* shared memory stuff */
-extern Size AutoVacuumShmemSize(void);
-extern void AutoVacuumShmemInit(void);
-
#endif /* AUTOVACUUM_H */
diff --git a/src/include/postmaster/bgworker_internals.h b/src/include/postmaster/bgworker_internals.h
index b789caf4034..b6261bc01df 100644
--- a/src/include/postmaster/bgworker_internals.h
+++ b/src/include/postmaster/bgworker_internals.h
@@ -41,8 +41,6 @@ typedef struct RegisteredBgWorker
extern PGDLLIMPORT dlist_head BackgroundWorkerList;
-extern Size BackgroundWorkerShmemSize(void);
-extern void BackgroundWorkerShmemInit(void);
extern void BackgroundWorkerStateChange(bool allow_new_workers);
extern void ForgetBackgroundWorker(RegisteredBgWorker *rw);
extern void ReportBackgroundWorkerPID(RegisteredBgWorker *rw);
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 47470cba893..36eea0b1ab0 100644
--- a/src/include/postmaster/bgwriter.h
+++ b/src/include/postmaster/bgwriter.h
@@ -39,9 +39,6 @@ extern bool ForwardSyncRequest(const FileTag *ftag, SyncRequestType type);
extern void AbsorbSyncRequests(void);
-extern Size CheckpointerShmemSize(void);
-extern void CheckpointerShmemInit(void);
-
extern bool FirstCallSinceLastCheckpoint(void);
#endif /* _BGWRITER_H */
diff --git a/src/include/postmaster/datachecksum_state.h b/src/include/postmaster/datachecksum_state.h
index 343494edcc8..05625539604 100644
--- a/src/include/postmaster/datachecksum_state.h
+++ b/src/include/postmaster/datachecksum_state.h
@@ -17,10 +17,6 @@
#include "storage/procsignal.h"
-/* Shared memory */
-extern Size DataChecksumsShmemSize(void);
-extern void DataChecksumsShmemInit(void);
-
/* Possible operations the Datachecksumsworker can perform */
typedef enum DataChecksumsWorkerOperation
{
diff --git a/src/include/postmaster/pgarch.h b/src/include/postmaster/pgarch.h
index faa7609cd81..9772bb573a1 100644
--- a/src/include/postmaster/pgarch.h
+++ b/src/include/postmaster/pgarch.h
@@ -26,8 +26,6 @@
#define MAX_XFN_CHARS 40
#define VALID_XFN_CHARS "0123456789ABCDEF.history.backup.partial"
-extern Size PgArchShmemSize(void);
-extern void PgArchShmemInit(void);
extern bool PgArchCanRestart(void);
pg_noreturn extern void PgArchiverMain(const void *startup_data, size_t startup_data_len);
extern void PgArchWakeup(void);
diff --git a/src/include/postmaster/walsummarizer.h b/src/include/postmaster/walsummarizer.h
index a4c055066b4..b9a755fadbc 100644
--- a/src/include/postmaster/walsummarizer.h
+++ b/src/include/postmaster/walsummarizer.h
@@ -19,8 +19,6 @@
extern PGDLLIMPORT bool summarize_wal;
extern PGDLLIMPORT int wal_summary_keep_time;
-extern Size WalSummarizerShmemSize(void);
-extern void WalSummarizerShmemInit(void);
pg_noreturn extern void WalSummarizerMain(const void *startup_data, size_t startup_data_len);
extern void GetWalSummarizerState(TimeLineID *summarized_tli,
diff --git a/src/include/replication/logicalctl.h b/src/include/replication/logicalctl.h
index 495554c532c..0bc1302f130 100644
--- a/src/include/replication/logicalctl.h
+++ b/src/include/replication/logicalctl.h
@@ -14,8 +14,6 @@
#ifndef LOGICALCTL_H
#define LOGICALCTL_H
-extern Size LogicalDecodingCtlShmemSize(void);
-extern void LogicalDecodingCtlShmemInit(void);
extern void StartupLogicalDecodingStatus(bool last_status);
extern void InitializeProcessXLogLogicalInfo(void);
extern bool ProcessBarrierUpdateXLogLogicalInfo(void);
diff --git a/src/include/replication/logicallauncher.h b/src/include/replication/logicallauncher.h
index 504b710536a..5f0c1b9c682 100644
--- a/src/include/replication/logicallauncher.h
+++ b/src/include/replication/logicallauncher.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT int max_parallel_apply_workers_per_subscription;
extern void ApplyLauncherRegister(void);
extern void ApplyLauncherMain(Datum main_arg);
-extern Size ApplyLauncherShmemSize(void);
-extern void ApplyLauncherShmemInit(void);
-
extern void ApplyLauncherForgetWorkerStartTime(Oid subid);
extern void ApplyLauncherWakeupAtCommit(void);
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index eb46b41b4b7..a69faf6eaaf 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -84,8 +84,4 @@ extern void replorigin_redo(XLogReaderState *record);
extern void replorigin_desc(StringInfo buf, XLogReaderState *record);
extern const char *replorigin_identify(uint8 info);
-/* shared memory allocation */
-extern Size ReplicationOriginShmemSize(void);
-extern void ReplicationOriginShmemInit(void);
-
#endif /* PG_ORIGIN_H */
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 4b4709f6e2c..1a3557de607 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -327,10 +327,6 @@ extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
extern PGDLLIMPORT int idle_replication_slot_timeout_secs;
-/* shmem initialization functions */
-extern Size ReplicationSlotsShmemSize(void);
-extern void ReplicationSlotsShmemInit(void);
-
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
diff --git a/src/include/replication/slotsync.h b/src/include/replication/slotsync.h
index e546d0d050d..d2121cd3ed7 100644
--- a/src/include/replication/slotsync.h
+++ b/src/include/replication/slotsync.h
@@ -31,8 +31,6 @@ pg_noreturn extern void ReplSlotSyncWorkerMain(const void *startup_data, size_t
extern void ShutDownSlotSync(void);
extern bool SlotSyncWorkerCanRestart(void);
extern bool IsSyncingReplicationSlots(void);
-extern Size SlotSyncShmemSize(void);
-extern void SlotSyncShmemInit(void);
extern void SyncReplicationSlots(WalReceiverConn *wrconn);
#endif /* SLOTSYNC_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 85d24c87298..47c07574d4d 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -491,8 +491,6 @@ pg_noreturn extern void WalReceiverMain(const void *startup_data, size_t startup
extern void WalRcvRequestApplyReply(void);
/* prototypes for functions in walreceiverfuncs.c */
-extern Size WalRcvShmemSize(void);
-extern void WalRcvShmemInit(void);
extern void ShutdownWalRcv(void);
extern bool WalRcvStreaming(void);
extern bool WalRcvRunning(void);
diff --git a/src/include/replication/walsender.h b/src/include/replication/walsender.h
index a4df3b8e0ae..8952c848d19 100644
--- a/src/include/replication/walsender.h
+++ b/src/include/replication/walsender.h
@@ -41,8 +41,6 @@ extern void WalSndErrorCleanup(void);
extern void PhysicalWakeupLogicalWalSnd(void);
extern XLogRecPtr GetStandbyFlushRecPtr(TimeLineID *tli);
extern void WalSndSignals(void);
-extern Size WalSndShmemSize(void);
-extern void WalSndShmemInit(void);
extern void WalSndWakeup(bool physical, bool logical);
extern void WalSndInitStopping(void);
extern void WalSndWaitStopping(void);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index fa68e6ecece..ee3cb1dc203 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -375,8 +375,6 @@ typedef enum
/*
* function prototypes
*/
-extern void LockManagerShmemInit(void);
-extern Size LockManagerShmemSize(void);
extern void InitLockManagerAccess(void);
extern LockMethod GetLocksMethodTable(const LOCK *lock);
extern LockMethod GetLockTagsMethodTable(const LOCKTAG *locktag);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index d8e11756a61..5e092552c72 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -32,6 +32,9 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogPrefetchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogRecoveryShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
@@ -40,12 +43,18 @@ PG_SHMEM_SUBSYSTEM(BufferManagerShmemCallbacks)
PG_SHMEM_SUBSYSTEM(StrategyCtlShmemCallbacks)
PG_SHMEM_SUBSYSTEM(BufTableShmemCallbacks)
+/* lock manager */
+PG_SHMEM_SUBSYSTEM(LockManagerShmemCallbacks)
+
/* predicate lock manager */
PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackendStatusShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(TwoPhaseShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackgroundWorkerShmemCallbacks)
/* shared-inval messaging */
PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
@@ -53,9 +62,27 @@ PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
/* interprocess signaling mechanisms */
PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CheckpointerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(AutoVacuumShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationSlotsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationOriginShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSndShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalRcvShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSummarizerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(PgArchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ApplyLauncherShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SlotSyncShmemCallbacks)
/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(BTreeShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SyncScanShmemCallbacks)
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StatsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WaitLSNShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(LogicalDecodingCtlShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(DataChecksumsShmemCallbacks)
/* AIO subsystem. This delegates to the method-specific callbacks */
PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index ddd06304e97..a334e096e4a 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -298,14 +298,6 @@ extern PGDLLIMPORT int pgstat_track_activity_query_size;
extern PGDLLIMPORT PgBackendStatus *MyBEEntry;
-/* ----------
- * Functions called from postmaster
- * ----------
- */
-extern Size BackendStatusShmemSize(void);
-extern void BackendStatusShmemInit(void);
-
-
/* ----------
* Functions called from backends
* ----------
diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index 27a2526524f..fabd1455c3c 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -46,9 +46,6 @@ typedef void (*InjectionPointCallback) (const char *name,
const void *private_data,
void *arg);
-extern Size InjectionPointShmemSize(void);
-extern void InjectionPointShmemInit(void);
-
extern void InjectionPointAttach(const char *name,
const char *library,
const char *function,
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 34c27cc3dc3..86ee348220d 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -42,8 +42,6 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
extern uint32 WaitEventExtensionNew(const char *wait_event_name);
extern uint32 WaitEventInjectionPointNew(const char *wait_event_name);
-extern void WaitEventCustomShmemInit(void);
-extern Size WaitEventCustomShmemSize(void);
extern char **GetWaitEventCustomNames(uint32 classId, int *nwaitevents);
/* ----------
diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index d59c5ad0582..0f1af513673 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -107,9 +107,13 @@ extern PGDLLEXPORT void injection_wait(const char *name,
/* track if injection points attached in this process are linked to it */
static bool injection_point_local = false;
-/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void injection_shmem_request(void *arg);
+static void injection_shmem_init(void *arg);
+
+static const ShmemCallbacks injection_shmem_callbacks = {
+ .request_fn = injection_shmem_request,
+ .init_fn = injection_shmem_init,
+};
/*
* Routine for shared memory area initialization, used as a callback
@@ -126,44 +130,23 @@ injection_point_init_state(void *ptr, void *arg)
ConditionVariableInit(&state->wait_point);
}
-/* Shared memory initialization when loading module */
static void
-injection_shmem_request(void)
+injection_shmem_request(void *arg)
{
- Size size;
-
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- size = MAXALIGN(sizeof(InjectionPointSharedState));
- RequestAddinShmemSpace(size);
+ ShmemRequestStruct(.name = "injection_points",
+ .size = sizeof(InjectionPointSharedState),
+ .ptr = (void **) &inj_state,
+ );
}
static void
-injection_shmem_startup(void)
+injection_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_state = ShmemInitStruct("injection_points",
- sizeof(InjectionPointSharedState),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. This is shared with the dynamic
- * initialization using a DSM.
- */
- injection_point_init_state(inj_state, NULL);
- }
-
- LWLockRelease(AddinShmemInitLock);
+ /*
+ * First time through, so initialize. This is shared with the dynamic
+ * initialization using a DSM.
+ */
+ injection_point_init_state(inj_state, NULL);
}
/*
@@ -601,9 +584,5 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- /* Shared memory initialization */
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = injection_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = injection_shmem_startup;
+ RegisterShmemCallbacks(&injection_shmem_callbacks);
}
diff --git a/src/test/modules/test_aio/test_aio.c b/src/test/modules/test_aio/test_aio.c
index d7530681192..35efba1a5e3 100644
--- a/src/test/modules/test_aio/test_aio.c
+++ b/src/test/modules/test_aio/test_aio.c
@@ -28,7 +28,6 @@
#include "storage/bufmgr.h"
#include "storage/checksum.h"
#include "storage/condition_variable.h"
-#include "storage/ipc.h"
#include "storage/lwlock.h"
#include "storage/proc.h"
#include "storage/procnumber.h"
@@ -44,6 +43,7 @@
PG_MODULE_MAGIC;
+/* In shared memory */
typedef struct InjIoErrorState
{
ConditionVariable cv;
@@ -74,8 +74,15 @@ typedef struct BlocksReadStreamData
static InjIoErrorState *inj_io_error_state;
/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void test_aio_shmem_request(void *arg);
+static void test_aio_shmem_init(void *arg);
+static void test_aio_shmem_attach(void *arg);
+
+static const ShmemCallbacks inj_io_shmem_callbacks = {
+ .request_fn = test_aio_shmem_request,
+ .init_fn = test_aio_shmem_init,
+ .attach_fn = test_aio_shmem_attach,
+};
static PgAioHandle *last_handle;
@@ -83,70 +90,55 @@ static PgAioHandle *last_handle;
static void
-test_aio_shmem_request(void)
+test_aio_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- RequestAddinShmemSpace(sizeof(InjIoErrorState));
+ ShmemRequestStruct(.name = "test_aio injection points",
+ .size = sizeof(InjIoErrorState),
+ .ptr = (void **) &inj_io_error_state,
+ );
}
static void
-test_aio_shmem_startup(void)
+test_aio_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_io_error_state = ShmemInitStruct("injection_points",
- sizeof(InjIoErrorState),
- &found);
-
- if (!found)
- {
- /* First time through, initialize */
- inj_io_error_state->enabled_short_read = false;
- inj_io_error_state->enabled_reopen = false;
- inj_io_error_state->enabled_completion_wait = false;
+ /* First time through, initialize */
+ inj_io_error_state->enabled_short_read = false;
+ inj_io_error_state->enabled_reopen = false;
+ inj_io_error_state->enabled_completion_wait = false;
- ConditionVariableInit(&inj_io_error_state->cv);
- inj_io_error_state->completion_wait_event = WaitEventInjectionPointNew("completion_wait");
+ ConditionVariableInit(&inj_io_error_state->cv);
+ inj_io_error_state->completion_wait_event = WaitEventInjectionPointNew("completion_wait");
#ifdef USE_INJECTION_POINTS
- InjectionPointAttach("aio-process-completion-before-shared",
- "test_aio",
- "inj_io_completion_hook",
- NULL,
- 0);
- InjectionPointLoad("aio-process-completion-before-shared");
-
- InjectionPointAttach("aio-worker-after-reopen",
- "test_aio",
- "inj_io_reopen",
- NULL,
- 0);
- InjectionPointLoad("aio-worker-after-reopen");
+ InjectionPointAttach("aio-process-completion-before-shared",
+ "test_aio",
+ "inj_io_completion_hook",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-process-completion-before-shared");
+
+ InjectionPointAttach("aio-worker-after-reopen",
+ "test_aio",
+ "inj_io_reopen",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-worker-after-reopen");
#endif
- }
- else
- {
- /*
- * Pre-load the injection points now, so we can call them in a
- * critical section.
- */
+}
+
+static void
+test_aio_shmem_attach(void *arg)
+{
+ /*
+ * Pre-load the injection points now, so we can call them in a critical
+ * section.
+ */
#ifdef USE_INJECTION_POINTS
- InjectionPointLoad("aio-process-completion-before-shared");
- InjectionPointLoad("aio-worker-after-reopen");
- elog(LOG, "injection point loaded");
+ InjectionPointLoad("aio-process-completion-before-shared");
+ InjectionPointLoad("aio-worker-after-reopen");
+ elog(LOG, "injection point loaded");
#endif
- }
-
- LWLockRelease(AddinShmemInitLock);
}
void
@@ -155,10 +147,7 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_aio_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_aio_shmem_startup;
+ RegisterShmemCallbacks(&inj_io_shmem_callbacks);
}
--
2.47.3
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-04 23:17 Matthias van de Meent <[email protected]>
parent: Heikki Linnakangas <[email protected]>
1 sibling, 3 replies; 75+ messages in thread
From: Matthias van de Meent @ 2026-04-04 23:17 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sat, 4 Apr 2026 at 19:32, Heikki Linnakangas <[email protected]> wrote:
>
> On 04/04/2026 15:00, Matthias van de Meent wrote:
> > On Sat, 4 Apr 2026 at 02:45, Heikki Linnakangas <[email protected]> wrote:
> >>>> I don't understand the use of ShmemStructDesc. They generally/always
> >>>> are private to request_fn(), and their fields are used exclusively
> >>>> inside the shmem mechanisms, with no reads of its fields that can't
> >>>> already be deduced from context. Why do we need that struct
> >>>> everywhere?
> >>>
> >>> My resizable shared memory structure patches use it as a handle to the
> >>> structure to be resized.
> >>
> >> Right. And hash tables and SLRUs use a desc-like object already, so for
> >> symmetry it feels natural to have it for plain structs too.
> >> I wonder if we should make it optional though, for the common case that
> >> you have no intention of doing anything more with the shmem region that
> >> you'd need a desc for. I'm thinking you could just pass NULL for the
> >> desc pointer:
> >>
> >> ShmemRequestStruct(NULL,
> >> .name = "pg_stat_statements",
> >> .size = sizeof(pgssSharedState),
> >> .ptr = (void **) &pgss,
> >> );
> >
> > That would help, though I'd still wonder why we'd have separate Opts
> > and Desc structs. IIUC, they generally carry (exactly) the same data.
> >
> > Maybe moving it into a `.handle` or `.desc` field in Shmem*Opts could
> > make that part of the code a bit cleaner; as it'd further clarify that
> > it's very much an optional field.
>
> Yeah. OTOH, I'd like to separate the options from what's effectively a
> return value. But maybe you're right and it's nevertheless better that way.
>
> Some options on this:
>
> a) What's in the patch now
[...]
> b) Allow passing NULL for the desc
[...]
> c) Return the Desc as a return value
[...]
> In option c) you can just throw away the result if you don't need it. I
> kind of like this as a notational thing. However it has some downsides:
>
> This changes the return value to be a pointer. I'm thinking that
> ShmemRequestStruct() palloc's the descriptor struct in TopMemoryContext.
> This is a little ugly because the descriptor struct is leaked if the
> caller throws it away. It's not a lot of memory, but still.
Yeah, it'd be bad if we leaked it, as it could cause semi-permanent
memory leaks when the server keeps restarting after a crash without
resetting TopMemoryContext.
> d) Make it part of Opts, as you suggested
[...]
> In the attached new version, though, I stepped back and decided to
> remove the whole ShmemStructDesc after all. I still think having a
> handle like that is a good idea, and the follow-up patches for resizing
> need it. However, with option d) it can easily be added later. With
> option d), it seems silly to have it be part of the patch now, when the
> desc struct doesn't really do anything.
Thanks!
> Other changes in this patch version:
>
> - I moved some of the stuff from shmem.h to a new shmem_internal.h
> header. The idea is that what remains in shmem.h provides the public API
> for allocating shared memory.
>
> - I refactored the "after-startup request" code. It now detects the case
> that some of the shmem areas, but not all, have already been initialized
> and throws an error.
>
> Still processing the rest of the feedback from the past days. This patch
> version is also available at
> https://github.com/hlinnaka/postgres/tree/shmem-init-refactor-11.
Thanks!
> On Sat, 4 Apr 2026 at 02:49, Heikki Linnakangas <[email protected]> wrote:
> > Those are now committed, and here's a new version rebased over those
> > changes. The hash options is now called 'nelems', and the 'extra_size'
> > in ShmemStructOpts is gone.
> >
> > Plus a bunch of other fixes and cleanups. I also reordered and
> > re-grouped the patches a little, into more logical increments I hope.
>
> 0004-0014: TBD
Review continued, based on v11.
0004: LGTM, with some nits:
> + * This is called at postmaster startup. Note that the shared memory isn't
> + * allocated here yet, this merely register our needs.
Typo: register -> registers
Formatting:
> + ShmemRequestHash(.name = "pg_stat_statements hash",
> + .nelems = pgss_max,
> + .hash_info.keysize = sizeof(pgssHashKey),
> + .hash_info.entrysize = sizeof(pgssEntry),
> + .hash_flags = HASH_ELEM | HASH_BLOBS,
> + .ptr = &pgss_hash,
> + );
(note the additional unit of indentation before the closing parenthesis)
Is this misformatting caused by pgindent? If so, could you see if
there's a better way of defining ShmemRequestHash/Struct that doesn't
produce this indentation?
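
For reference, a minimal sketch of how a call with designated
initializers and a trailing comma can be accepted by the compiler,
assuming the request macro wraps its arguments in a compound literal.
Everything prefixed Demo/demo_ is hypothetical and only mirrors the
field names quoted above; it is not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct; fields mirror the quoted calls. */
typedef struct DemoStructOpts
{
    const char *name;
    size_t      size;
    void      **ptr;
} DemoStructOpts;

/* Records the most recent request so that it can be inspected. */
static DemoStructOpts demo_last_request;

static void
demo_request_struct_with_opts(const DemoStructOpts *opts)
{
    demo_last_request = *opts;
}

/*
 * The variadic arguments become the initializer list of a compound
 * literal.  A trailing comma in the call leaves an empty last macro
 * argument, which expands to a trailing comma inside the braces; C99
 * allows both.
 */
#define DemoRequestStruct(...) \
    demo_request_struct_with_opts(&(DemoStructOpts){__VA_ARGS__})

typedef struct DemoSharedState
{
    int         counter;
} DemoSharedState;

static DemoSharedState *demo_state;

static void
demo_shmem_request(void)
{
    DemoRequestStruct(.name = "demo",
                      .size = sizeof(DemoSharedState),
                      .ptr = (void **) &demo_state,
        );
}
```

With a definition of this shape, dropping the trailing comma and
closing the parenthesis on the last argument line compiles equally
well, so the call-site layout is a formatting choice rather than a
language requirement.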
> + pgss->extent = 0;
> + pgss->n_writers = 0;
> + pgss->gc_count = 0;
> + pgss->stats.dealloc = 0;
Shmem is said to be zero-initialized; should we remove the manual
zero-initialization?
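
A small sketch of what that would look like, assuming the area really
is handed to the init callback zero-filled. calloc() stands in for the
shared memory allocator here, and the struct, its nonzero_default
field and the demo_ functions are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical state; the first four fields mirror the quoted hunk. */
typedef struct DemoSharedState
{
    size_t      extent;
    int         n_writers;
    int         gc_count;
    long        dealloc;
    double      nonzero_default;
} DemoSharedState;

/* Stand-in for an allocator that returns zero-filled memory. */
static DemoSharedState *
demo_alloc_zeroed(void)
{
    return calloc(1, sizeof(DemoSharedState));
}

/* Only fields whose initial value is not zero need an assignment. */
static void
demo_init(DemoSharedState *state)
{
    state->nonzero_default = 1.0;
}
```

If the zero-fill guarantee does not hold in some code path, the
explicit assignments are still needed, so this only applies when that
guarantee is documented.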
> + on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
See my upthread comment about adding optional on_shmem_exit callbacks
to ShmemCallbacks.
0005: LGTM
0006: I don't think it is a great idea to make the LWLock machinery
the first to get allocation requests:
It has the RequestNamedLWLockTranche infrastructure, which can only
register new requests while process_shmem_requests_in_progress is set,
and making it request its memory ahead of everything else is likely to
cause an undersized tranche to be allocated. You could make sure that
this isn't an issue by maintaining a flag in lwlock.c that's set when
the shmem request is made (and reset on shmem exit). The flag must be
false when RequestNamedLWLockTranche() is called; if it is not,
RequestNamedLWLockTranche() should throw an error.
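
Roughly, the guard could look like the sketch below. This is a
standalone model, not lwlock.c code: all demo_ names are hypothetical,
and a false return value stands in for throwing an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Set once the LWLock shmem request has been sized. */
static bool demo_lwlock_shmem_requested = false;

/* Total number of locks asked for by named tranche requests. */
static int  demo_locks_requested = 0;

/*
 * Model of a named tranche request.  It refuses requests that arrive
 * after the size has been computed, because the allocation would then
 * be too small.
 */
static bool
demo_request_named_tranche(const char *name, int num_lwlocks)
{
    (void) name;                /* unused in this model */

    if (demo_lwlock_shmem_requested)
        return false;           /* would be an ERROR in the real thing */

    demo_locks_requested += num_lwlocks;
    return true;
}

/* Model of the request callback: freeze the size and set the flag. */
static int
demo_lwlock_shmem_request(void)
{
    demo_lwlock_shmem_requested = true;
    return demo_locks_requested;
}

/* Model of the shmem exit path: allow requests again. */
static void
demo_lwlock_shmem_exit(void)
{
    demo_lwlock_shmem_requested = false;
    demo_locks_requested = 0;
}
```

The point of the flag is only to turn a silent undersizing into a
visible failure; it does not by itself fix the ordering.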
0007: LGTM, Nits:
Patch description: ProgGlobal -> ProcGlobal
> %_sema.c
Not my favourite piece of code, but that's not your patch's fault. To
me, this code area seems to have too much duplication, but that's not
something you have to fix.
0008: LGTM
> +#ifdef USE_ASSERT_CHECKING
> + SerialPagePrecedesLogicallyUnitTests();
> +#endif
Huh, interesting. I hadn't seen such inline unit testing before.
Mailing list history seems to agree with this, so, TIL.
0009: LGTM
0010: Not looked at everything yet, but a few comments:
> +++ b/src/include/access/slru.h
With the changes in the signatures for most/all SLRU functions from a
hidden-by-typedef pointer to a visible pointer type, maybe this could
be an opportunity to swap them to `const SlruDesc *ctl` wherever
possible? I don't think there are many backend-local changes that
happen to SlruDescs once we've properly started the backend. I'm happy
to provide an incremental patch for that if you're too busy to spend
cycles on it.
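
To illustrate the idea, a tiny sketch with a hypothetical descriptor;
DemoSlruDesc and the demo_ functions are made up and do not reflect
the real SlruDesc layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor, filled in once during setup. */
typedef struct DemoSlruDesc
{
    const char *name;
    int         nslots;
} DemoSlruDesc;

/* Setup is the only place that needs a writable pointer. */
static void
demo_slru_setup(DemoSlruDesc *ctl, const char *name, int nslots)
{
    ctl->name = name;
    ctl->nslots = nslots;
}

/*
 * Read-only users take a pointer to const, so the compiler rejects an
 * accidental write such as "ctl->nslots = 0" in this function.
 */
static int
demo_slru_num_slots(const DemoSlruDesc *ctl)
{
    return ctl->nslots;
}
```

Functions that do need to modify the descriptor would keep the
non-const signature, which also documents which calls can change it.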
> +++ b/src/backend/access/transam/clog.c
> + SimpleLruRequest(.desc = &XactSlruDesc,
> + .name = "transaction",
> + .Dir = "pg_xact",
> + .long_segment_names = false,
> +
> + .nslots = CLOGShmemBuffers(),
> + .nlsns = CLOG_LSNS_PER_PAGE,
> +
> + .sync_handler = SYNC_HANDLER_CLOG,
> + .PagePrecedes = CLOGPagePrecedes,
> + .errdetail_for_io_error = clog_errdetail_for_io_error,
That awfully inconsistent field name styling is ... awful, but not
this patch's fault. If something can be done about it in a cheap
fashion in this patch, that'd be great, but I won't hold it against
you if that's skipped.
> +++ b/src/backend/access/transam/multixact.c
> static void
> MultiXactShmemRequest(void *arg)
> [...]
> + /*
> + * members SLRU doesn't call SimpleLruTruncate() or meet criteria for unit
> + * tests
> + */
I think this comment is misplaced; it should probably be put in
MultiXactShmemInit(), below MultiXactOffset's UnitTests (which is just
a few lines below its current location).
The rest of 0010; all of 0011-0014: TBD
Kind regards,
Matthias van de Meent
Databricks (https://www.databricks.com)
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 05:48 Ashutosh Bapat <[email protected]>
parent: Heikki Linnakangas <[email protected]>
1 sibling, 2 replies; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-05 05:48 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sat, Apr 4, 2026 at 11:02 PM Heikki Linnakangas <[email protected]> wrote:
>
> On 04/04/2026 15:00, Matthias van de Meent wrote:
> > On Sat, 4 Apr 2026 at 02:45, Heikki Linnakangas <[email protected]> wrote:
> >>>> I don't understand the use of ShmemStructDesc. They generally/always
> >>>> are private to request_fn(), and their fields are used exclusively
> >>>> inside the shmem mechanisms, with no reads of its fields that can't
> >>>> already be deduced from context. Why do we need that struct
> >>>> everywhere?
> >>>
> >>> My resizable shared memory structure patches use it as a handle to the
> >>> structure to be resized.
> >>
> >> Right. And hash tables and SLRUs use a desc-like object already, so for
> >> symmetry it feels natural to have it for plain structs too.
> >> I wonder if we should make it optional though, for the common case that
> >> you have no intention of doing anything more with the shmem region that
> >> you'd need a desc for. I'm thinking you could just pass NULL for the
> >> desc pointer:
> >>
> >> ShmemRequestStruct(NULL,
> >> .name = "pg_stat_statements",
> >> .size = sizeof(pgssSharedState),
> >> .ptr = (void **) &pgss,
> >> );
> >
> > That would help, though I'd still wonder why we'd have separate Opts
> > and Desc structs. IIUC, they generally carry (exactly) the same data.
> >
> > Maybe moving it into a `.handle` or `.desc` field in Shmem*Opts could
> > make that part of the code a bit cleaner; as it'd further clarify that
> > it's very much an optional field.
>
> Yeah. OTOH, I'd like to separate the options from what's effectively a
> return value. But maybe you're right and it's nevertheless better that way.
>
> Some options on this:
>
> a) What's in the patch now
>
> static ShmemStructDesc pgssSharedStateDesc;
>
> ShmemRequestStruct(&pgssSharedStateDesc,
> .name = "pg_stat_statements",
> .size = sizeof(pgssSharedState),
> .ptr = (void **) &pgss);
>
> b) Allow passing NULL for the desc
>
> ShmemRequestStruct(NULL,
> .name = "pg_stat_statements",
> .size = sizeof(pgssSharedState),
> .ptr = (void **) &pgss);
>
> c) Return the Desc as a return value
>
> static ShmemStructDesc *pgssSharedStateDesc;
>
> pgssSharedStateDesc =
> ShmemRequestStruct(.name = "pg_stat_statements",
> .size = sizeof(pgssSharedState),
> .ptr = (void **) &pgss);
>
> In option c) you can just throw away the result if you don't need it. I
> kind of like this as a notational thing. However it has some downsides:
>
> This changes the return value to be a pointer. I'm thinking that
> ShmemRequestStruct() palloc's the descriptor struct in TopMemoryContext.
> This is a little ugly because the descriptor struct is leaked if the
> caller throws it away. It's not a lot of memory, but still.
>
> I'm also not sure how well this fits in with the SLRU code. On 'master',
> you already have SlruCtlData which is like the "desc" struct. Would we
> turn that into a pointer too, adding one indirection to all the SLRU
> calls? It's probably fine from a performance point of view, but it feels
> like it's going in the wrong direction.
>
> d) Make it part of Opts, as you suggested
>
> static ShmemStructDesc pgssSharedStateDesc;
>
> ShmemRequestStruct(.name = "pg_stat_statements",
> .size = sizeof(pgssSharedState),
> .ptr = (void **) &pgss,
> .desc = &pgssSharedStateDesc);
>
> In the attached new version, though, I stepped back and decided to
> remove the whole ShmemStructDesc after all. I still think having a
> handle like that is a good idea, and the follow-up patches for resizing
> need it. However, with option d) it can easily be added later. With
> option d), it seems silly to have it be part of the patch now, when the
> desc struct doesn't really do anything. SLRU's still have a similar
> SlruDesc struct, however. For SLRUs it's essentially the same as the old
> SlruCtlData struct before these patches.
>
> The Desc structs were being used for one thing though: I used the 'size'
> from the Desc struct in ProcGlobalShmemInit() to get the allocated size
> of each shmem area. The size computation there is complicated enough
> that I'd rather not repeat it, and avoiding the repeated size
> calculation was the raison d'être for these patches. I replaced it with
> global variables to hold the sizes from the ShmemRequest() step to
> ShmemInit(). But that would be one case where having the desc would
> already be useful. Then again, I'm not sure we want to expose the 'size'
> in the descriptor like that anyway, because as soon as we make shmem
> regions resizable, we might not be able to keep the size in the
> descriptor up-to-date. The size of these structs won't change, but we
> might not want to expose the information because it would be confusing
> for other structs where it can change to show outdated information.
>
> On a related note, when we add back the ".desc" concept later, is
> ".desc" a good name, or ".handle" as you also suggested? More widely, do
> we call the concept and the struct a "handle" or "descriptor" or what?
> Or if we follow the precedent of the existing SlruCtlData struct, it
> could be ".ctl". I'm not a fan of the "Ctl" naming though, because we
> already have a lot of structs with "Ctl" in the name and it's not always
> clear whether a "Ctl" struct refers to the shared memory parts or the
> handle to it. Now that the "desc" structs are not part of these patches
> anymore, however, we can punt on that decision.
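For readers wondering how option d) could look mechanically, here is a minimal sketch. It assumes the named-argument call syntax is built from a variadic macro and a C99 compound literal; every Demo* name is hypothetical and none of this is taken from the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; not the patch's actual definitions. */
typedef struct DemoStructDesc
{
    const char *name;
    size_t      size;
} DemoStructDesc;

typedef struct DemoStructOpts
{
    const char *name;
    size_t      size;
    void      **ptr;            /* where the struct's address would be stored */
    DemoStructDesc *desc;       /* optional handle; NULL when not wanted */
} DemoStructOpts;

static void
demo_request_struct_opts(const DemoStructOpts *opts)
{
    assert(opts->name != NULL && opts->size > 0);

    /* Fill in the handle only if the caller asked for one. */
    if (opts->desc != NULL)
    {
        opts->desc->name = opts->name;
        opts->desc->size = opts->size;
    }
}

/* Fields that are not named in the call are zero-initialized. */
#define DemoRequestStruct(...) \
    demo_request_struct_opts(&(DemoStructOpts){__VA_ARGS__})
```

With this shape, a caller that has no use for the handle simply leaves out .desc, and adding the field later does not disturb existing call sites.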
Resizing patches can do without Desc; they use the name as the handle
instead. I was not comfortable with the current state of Desc either,
because the structs are not opaque, as I had pointed out earlier: a
caller can scribble on them. There is no need to decide on the handle
right now, even for the resizing patches. If we decide to add a handle, I
would like it to be opaque. I thought about using ShmemIndexEnt *
itself as the opaque pointer; we shouldn't expose to the users of
shmem.c that it's a ShmemIndexEnt *, though. The downside is that we
would be putting a much riskier handle in the shmem.c users' hands - they
could now corrupt shared memory itself. We could encapsulate the
ShmemIndexEnt * the way HTAB encapsulates HASHHDR, if needed.
The advantage of this approach is that ShmemResizeStruct(), or any
shmem.c API accepting the handle, doesn't need to perform a ShmemIndex
lookup. Just ideas, nothing required right now.
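A rough sketch of the opaque-handle idea follows. It assumes a toy two-entry index in ordinary process memory; the Demo* types and demo_* functions are invented for illustration and are not shmem.c APIs. Callers see only an incomplete type, so they cannot scribble on the entry, and the resize call needs no name lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ---- what a header would expose (hypothetical names) ---- */
typedef struct DemoShmemHandle DemoShmemHandle;    /* incomplete type */

extern DemoShmemHandle *demo_get_handle(const char *name);
extern int  demo_resize(DemoShmemHandle *handle, size_t new_size);
extern size_t demo_current_size(const DemoShmemHandle *handle);

/* ---- what only the implementation file would see ---- */
typedef struct DemoIndexEnt
{
    const char *name;
    size_t      size;
    size_t      maximum_size;   /* 0 means fixed-size in this sketch */
} DemoIndexEnt;

struct DemoShmemHandle
{
    DemoIndexEnt *entry;        /* never exposed to callers */
};

static DemoIndexEnt demo_index[] = {
    {"fixed", 128, 0},
    {"resizable", 4096, 65536},
};
static DemoShmemHandle demo_handles[] = {
    {&demo_index[0]},
    {&demo_index[1]},
};

/* The only place that searches by name. */
DemoShmemHandle *
demo_get_handle(const char *name)
{
    for (size_t i = 0; i < sizeof(demo_index) / sizeof(demo_index[0]); i++)
    {
        if (strcmp(demo_index[i].name, name) == 0)
            return &demo_handles[i];
    }
    return NULL;
}

/* Goes straight to the entry through the handle. */
int
demo_resize(DemoShmemHandle *handle, size_t new_size)
{
    DemoIndexEnt *entry = handle->entry;

    assert(entry != NULL);
    if (entry->maximum_size == 0 || new_size > entry->maximum_size)
        return -1;
    entry->size = new_size;
    return 0;
}

size_t
demo_current_size(const DemoShmemHandle *handle)
{
    return handle->entry->size;
}
```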
>
> On 02/04/2026 09:58, Ashutosh Bapat wrote:
> >>
> >> I renamed it to AttachOrInitShmemIndexEntry, and the args to 'may_init'
> >> and 'may_attach'. But more importantly I added comments to explain the
> >> different usages. Hope that helps..
> >
> > The explanation in the prologue looks good. But the function is still
> > confusing. Instead of an if ... else if ... chain, I feel organizing this
> > as below would make it more readable. (this was part of one of my
> > earlier edit patches).
> > if (found)
> > ...
> > else
> > {
> > if (!may_init)
> > error
> > if (!index_entry)
> > error
> >
> > ... rest of the code to initialize and attach
> > }
> >
> > But other than that I don't have any other brilliant ideas.
>
> I did another refactoring in this area: I split
> AttachOrInitShmemIndexEntry() into separate AttachShmemIndexEntry() and
> InitShmemIndexEntry functions again. There's a little bit of repetition
> that way, but IMO it makes it much clearer overall.
>
Yes.
I will post my resizable shmem structures patch in a separate email in
this thread but continue to review your patches.
--
Best Wishes,
Ashutosh Bapat
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 05:58 Ashutosh Bapat <[email protected]>
parent: Ashutosh Bapat <[email protected]>
1 sibling, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-05 05:58 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sun, Apr 5, 2026 at 11:18 AM Ashutosh Bapat
<[email protected]> wrote:
>
> I will post my resizable shmem structures patch in a separate email in
> this thread but continue to review your patches.
>
Attached is your patchset (0001 - 0014) + resizable shared memory
structures patchset 0015.
Resizable shared memory structures
============================
When allocating memory to the requested shared structures, we allocate
space for each structure. In mmap'ed shared memory, the memory is
allocated against those structures only when those structures are
initialized.
Resizable shared memory structures are simply allocated their maximum
space when that happens. The function which initializes the structure is
expected to initialize only as much memory as its initial size. When
resizing the structure, memory is freed or allocated against the
reserved space depending upon the new size. This allows the structures
to be resized while keeping their starting address stable, which is a
hard requirement in PostgreSQL.
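The reserve-then-resize scheme can be sketched outside PostgreSQL as below. This is only an illustration under stated assumptions: a Linux shared anonymous mapping, offsets that are multiples of the page size, and hypothetical demo_* helper names. Linux spells the populate constant MADV_POPULATE_WRITE (available since 5.14), so both madvise() calls are guarded and report failure where the constant is missing.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve address space for the maximum size; NULL on failure. */
static void *
demo_reserve(size_t max_size)
{
    void *p = mmap(NULL, max_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Give back the memory behind [keep, max_size); the address stays valid. */
static int
demo_shrink(void *base, size_t keep, size_t max_size)
{
    assert(keep <= max_size);
    if (keep == max_size)
        return 0;
#ifdef MADV_REMOVE
    return madvise((char *) base + keep, max_size - keep, MADV_REMOVE);
#else
    return -1;                  /* not supported on this platform */
#endif
}

/* Back [old_size, new_size) with memory up front instead of on first touch. */
static int
demo_grow(void *base, size_t old_size, size_t new_size)
{
    assert(old_size <= new_size);
    if (old_size == new_size)
        return 0;
#ifdef MADV_POPULATE_WRITE
    return madvise((char *) base + old_size, new_size - old_size,
                   MADV_POPULATE_WRITE);
#else
    return -1;                  /* not supported on this platform */
#endif
}
```

Because the mapping itself is never moved or unmapped, pointers into the retained part stay valid across both operations.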
The resizable shared memory feature depends upon the existence of the
function madvise() and the constants MADV_REMOVE and MADV_POPULATE_WRITE.
On the platforms which do not have these, we disable this feature at
compile time. The commit introduces a compile time flag
HAVE_RESIZABLE_SHMEM which is defined if MADV_REMOVE and
MADV_POPULATE_WRITE exist. We don't check the existence of madvise
separately, since the existence of the constants implies the existence
of the function.
HAVE_RESIZABLE_SHMEM is not defined in EXEC_BACKEND builds since
that's largely used for Windows where the APIs to free and allocate
memory from and to a given address space are not known to the author
right now. Given that PostgreSQL is used widely on Linux, providing
this feature on Linux benefits most of its users. Once we
figure out the required Windows APIs, we will support this feature on
Windows as well.
The feature is also not available when Sys-V shared memory is used
even on Linux since we do not know whether required Sys-V APIs exist;
mostly they don't. Since that combination is only available for
development and testing, not supporting the feature isn't going to
impact PostgreSQL users much.
Using HAVE_RESIZABLE_SHMEM we disable compiling the code related to
resizable shared memory structures on the platforms which do not
support the feature. But we also have run time checks to disable this
feature when Sys-V shared memory is used. In order to know whether a
given instance of a running server supports resizable structures, we
have introduced GUC have_resizable_shmem.
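The two levels of checking described above can be sketched as follows. This is not the patch's code: the DEMO_* macro, the enum and the function are hypothetical, and the sketch assumes the Linux spelling MADV_POPULATE_WRITE for the populate constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>

/* Compile-time gate: both madvise() constants present, not EXEC_BACKEND. */
#if defined(MADV_REMOVE) && defined(MADV_POPULATE_WRITE) && !defined(EXEC_BACKEND)
#define DEMO_RESIZABLE_COMPILED_IN 1
#else
#define DEMO_RESIZABLE_COMPILED_IN 0
#endif

/* Hypothetical stand-in for the server's shared memory type setting. */
typedef enum DemoShmemType
{
    DEMO_SHMEM_MMAP,
    DEMO_SHMEM_SYSV
} DemoShmemType;

/* Run-time gate: even when compiled in, Sys-V shared memory disables it. */
static bool
demo_have_resizable_shmem(DemoShmemType type)
{
    assert(type == DEMO_SHMEM_MMAP || type == DEMO_SHMEM_SYSV);
    return DEMO_RESIZABLE_COMPILED_IN && type == DEMO_SHMEM_MMAP;
}
```

A read-only setting such as have_resizable_shmem would then report the result of the run-time check for the running server.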
Following points are up for discussion
=============================
1. calculation of allocated_size of resizable structures
--------------------------------
For fixed sized shared memory structures, allocated_size is the size
of the aligned structure. Assuming that the whole structure is
initialized, it is also the memory allocated to the structure. Thus
summing all allocated_size's of the allocations gives a
nearly-accurate (considering page sized allocations) idea of the total
shared memory allocated. For a resizable structure, it's a bit more
complicated. We allocate maximum space required by the structure at
the beginning. At a given point in time, the memory page where the
next structure begins and the page which contains the end of the
structure at that point in time are allocated. The pages in-between
are not allocated. The memory allocated to that structure is the
{maximum size of the structure} - {total size of unallocated pages}. I
think setting allocated_size to the actually allocated memory is more
accurate than {current size of the structure} + {alignment} which does
not reflect the actual memory allocated to the structure. I would like
to know what others think.
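To make the arithmetic in point 1 concrete, here is a small sketch. It assumes offsets are measured from a page-aligned segment base and that only whole pages lying strictly between the current end of the structure and the end of its reserved space are unallocated; demo_allocated_size() is a hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <stddef.h>

static size_t
align_down(size_t v, size_t a)
{
    return v - (v % a);
}

static size_t
align_up(size_t v, size_t a)
{
    return align_down(v + a - 1, a);
}

/*
 * Memory actually backing a resizable structure:
 * {maximum size} - {total size of unallocated whole pages}.
 */
static size_t
demo_allocated_size(size_t start, size_t cur_size, size_t max_size,
                    size_t page_size)
{
    size_t first_free;
    size_t free_end;
    size_t unallocated;

    assert(page_size > 0 && cur_size <= max_size);

    /* The page holding the current end of the structure stays allocated. */
    first_free = align_up(start + cur_size, page_size);
    /* So does the page where the next structure begins. */
    free_end = align_down(start + max_size, page_size);

    unallocated = free_end > first_free ? free_end - first_free : 0;
    return max_size - unallocated;
}
```

For example, with 4096-byte pages, a structure at offset 1000 with current size 5000 and maximum size 40000 has 32768 bytes of unallocated pages, so this accounting yields 7232 bytes, compared with roughly 5000 under the {current size} + {alignment} alternative.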
2. maximum_size member in various structures and in pg_shmem_allocations view
-----------------------------------------------------------------------------
A resizable structure is requested by specifying non-zero maximum_size
in ShmemStructOpts. It gets copied to the maximum_size member in
ShmemStructDesc and ShmemIndexEnt. The question is what the value of
maximum_size should be in those structures for fixed-size structures.
Setting it to the same value as the size member in the respective
structure is logical since their maximum size is the same as their
initial size. But if we do so, we need another member in
ShmemStructDesc and ShmemIndexEnt to indicate whether the structure is
resizable or not. Instead the patches set maximum_size to 0 for
fixed-size structures and non-zero for resizable structures. This way
we can check whether a structure is resizable or not by checking
whether its maximum_size is zero or not. pg_shmem_allocations view
also has a maximum_size column which has similar characteristics.
I would like to know what others think.
3. allocated_space member in various structures and in pg_shmem_allocations view
-------------------------------------------------------------------------------
The patch adds a new member allocated_space to ShmemIndexEnt and
pg_shmem_allocations view. allocated_space to maximum_size is what
allocated_size is to size - it's the type aligned value of
maximum_size. But it also highlights the difference between the
address space allocation and the actual memory allocation. This
difference is crucial to resizable structures. However, unlike
maximum_size, we set it to a non-zero value, allocated_size, for
fixed-size structures as well since they are allocated the same amount
of space as their allocated_size. While this seems logically correct
to me, some may find maximum_size to be zero but allocated_space to be
non-zero for fixed-size structures a bit weird. I would like to know
what others think.
As a minor point, setting allocated_space to allocated_size makes the
calculations in pg_shmem_allocations() a bit easier. However, that can
be fixed trivially.
As a side question, do we want to allow users to specify minimum_size
in ShmemStructOpts for resizable structures? Resizing memory lower
than that would be prohibited. For fixed sized structures,
minimum_size would be the same as size and also maximum_size. For now, it
seems to serve only as a sanity check, but it could be seen as a useful
safety feature. A difference in maximum_size and minimum_size would
indicate that the structure is resizable.
Considering 2 and 3 together, we have the following options
a. As implemented in patch and clarified in documentation.
b. Set maximum_size to size and allocated_space to allocated_size for
fixed-size structures, but add a new member to indicate whether the
structure is resizable or not.
c. Set maximum_size and allocated_space to zero for fixed-size
structures and explicitly mention it in the documentation.
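Option a can be illustrated with a toy entry type. The sketch assumes 8-byte alignment and leaves out the page-level accounting from point 1; DemoIndexEnt, DEMO_ALIGN and the helpers are hypothetical names, not the patch's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ALIGN(x) (((x) + 7) & ~(size_t) 7)

typedef struct DemoIndexEnt
{
    size_t      size;               /* requested (current) size */
    size_t      allocated_size;     /* aligned size */
    size_t      maximum_size;       /* 0 for fixed-size structures */
    size_t      allocated_space;    /* reserved address space, never 0 */
} DemoIndexEnt;

static DemoIndexEnt
demo_make_entry(size_t size, size_t maximum_size)
{
    DemoIndexEnt e;

    assert(maximum_size == 0 || maximum_size >= size);
    e.size = size;
    e.allocated_size = DEMO_ALIGN(size);
    e.maximum_size = maximum_size;
    /* Fixed-size structures reserve exactly what they allocate. */
    e.allocated_space = maximum_size != 0
        ? DEMO_ALIGN(maximum_size)
        : e.allocated_size;
    return e;
}

/* No separate flag: a non-zero maximum_size marks a resizable structure. */
static bool
demo_is_resizable(const DemoIndexEnt *e)
{
    return e->maximum_size != 0;
}
```

Under option b the helper would test a dedicated boolean member instead, and under option c allocated_space would also be zero for fixed-size entries.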
4. to mprotect or not to mprotect
---------------------------------
If memory beyond the current size of a resizable structure is
accessed, it won't cause any segfault or bus error. When writing,
memory will simply be allocated, and when reading, it will return
zeroes if memory is not allocated yet. mprotect'ing the memory beyond
the current size of a resizable structure to PROT_NONE can prevent
accidental access to unallocated memory (sans page boundaries), but it
needs to be done in every backend process which requires a
synchronization mechanism beyond the scope of shmem.c. Hence the patch
does not use mprotect. A subsystem will require some higher level
synchronization mechanism between users of the structure and the
process which resizes it. That synchronization mechanism can be used
to mprotect the memory, if required. I have documented this, but I
would like to know whether we should provide an API in shmem.c to
mprotect.
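If such an API were provided, the per-process part might look roughly like the sketch below. It assumes a page-aligned base address, and demo_guard_start() and demo_protect_tail() are hypothetical names. The call only affects the calling process, which is why every backend would have to repeat it under the subsystem's own synchronization.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* First page boundary at or after the current size. */
static size_t
demo_guard_start(size_t cur_size, size_t page_size)
{
    assert(page_size > 0);
    return (cur_size + page_size - 1) / page_size * page_size;
}

/*
 * Make everything beyond the last page in use inaccessible in this process.
 * The remainder of the page containing the current end stays accessible,
 * hence "sans page boundaries".
 */
static int
demo_protect_tail(void *base, size_t cur_size, size_t max_size)
{
    size_t page_size = (size_t) sysconf(_SC_PAGESIZE);
    size_t guard = demo_guard_start(cur_size, page_size);

    if (guard >= max_size)
        return 0;               /* nothing beyond the last used page */
    return mprotect((char *) base + guard, max_size - guard, PROT_NONE);
}
```

Growing the structure would need the reverse step, restoring PROT_READ | PROT_WRITE on the newly usable pages in each process before anyone touches them.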
5. Tests
-------
The patch adds a new test module resizable_shmem which tests the
resizable shared memory feature. Also it adds a test case to the
test_shmem module to make sure that the fixed-size shared memory
structures can not be resized. I think the resizable_shmem module
should be merged into test_shmem. But I have kept these two separate
for ease of review. Please let me know if you also think they should
be merged.
I have self-reviewed the tests a few times, fixing issues and
adjusting the test and module code. But they could use some more
review. However, I wanted to get the patch out for review, given the
looming deadline. Similarly for the commit message.
I am adding this to CF so that it gets some CI coverage especially on
the platforms which do not support resizable shared memory.
--
Best Wishes,
Ashutosh Bapat
Attachments:
[text/x-patch] v20260405-0001-refactor-Move-ShmemInitHash-to-separate-fi.patch (11.1K, 2-v20260405-0001-refactor-Move-ShmemInitHash-to-separate-fi.patch)
From 6783e74145e4b88dca86f4bd432d618b3d389bd9 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 13:07:28 +0200
Subject: [PATCH v20260405 01/15] refactor: Move ShmemInitHash to separate file
In preparation for next commits
---
src/backend/storage/ipc/Makefile | 1 +
src/backend/storage/ipc/meson.build | 1 +
src/backend/storage/ipc/shmem.c | 108 ----------------------
src/backend/storage/ipc/shmem_hash.c | 130 +++++++++++++++++++++++++++
src/include/storage/shmem.h | 9 +-
5 files changed, 139 insertions(+), 110 deletions(-)
create mode 100644 src/backend/storage/ipc/shmem_hash.c
diff --git a/src/backend/storage/ipc/Makefile b/src/backend/storage/ipc/Makefile
index 9a07f6e1d92..f71653bbe48 100644
--- a/src/backend/storage/ipc/Makefile
+++ b/src/backend/storage/ipc/Makefile
@@ -22,6 +22,7 @@ OBJS = \
shm_mq.o \
shm_toc.o \
shmem.o \
+ shmem_hash.o \
signalfuncs.o \
sinval.o \
sinvaladt.o \
diff --git a/src/backend/storage/ipc/meson.build b/src/backend/storage/ipc/meson.build
index 9c1ca954d9d..b8c31e29967 100644
--- a/src/backend/storage/ipc/meson.build
+++ b/src/backend/storage/ipc/meson.build
@@ -14,6 +14,7 @@ backend_sources += files(
'shm_mq.c',
'shm_toc.c',
'shmem.c',
+ 'shmem_hash.c',
'signalfuncs.c',
'sinval.c',
'sinvaladt.c',
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 3cb51ad62f8..c994f7674ec 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -96,9 +96,6 @@ typedef struct ShmemAllocatorData
#define ShmemIndexLock (&ShmemAllocator->index_lock)
-static HTAB *shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
-static void *ShmemHashAlloc(Size size, void *alloc_arg);
static void *ShmemAllocRaw(Size size, Size *allocated_size);
/* shared memory global variables */
@@ -257,29 +254,6 @@ ShmemAllocNoError(Size size)
return ShmemAllocRaw(size, &allocated_size);
}
-/*
- * ShmemHashAlloc -- alloc callback for shared memory hash tables
- *
- * Carve out the allocation from a pre-allocated region. All shared memory
- * hash tables are initialized with HASH_FIXED_SIZE, so all the allocations
- * happen upfront during initialization and no locking is required.
- */
-static void *
-ShmemHashAlloc(Size size, void *alloc_arg)
-{
- shmem_hash_allocator *allocator = (shmem_hash_allocator *) alloc_arg;
- void *result;
-
- size = MAXALIGN(size);
-
- if (allocator->end - allocator->next < size)
- return NULL;
- result = allocator->next;
- allocator->next += size;
-
- return result;
-}
-
/*
* ShmemAllocRaw -- allocate align chunk and return allocated size
*
@@ -341,88 +315,6 @@ ShmemAddrIsValid(const void *addr)
return (addr >= ShmemBase) && (addr < ShmemEnd);
}
-/*
- * ShmemInitHash -- Create and initialize, or attach to, a
- * shared memory hash table.
- *
- * We assume caller is doing some kind of synchronization
- * so that two processes don't try to create/initialize the same
- * table at once. (In practice, all creations are done in the postmaster
- * process; child processes should always be attaching to existing tables.)
- *
- * nelems is the maximum number of hashtable entries.
- *
- * *infoP and hash_flags must specify at least the entry sizes and key
- * comparison semantics (see hash_create()). Flag bits and values specific
- * to shared-memory hash tables are added here, except that callers may
- * choose to specify HASH_PARTITION.
- *
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
- */
-HTAB *
-ShmemInitHash(const char *name, /* table string name for shmem index */
- int64 nelems, /* size of the table */
- HASHCTL *infoP, /* info about key and bucket size */
- int hash_flags) /* info about infoP */
-{
- bool found;
- size_t size;
- void *location;
-
- size = hash_estimate_size(nelems, infoP->entrysize);
-
- /* look it up in the shmem index or allocate */
- location = ShmemInitStruct(name, size, &found);
-
- return shmem_hash_create(location, size, found,
- name, nelems, infoP, hash_flags);
-}
-
-/*
- * Initialize or attach to a shared hash table in the given shmem region.
- *
- * This is extracted from ShmemInitHash() to allow InitShmemAllocator() to
- * share the logic for bootstrapping the ShmemIndex hash table.
- */
-static HTAB *
-shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
-{
- shmem_hash_allocator allocator;
-
- /*
- * Hash tables allocated in shared memory have a fixed directory and have
- * all elements allocated upfront. We don't support growing because we'd
- * need to grow the underlying shmem region with it.
- *
- * The shared memory allocator must be specified too.
- */
- infoP->alloc = ShmemHashAlloc;
- infoP->alloc_arg = NULL;
- hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_FIXED_SIZE;
-
- /*
- * if it already exists, attach to it rather than allocate and initialize
- * new space
- */
- if (!found)
- {
- allocator.next = (char *) location;
- allocator.end = (char *) location + size;
- infoP->alloc_arg = &allocator;
- }
- else
- {
- /* Pass location of hashtable header to hash_create */
- infoP->hctl = (HASHHDR *) location;
- hash_flags |= HASH_ATTACH;
- }
-
- return hash_create(name, nelems, infoP, hash_flags);
-}
-
/*
* ShmemInitStruct -- Create/attach to a structure in shared memory.
*
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
new file mode 100644
index 00000000000..0b05730129e
--- /dev/null
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -0,0 +1,130 @@
+/*-------------------------------------------------------------------------
+ *
+ * shmem_hash.c
+ * hash table implementation in shared memory
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * A shared memory hash table implementation on top of the named, fixed-size
+ * shared memory areas managed by shmem.c. Hash tables have a fixed maximum
+ * size, but their actual size can vary dynamically. When entries are added
+ * to the table, more space is allocated. Each shared data structure and hash
+ * has a string name to identify it.
+ *
+ * IDENTIFICATION
+ * src/backend/storage/ipc/shmem_hash.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "storage/shmem.h"
+
+static void *ShmemHashAlloc(Size size, void *alloc_arg);
+
+/*
+ * ShmemInitHash -- Create and initialize, or attach to, a
+ * shared memory hash table.
+ *
+ * We assume caller is doing some kind of synchronization
+ * so that two processes don't try to create/initialize the same
+ * table at once. (In practice, all creations are done in the postmaster
+ * process; child processes should always be attaching to existing tables.)
+ *
+ * nelems is the maximum number of hashtable entries.
+ *
+ * *infoP and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values specific
+ * to shared-memory hash tables are added here, except that callers may
+ * choose to specify HASH_PARTITION.
+ *
+ * Note: before Postgres 9.0, this function returned NULL for some failure
+ * cases. Now, it always throws error instead, so callers need not check
+ * for NULL.
+ */
+HTAB *
+ShmemInitHash(const char *name, /* table string name for shmem index */
+ int64 nelems, /* size of the table */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ bool found;
+ size_t size;
+ void *location;
+
+ size = hash_estimate_size(nelems, infoP->entrysize);
+
+ /* look it up in the shmem index or allocate */
+ location = ShmemInitStruct(name, size, &found);
+
+ return shmem_hash_create(location, size, found,
+ name, nelems, infoP, hash_flags);
+}
+
+/*
+ * Initialize or attach to a shared hash table in the given shmem region.
+ *
+ * This is extracted from ShmemInitHash() to allow InitShmemAllocator() to
+ * share the logic for bootstrapping the ShmemIndex hash table.
+ */
+HTAB *
+shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
+{
+ shmem_hash_allocator allocator;
+
+ /*
+ * Hash tables allocated in shared memory have a fixed directory and have
+ * all elements allocated upfront. We don't support growing because we'd
+ * need to grow the underlying shmem region with it.
+ *
+ * The shared memory allocator must be specified too.
+ */
+ infoP->alloc = ShmemHashAlloc;
+ infoP->alloc_arg = NULL;
+ hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_FIXED_SIZE;
+
+ /*
+ * if it already exists, attach to it rather than allocate and initialize
+ * new space
+ */
+ if (!found)
+ {
+ allocator.next = (char *) location;
+ allocator.end = (char *) location + size;
+ infoP->alloc_arg = &allocator;
+ }
+ else
+ {
+ /* Pass location of hashtable header to hash_create */
+ infoP->hctl = (HASHHDR *) location;
+ hash_flags |= HASH_ATTACH;
+ }
+
+ return hash_create(name, nelems, infoP, hash_flags);
+}
+
+/*
+ * ShmemHashAlloc -- alloc callback for shared memory hash tables
+ *
+ * Carve out the allocation from a pre-allocated region. All shared memory
+ * hash tables are initialized with HASH_FIXED_SIZE, so all the allocations
+ * happen upfront during initialization and no locking is required.
+ */
+static void *
+ShmemHashAlloc(Size size, void *alloc_arg)
+{
+ shmem_hash_allocator *allocator = (shmem_hash_allocator *) alloc_arg;
+ void *result;
+
+ size = MAXALIGN(size);
+
+ if (allocator->end - allocator->next < size)
+ return NULL;
+ result = allocator->next;
+ allocator->next += size;
+
+ return result;
+}
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index a2eb499d63c..82f5403c952 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -30,15 +30,20 @@ typedef struct PGShmemHeader PGShmemHeader; /* avoid including
extern void InitShmemAllocator(PGShmemHeader *seghdr);
extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
+extern void *ShmemHashAlloc(Size size, void *alloc_arg);
extern bool ShmemAddrIsValid(const void *addr);
-extern HTAB *ShmemInitHash(const char *name, int64 nelems,
- HASHCTL *infoP, int hash_flags);
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
+/* shmem_hash.c */
+extern HTAB *ShmemInitHash(const char *name, int64 nelems,
+ HASHCTL *infoP, int hash_flags);
+extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
+
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
base-commit: f63ca3379025ee4547865182da6cae14aec35d58
--
2.34.1
[text/x-patch] v20260405-0002-Introduce-a-new-mechanism-for-registering-.patch (62.3K, 3-v20260405-0002-Introduce-a-new-mechanism-for-registering-.patch)
From d2d065d866f7275b8e6706757240e39a0e34d2c0 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 20:01:39 +0300
Subject: [PATCH v20260405 02/15] Introduce a new mechanism for registering
shared memory areas
This merges the separate [Subsystem]ShmemSize() and
[Subsystem]ShmemInit() phases at postmaster startup. Each subsystem is
now called into just once, before the shared memory segment has been
allocated, to register or "request" the subsystem's shared memory
needs. This is more ergonomic, as you only need to calculate the size
once.
This replaces ShmemInitStruct() and ShmemInitHash(), which become just
backwards-compatibility wrappers around the new functions. In future
commits, I plan to replace all ShmemInitStruct() and ShmemInitHash()
calls with the new functions, although we'll still need to keep them
around for extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 4 +-
doc/src/sgml/xfunc.sgml | 162 +++--
src/backend/bootstrap/bootstrap.c | 2 +
src/backend/postmaster/launch_backend.c | 5 +
src/backend/postmaster/postmaster.c | 19 +-
src/backend/storage/ipc/ipci.c | 30 +-
src/backend/storage/ipc/shmem.c | 832 ++++++++++++++++++++----
src/backend/storage/ipc/shmem_hash.c | 86 ++-
src/backend/storage/lmgr/proc.c | 3 +
src/backend/tcop/postgres.c | 10 +-
src/include/storage/shmem.h | 183 +++++-
src/include/storage/shmem_internal.h | 52 ++
src/tools/pgindent/typedefs.list | 9 +-
13 files changed, 1190 insertions(+), 207 deletions(-)
create mode 100644 src/include/storage/shmem_internal.h
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9ee1a2bfc6a..2ebec6928d5 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
Anonymous allocations are allocations that have been made
with <literal>ShmemAlloc()</literal> directly, rather than via
- <literal>ShmemInitStruct()</literal> or
- <literal>ShmemInitHash()</literal>.
+ <literal>ShmemRequestStruct()</literal> or
+ <literal>ShmemRequestHash()</literal>.
</para>
<para>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 70e815b8a2c..aed3f2f0071 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3628,71 +3628,132 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
Add-ins can reserve shared memory on server startup. To do so, the
add-in's shared library must be preloaded by specifying it in
<xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
- The shared library should also register a
- <literal>shmem_request_hook</literal> in its
- <function>_PG_init</function> function. This
- <literal>shmem_request_hook</literal> can reserve shared memory by
- calling:
+ The shared library should register callbacks in
+ its <function>_PG_init</function> function, which then get called at the
+ right stages of the system startup to initialize the shared memory.
+ Here is an example:
<programlisting>
-void RequestAddinShmemSpace(Size size)
-</programlisting>
- Each backend should obtain a pointer to the reserved shared memory by
- calling:
-<programlisting>
-void *ShmemInitStruct(const char *name, Size size, bool *foundPtr)
-</programlisting>
- If this function sets <literal>foundPtr</literal> to
- <literal>false</literal>, the caller should proceed to initialize the
- contents of the reserved shared memory. If <literal>foundPtr</literal>
- is set to <literal>true</literal>, the shared memory was already
- initialized by another backend, and the caller need not initialize
- further.
- </para>
+typedef struct MyShmemData {
+ LWLock lock; /* protects the fields below */
- <para>
- To avoid race conditions, each backend should use the LWLock
- <function>AddinShmemInitLock</function> when initializing its allocation
- of shared memory, as shown here:
-<programlisting>
-static mystruct *ptr = NULL;
-bool found;
+ ... shared memory contents ...
+} MyShmemData;
+
+static MyShmemData *MyShmem; /* pointer to the struct in shared memory */
+
+static void my_shmem_request(void *arg);
+static void my_shmem_init(void *arg);
+
+const ShmemCallbacks my_shmem_callbacks = {
+ .request_fn = my_shmem_request,
+ .init_fn = my_shmem_init,
+};
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+ /*
+ * In order to create our shared memory area, we have to be loaded via
+ * shared_preload_libraries.
+ */
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ /* Register our shared memory needs */
+ RegisterShmemCallbacks(&my_shmem_callbacks);
+}
+
+/* callback to request */
+static void
+my_shmem_request(void *arg)
+{
+ /* A persistent handle to the shared memory area in this backend */
+ static ShmemStructDesc MyShmemDesc;
+
+ ShmemRequestStruct(&MyShmemDesc,
+ .name = "My shmem area",
+ .size = sizeof(MyShmemData),
+ .ptr = (void **) &MyShmem,
+ );
+}
-LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-ptr = ShmemInitStruct("my struct name", size, &found);
-if (!found)
+/* callback to initialize the contents of the MyShmem area at startup */
+static void
+my_shmem_init(void *arg)
{
- ... initialize contents of shared memory ...
- ptr->locks = GetNamedLWLockTranche("my tranche name");
+ int tranche_id;
+
+ /* Initialize the lock */
+ tranche_id = LWLockNewTrancheId("my tranche name");
+ LWLockInitialize(&MyShmem->lock, tranche_id);
+
+ ... initialize the rest of MyShmem fields ...
}
-LWLockRelease(AddinShmemInitLock);
+
</programlisting>
- <literal>shmem_startup_hook</literal> provides a convenient place for the
- initialization code, but it is not strictly required that all such code
- be placed in this hook. On Windows (and anywhere else where
- <literal>EXEC_BACKEND</literal> is defined), each backend executes the
- registered <literal>shmem_startup_hook</literal> shortly after it
- attaches to shared memory, so add-ins should still acquire
- <function>AddinShmemInitLock</function> within this hook, as shown in the
- example above. On other platforms, only the postmaster process executes
- the <literal>shmem_startup_hook</literal>, and each backend automatically
- inherits the pointers to shared memory.
+ The <function>request_fn</function> callback is called during system
+ startup, before the shared memory has been allocated. It should call
+ <function>ShmemRequestStruct()</function> to register the add-in's
+ shared memory needs. Note that <function>ShmemRequestStruct()</function>
+ doesn't immediately allocate or initialize the memory; it merely
+ registers the space to be allocated later in the startup sequence. When
+ the memory is allocated, it is initialized to zero. For any more
+ complex initialization, set the <function>init_fn()</function> callback,
+ which will be called after the memory has been allocated and initialized
+ to zero, but before any other processes are running, and thus no locking
+ is required.
</para>
-
<para>
- An example of a <literal>shmem_request_hook</literal> and
- <literal>shmem_startup_hook</literal> can be found in
+ On Windows, the <function>attach_fn</function> callback, if any, is
+ additionally called at every backend startup. It can be used to
+ initialize additional per-backend state related to the shared memory
+ area that is inherited via <function>fork()</function> on other systems.
+ </para>
+ <para>
+ An example of allocating shared memory can be found in
<filename>contrib/pg_stat_statements/pg_stat_statements.c</filename> in
the <productname>PostgreSQL</productname> source tree.
</para>
</sect3>
<sect3 id="xfunc-shared-addin-after-startup">
- <title>Requesting Shared Memory After Startup</title>
+ <title>Requesting Shared Memory After Startup with <function>ShmemRequestStruct</function></title>
+
+ <para>
+ <function>ShmemRequestStruct()</function> can also be called after
+ system startup, which is useful to allow small allocations in add-in
+ libraries that are not specified in
+ <xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
+ However, after startup the allocation can fail if there is not enough
+ shared memory available. The system reserves some memory for allocations
+ after startup, but that reservation is small.
+ </para>
+ <para>
+ By default, <function>RegisterShmemCallbacks()</function> fails with an
+ error if called after system startup. To use it after startup, you must
+ set the <literal>SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP</literal> flag in
+ the argument <structname>ShmemCallbacks</structname> struct to
+ acknowledge the risk.
+ </para>
+ <para>
+ When <function>RegisterShmemCallbacks()</function> is called after
+ startup, it will immediately call the appropriate callbacks, depending
+ on whether the requested memory areas were already initialized by
+ another backend. The callbacks will be called while holding an internal
+ lock, which prevents two backends from initializing the memory area
+ concurrently.
+ </para>
+ </sect3>
+
+ <sect3 id="xfunc-shared-addin-dynamic">
+ <title>Allocating Dynamic Shared Memory After Startup</title>
<para>
There is another, more flexible method of reserving shared memory that
- can be done after server startup and outside a
- <literal>shmem_request_hook</literal>. To do so, each backend that will
+ can be done after server startup. To do so, each backend that will
use the shared memory should obtain a pointer to it by calling:
<programlisting>
void *GetNamedDSMSegment(const char *name, size_t size,
@@ -3711,10 +3772,7 @@ void *GetNamedDSMSegment(const char *name, size_t size,
</para>
<para>
- Unlike shared memory reserved at server startup, there is no need to
- acquire <function>AddinShmemInitLock</function> or otherwise take action
- to avoid race conditions when reserving shared memory with
- <function>GetNamedDSMSegment</function>. This function ensures that only
+ <function>GetNamedDSMSegment</function> ensures that only
one backend allocates and initializes the segment and that all other
backends receive a pointer to the fully allocated and initialized
segment.
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index c52c0a6023d..c707ccfa563 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -39,6 +39,7 @@
#include "storage/fd.h"
#include "storage/ipc.h"
#include "storage/proc.h"
+#include "storage/shmem_internal.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/guc.h"
@@ -373,6 +374,7 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
InitializeFastPathLocks();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 434e0643022..0973010b7dc 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -49,7 +49,9 @@
#include "replication/walreceiver.h"
#include "storage/dsm.h"
#include "storage/io_worker.h"
+#include "storage/ipc.h"
#include "storage/pg_shmem.h"
+#include "storage/shmem_internal.h"
#include "tcop/backend_startup.h"
#include "utils/memutils.h"
@@ -672,7 +674,10 @@ SubPostmasterMain(int argc, char *argv[])
/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ ShmemCallRequestCallbacks();
+ }
/*
* Run the appropriate Main function
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index eb4f3eb72d4..693475014fe 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -115,6 +115,7 @@
#include "storage/ipc.h"
#include "storage/pmsignal.h"
#include "storage/proc.h"
+#include "storage/shmem_internal.h"
#include "tcop/backend_startup.h"
#include "tcop/tcopprot.h"
#include "utils/datetime.h"
@@ -951,7 +952,14 @@ PostmasterMain(int argc, char *argv[])
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Ask all subsystems, including preloaded libraries, to register their
+ * shared memory needs.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might've been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
@@ -3232,7 +3240,14 @@ PostmasterStateMachine(void)
/* re-read control file into local memory */
LocalProcessControlFile(true);
- /* re-create shared memory and semaphores */
+ /*
+ * Re-initialize shared memory and semaphores. Note: We don't call
+ * RegisterBuiltinShmemCallbacks(), we keep the old registrations. In
+ * order to re-register structs in extensions, we'd need to reload
+ * shared preload libraries, and we don't want to do that.
+ */
+ ResetShmemAllocator();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
UpdatePMState(PM_STARTUP);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 7aab5da3386..24422a80ab3 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -50,6 +50,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/shmem_internal.h"
#include "storage/sinvaladt.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
@@ -100,8 +101,9 @@ CalculateShmemSize(void)
* during the actual allocation phase.
*/
size = 100000;
- size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
- sizeof(ShmemIndexEnt)));
+ size = add_size(size, ShmemGetRequestedSize());
+
+ /* legacy subsystems */
size = add_size(size, dsm_estimate_size());
size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
@@ -176,6 +178,13 @@ AttachSharedMemoryStructs(void)
*/
InitializeFastPathLocks();
+ /*
+ * Attach to LWLocks first. They are needed by most other subsystems.
+ */
+ LWLockShmemInit();
+
+ /* Establish pointers to all shared memory areas in this backend */
+ ShmemAttachRequested();
CreateOrAttachShmemStructs();
/*
@@ -220,7 +229,17 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
- /* Initialize subsystems */
+ /*
+ * Initialize LWLocks first, in case any of the shmem init functions use
+ * LWLocks. (Nothing else can be running during startup, so they don't
+ * need to do any locking yet, but we nevertheless allow it.)
+ */
+ LWLockShmemInit();
+
+ /* Initialize all shmem areas */
+ ShmemInitRequested();
+
+ /* Initialize legacy subsystems */
CreateOrAttachShmemStructs();
/* Initialize dynamic shared memory facilities. */
@@ -251,11 +270,6 @@ CreateSharedMemoryAndSemaphores(void)
static void
CreateOrAttachShmemStructs(void)
{
- /*
- * Set up LWLocks. They are needed by most other subsystems.
- */
- LWLockShmemInit();
-
dsm_shmem_init();
DSMRegistryShmemInit();
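For readers following the size accounting: CalculateShmemSize() now adds ShmemGetRequestedSize() in place of the fixed SHMEM_INDEX_SIZE estimate. The arithmetic can be sketched standalone, as below. This is only an illustration under stated assumptions, not the patch's code: cacheline_align() and requested_size_sketch() are hypothetical names, the 128-byte cache line is an assumed value, and the overflow checking that add_size() provides is omitted.

```c
#include <stddef.h>

/* Assumed cache line size for this sketch (not taken from the patch). */
#define SKETCH_CACHE_LINE ((size_t) 128)

/* Hypothetical helper: round x up to the next cache line boundary. */
size_t
cacheline_align(size_t x)
{
    return (x + SKETCH_CACHE_LINE - 1) & ~(SKETCH_CACHE_LINE - 1);
}

/*
 * Hypothetical stand-in for the accumulation loop: start from the aligned
 * size of the index, then add each requested area and re-align, so that
 * every area's padding is counted. No overflow checks, unlike add_size().
 */
size_t
requested_size_sketch(size_t index_size, const size_t *sizes, int nsizes)
{
    size_t total = cacheline_align(index_size);

    for (int i = 0; i < nsizes; i++)
        total = cacheline_align(total + sizes[i]);

    return total;
}
```

Under these assumptions, a 100-byte index plus areas of 1 and 200 bytes comes to 512 bytes: each area is padded out to the next cache line before the following one is added.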
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index c994f7674ec..29ff6065dda 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,43 +19,115 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * NOTES:
- * (a) There are three kinds of shared memory data structures
- * available to POSTGRES: fixed-size structures, queues and hash
- * tables. Fixed-size structures contain things like global variables
- * for a module and should never be allocated after the shared memory
- * initialization phase. Hash tables have a fixed maximum size and
- * cannot grow beyond that. Queues link data structures
- * that have been allocated either within fixed-size structures or as hash
- * buckets. Each shared data structure has a string name to identify
- * it (assigned in the module that declares it).
- *
- * (b) During initialization, each module looks for its
- * shared data structures in a hash table called the "Shmem Index".
- * If the data structure is not present, the caller can allocate
- * a new one and initialize it. If the data structure is present,
- * the caller "attaches" to the structure by initializing a pointer
- * in the local address space.
- * The shmem index has two purposes: first, it gives us
- * a simple model of how the world looks when a backend process
- * initializes. If something is present in the shmem index,
- * it is initialized. If it is not, it is uninitialized. Second,
- * the shmem index allows us to allocate shared memory on demand
- * instead of trying to preallocate structures and hard-wire the
- * sizes and locations in header files. If you are using a lot
- * of shared memory in a lot of different places (and changing
- * things during development), this is important.
- *
- * (c) In standard Unix-ish environments, individual backends do not
- * need to re-establish their local pointers into shared memory, because
- * they inherit correct values of those variables via fork() from the
- * postmaster. However, this does not work in the EXEC_BACKEND case.
- * In ports using EXEC_BACKEND, new backends have to set up their local
- * pointers using the method described in (b) above.
- *
- * (d) memory allocation model: shared memory can never be
- * freed, once allocated. Each hash table has its own free list,
- * so hash buckets can be reused when an item is deleted.
+ * This module provides facilities to allocate fixed-size structures in shared
+ * memory, for things like variables shared between all backend processes.
+ * Each such structure has a string name to identify it, specified when it is
+ * requested. shmem_hash.c provides a shared hash table implementation on top
+ * of that.
+ *
+ * Shared memory areas should usually not be allocated after postmaster
+ * startup, although we do allow small allocations later for the benefit of
+ * extension modules that are loaded after startup. Despite that allowance,
+ * extensions that need shared memory should be added in
+ * shared_preload_libraries, because the allowance is quite small and there is
+ * no guarantee that any memory is available after startup.
+ *
+ * Nowadays, there is also a third way to allocate shared memory called
+ * Dynamic Shared Memory. See dsm.c for that facility. One big difference
+ * between traditional shared memory handled by shmem.c and dynamic shared
+ * memory is that traditional shared memory areas are mapped to the same
+ * address in all processes, so you can use normal pointers in shared memory
+ * structs. With Dynamic Shared Memory, you must use offsets or DSA pointers
+ * instead.
+ *
+ * Shared memory managed by shmem.c can never be freed, once allocated. Each
+ * hash table has its own free list, so hash buckets can be reused when an
+ * item is deleted. However, if one hash table grows very large and then
+ * shrinks, its space cannot be redistributed to other tables. We could build
+ * a simple hash bucket garbage collector if need be. Right now, it seems
+ * unnecessary.
+ *
+ * Usage
+ * -----
+ *
+ * To allocate shared memory, you need to register a set of callback functions
+ * which handle the lifecycle of the allocation. In the request_fn callback,
+ * fill in a ShmemRequestStructOpts struct with the name, size, and any other
+ * options, and call ShmemRequestStruct(). Leave any unused fields as zeros.
+ *
+ * typedef struct MyShmemData {
+ * ...
+ * } MyShmemData;
+ *
+ * static MyShmemData *MyShmem;
+ *
+ * static void my_shmem_request(void *arg);
+ * static void my_shmem_init(void *arg);
+ *
+ * const ShmemCallbacks MyShmemCallbacks = {
+ * .request_fn = my_shmem_request,
+ * .init_fn = my_shmem_init,
+ * };
+ *
+ * static void
+ * my_shmem_request(void *arg)
+ * {
+ * static ShmemStructDesc MyShmemDesc;
+ *
+ * ShmemRequestStruct(&MyShmemDesc, &(ShmemRequestStructOpts) {
+ * .name = "My shmem area",
+ * .size = sizeof(MyShmemData),
+ * .ptr = (void **) &MyShmem,
+ * });
+ * }
+ *
+ * In builtin PostgreSQL code, add the callbacks to the list in
+ * src/include/storage/subsystemlist.h. In an add-in module, you can register
+ * the callbacks by calling RegisterShmemCallbacks(&MyShmemCallbacks) in the
+ * extension's _PG_init() function.
+ *
+ * Lifecycle
+ * ---------
+ *
+ * Initializing shared memory happens in multiple phases. In the first phase,
+ * during postmaster startup, all the request_fn callbacks are called. Only
+ * after all the request_fn callbacks have been called and all the shmem areas
+ * have been requested by the ShmemRequestStruct() calls do we know how much
+ * shared memory we need in total. After that, the postmaster allocates the
+ * global shared memory segment, and calls all the init_fn callbacks to
+ * initialize all the requested shmem areas.
+ *
+ * In standard Unix-ish environments, individual backends do not need to
+ * re-establish their local pointers into shared memory, because they inherit
+ * correct values of those variables via fork() from the postmaster. However,
+ * this does not work in the EXEC_BACKEND case. In ports using EXEC_BACKEND,
+ * backend startup also calls the shmem_request callbacks to re-establish the
+ * knowledge about each shared memory area, sets the pointer variables
+ * (*ShmemStructDesc->ptr), and calls the attach_fn callback, if any, for
+ * additional per-backend setup.
+ *
+ * Legacy ShmemInitStruct()/ShmemInitHash() functions
+ * --------------------------------------------------
+ *
+ * ShmemInitStruct()/ShmemInitHash() is another way of registering shmem
+ * areas. It pre-dates the ShmemRequestStruct()/ShmemRequestHash() functions,
+ * and should not be used in new code, but as of this writing it is still
+ * widely used in extensions.
+ *
+ * To allocate a shmem area with ShmemInitStruct(), you need to separately
+ * register the size needed for the area by calling RequestAddinShmemSpace()
+ * from the extension's shmem_request_hook, and allocate the area by calling
+ * ShmemInitStruct() from the extension's shmem_startup_hook. There are no
+ * init/attach callbacks. Instead, the caller of ShmemInitStruct() must check
+ * the return status of ShmemInitStruct() and initialize the struct if it was
+ * not previously initialized.
+ *
+ * Calling ShmemAlloc() directly
+ * -----------------------------
+ *
+ * There's a lower-level way of allocating shared memory too: you can call
+ * ShmemAlloc() directly. It's used to implement the higher level mechanisms,
+ * and should generally not be called directly.
*/
#include "postgres.h"
@@ -70,10 +142,80 @@
#include "storage/lwlock.h"
#include "storage/pg_shmem.h"
#include "storage/shmem.h"
+#include "storage/shmem_internal.h"
#include "storage/spin.h"
#include "utils/builtins.h"
#include "utils/tuplestore.h"
+/*
+ * Registered callbacks.
+ *
+ * During postmaster startup, we accumulate the callbacks from all subsystems
+ * in this list.
+ *
+ * This is in process private memory, although on Unix-like systems, we expect
+ * all the registrations to happen at postmaster startup time and be inherited
+ * by all the child processes via fork().
+ */
+static List *registered_shmem_callbacks;
+
+/*
+ * In the shmem request phase, all the shmem areas requested with the
+ * ShmemRequest*() functions are accumulated here.
+ */
+typedef struct
+{
+ ShmemStructOpts *options;
+ ShmemRequestKind kind;
+} ShmemRequest;
+
+static List *pending_shmem_requests;
+
+/*
+ * Per-process state machine, for sanity checking that we do things in the
+ * right order.
+ *
+ * Postmaster:
+ * INITIAL -> REQUESTING -> INITIALIZING -> DONE
+ *
+ * Backends in EXEC_BACKEND mode:
+ * INITIAL -> REQUESTING -> ATTACHING -> DONE
+ *
+ * Late request:
+ * DONE -> REQUESTING -> AFTER_STARTUP_ATTACH_OR_INIT -> DONE
+ */
+enum shmem_request_state
+{
+ /* Initial state */
+ SRS_INITIAL,
+
+ /*
+ * When we start calling the shmem_request callbacks, we enter the
+ * SRS_REQUESTING phase. All ShmemRequestStruct calls happen in this
+ * state.
+ */
+ SRS_REQUESTING,
+
+ /*
+ * Postmaster has finished all shmem requests, and is now initializing the
+ * shared memory segment. init_fn callbacks are called in this state.
+ */
+ SRS_INITIALIZING,
+
+ /*
+ * A postmaster child process is starting up. attach_fn callbacks are
+ * called in this state.
+ */
+ SRS_ATTACHING,
+
+ /* An after-startup allocation or attachment is in progress. */
+ SRS_AFTER_STARTUP_ATTACH_OR_INIT,
+
+ /* Normal state after shmem initialization / attachment */
+ SRS_DONE,
+};
+static enum shmem_request_state shmem_request_state = SRS_INITIAL;
+
/*
* This is the first data structure stored in the shared memory segment, at
* the offset that PGShmemHeader->content_offset points to. Allocations by
@@ -105,35 +247,379 @@ static void *ShmemBase; /* start address of shared memory */
static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
-static HTAB *ShmemIndex = NULL; /* primary index hashtable for shmem */
+
+/*
+ * ShmemIndex is a global directory of shmem areas, itself also stored in the
+ * shared memory.
+ */
+static HTAB *ShmemIndex;
+
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE (48)
+
+/*
+ * # of additional entries to reserve in the shmem index table, for
+ * allocations after postmaster startup. (This is not a hard limit, the hash
+ * table can grow larger than that if there is shared memory available)
+ */
+#define SHMEM_INDEX_ADDITIONAL_SIZE (128)
+
+/* this is a hash bucket in the shmem index table */
+typedef struct
+{
+ char key[SHMEM_INDEX_KEYSIZE]; /* string name */
+ void *location; /* location in shared mem */
+ Size size; /* # bytes requested for the structure */
+ Size allocated_size; /* # bytes actually allocated */
+} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
+static void CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks);
+static void InitShmemIndexEntry(ShmemRequest *request);
+static bool AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok);
+
Datum pg_numa_available(PG_FUNCTION_ARGS);
/*
- * A very simple allocator used to carve out different parts of a hash table
- * from a previously allocated contiguous shared memory area.
+ * ShmemRequestStruct() --- request a named shared memory area
+ *
+ * Subsystems call this to register their shared memory needs. This is
+ * usually done early in postmaster startup, before the shared memory segment
+ * has been created, so that the size can be included in the estimate for
+ * total amount of shared memory needed. We set aside a small amount of
+ * memory for allocations that happen later, for the benefit of non-preloaded
+ * extensions, but that should not be relied upon.
+ *
+ * This does not yet allocate the memory, but merely registers the need for it.
+ * The actual allocation happens later in the postmaster startup sequence.
+ *
+ * This must be called from a shmem_request callback function, registered with
+ * RegisterShmemCallbacks(). This enforces a coding pattern that works the
+ * same in normal Unix systems and with EXEC_BACKEND. On Unix systems, the
+ * shmem_request callbacks are called once, early in postmaster startup, and
+ * the child processes inherit the struct descriptors and any other
+ * per-process state from the postmaster. In EXEC_BACKEND mode, shmem_request
+ * callbacks are *also* called in each backend, at backend startup, to
+ * re-establish the struct descriptors. By calling the same function in both
+ * cases, we ensure that all the shmem areas are registered the same way in
+ * all processes.
+ *
+ * 'desc' is a backend-private handle for the shared memory area.
+ *
+ * 'options' defines the name and size of the area, and any other optional
+ * features. Leave unused options as zeros. The options are copied to
+ * longer-lived memory, so the struct doesn't need to outlive the
+ * ShmemRequestStruct() call and can point to a local variable in the calling
+ * function. The 'name' must point to a long-lived string, though: only the
+ * pointer to it is copied.
+ */
+void
+ShmemRequestStructWithOpts(const ShmemStructOpts *options)
+{
+ ShmemStructOpts *options_copy;
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemStructOpts));
+ memcpy(options_copy, options, sizeof(ShmemStructOpts));
+
+ ShmemRequestInternal(options_copy, SHMEM_KIND_STRUCT);
+}
+
+/*
+ * Internal workhorse of ShmemRequestStruct() and ShmemRequestHash().
+ *
+ * Note: 'desc' and 'options' must live until the init/attach callbacks have
+ * been called. Unlike in the public ShmemRequestStruct() and
+ * ShmemRequestHash() functions, 'options' is *not* copied. This allows
+ * ShmemRequestHash() to pass a pointer to the extended ShmemRequestHashOpts
+ * struct instead.
+ */
+void
+ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
+{
+ ShmemRequest *request;
+
+ if (options->name == NULL)
+ elog(ERROR, "shared memory request is missing 'name' option");
+
+ if (IsUnderPostmaster)
+ {
+ if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+ else
+ {
+ if (options->size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->size <= 0)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+
+ if (shmem_request_state != SRS_REQUESTING)
+ elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
+
+ /* Check that it's not already registered in this process */
+ foreach_ptr(ShmemRequest, existing, pending_shmem_requests)
+ {
+ if (strcmp(existing->options->name, options->name) == 0)
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" is already registered",
+ options->name)));
+ }
+
+ request = palloc(sizeof(ShmemRequest));
+ request->options = options;
+ request->kind = kind;
+ pending_shmem_requests = lappend(pending_shmem_requests, request);
+}
+
+/*
+ * ShmemGetRequestedSize() --- estimate the total size of all registered shared
+ * memory structures.
+ *
+ * This is called once at postmaster startup, before the shared memory segment
+ * has been created.
+ */
+size_t
+ShmemGetRequestedSize(void)
+{
+ size_t size;
+
+ /* memory needed for the ShmemIndex */
+ size = hash_estimate_size(list_length(pending_shmem_requests) + SHMEM_INDEX_ADDITIONAL_SIZE,
+ sizeof(ShmemIndexEnt));
+ size = CACHELINEALIGN(size);
+
+ /* memory needed for all the requested areas */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ size = add_size(size, request->options->size);
+ /* calculate alignment padding like ShmemAllocRaw() does */
+ size = CACHELINEALIGN(size);
+ }
+
+ return size;
+}
+
+/*
+ * ShmemInitRequested() --- allocate and initialize requested shared memory
+ * structures.
+ *
+ * This is called once at postmaster startup, after the shared memory segment
+ * has been created.
+ */
+void
+ShmemInitRequested(void)
+{
+ /* Should be called only by the postmaster or a standalone backend. */
+ Assert(!IsUnderPostmaster);
+ Assert(shmem_request_state == SRS_INITIALIZING);
+
+ /*
+ * Initialize the ShmemIndex entries and perform basic initialization of
+ * all the requested memory areas. There are no concurrent processes yet,
+ * so no need for locking.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ InitShmemIndexEntry(request);
+ }
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
+
+ /*
+ * Call the subsystem-specific init callbacks to finish initialization of
+ * all the areas.
+ */
+ foreach_ptr(const ShmemCallbacks, callbacks, registered_shmem_callbacks)
+ {
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->init_fn_arg);
+ }
+
+ shmem_request_state = SRS_DONE;
+}
+
+/*
+ * Re-establish process private state related to shmem areas.
+ *
+ * This is called at backend startup in EXEC_BACKEND mode, in every backend.
+ */
+#ifdef EXEC_BACKEND
+void
+ShmemAttachRequested(void)
+{
+ ListCell *lc;
+
+ /* Must be initializing a (non-standalone) backend */
+ Assert(IsUnderPostmaster);
+ Assert(ShmemAllocator->index != NULL);
+ Assert(shmem_request_state == SRS_REQUESTING);
+ shmem_request_state = SRS_ATTACHING;
+
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+
+ /*
+ * Attach to all the requested memory areas.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ AttachShmemIndexEntry(request, false);
+ }
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
+
+ /* Call attach callbacks */
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
+
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->attach_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+
+ shmem_request_state = SRS_DONE;
+}
+#endif
+
+/*
+ * Insert requested shmem area into the shared memory index and initialize it.
+ *
+ * Note that this only performs basic initialization depending on
+ * ShmemRequestKind, like setting the global pointer variable to the area for
+ * SHMEM_KIND_STRUCT or setting up the backend-private HTAB control struct.
+ * This does *not* call the subsystem-specific init callbacks. That's done
+ * later after all the shmem areas have been initialized or attached to.
*/
-typedef struct shmem_hash_allocator
+static void
+InitShmemIndexEntry(ShmemRequest *request)
{
- char *next; /* start of free space in the area */
- char *end; /* end of the shmem area */
-} shmem_hash_allocator;
+ const char *name = request->options->name;
+ ShmemIndexEnt *index_entry;
+ bool found;
+ size_t allocated_size;
+ void *structPtr;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, name, HASH_ENTER_NULL, &found);
+ if (found)
+ elog(ERROR, "shared memory struct \"%s\" is already initialized", name);
+ if (!index_entry)
+ {
+ /* tried to add it to the hash table, but there was no space */
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ name)));
+ }
+
+ /*
+ * We inserted the entry to the shared memory index. Allocate requested
+ * amount of shared memory for it, and initialize the index entry.
+ */
+ structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ name, request->options->size)));
+ }
+ index_entry->size = request->options->size;
+ index_entry->allocated_size = allocated_size;
+ index_entry->location = structPtr;
+
+ /* Initialize depending on the kind of shmem area it is */
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_init(structPtr, request->options);
+ break;
+ }
+}
+
+/*
+ * Look up a named shmem area in the shared memory index and attach to it.
+ *
+ * Note that this only performs the basic attachment actions depending on
+ * ShmemRequestKind, like setting the global pointer variable to the area for
+ * SHMEM_KIND_STRUCT or setting up the backend-private HTAB control struct.
+ * This does *not* call the subsystem-specific attach callbacks. That's done
+ * later after all the shmem areas have been initialized or attached to.
+ */
+static bool
+AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
+{
+ const char *name = request->options->name;
+ ShmemIndexEnt *index_entry;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, name, HASH_FIND, NULL);
+ if (!index_entry)
+ {
+ if (!missing_ok)
+ ereport(ERROR,
+ (errmsg("could not find ShmemIndex entry for data structure \"%s\"",
+ request->options->name)));
+ return false;
+ }
+
+ /* Check that the size in the index matches the request. */
+ if (index_entry->size != request->options->size &&
+ request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" was created with" \
+ " different size: existing %zu, requested %zu",
+ name, index_entry->size, request->options->size)));
+ }
+
+ /*
+ * Re-establish the caller's pointer variable, or do other actions to
+ * attach depending on the kind of shmem area it is.
+ */
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_attach(index_entry->location, request->options);
+ break;
+ }
+
+ return true;
+}
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
* Called at postmaster or stand-alone backend startup, to initialize the
* allocator's data structure in the shared memory segment. In EXEC_BACKEND,
- * this is also called at backend startup, to set up pointers to the shared
- * memory areas.
+ * this is also called at backend startup, to set up pointers to the
+ * already-initialized data structure.
*/
void
InitShmemAllocator(PGShmemHeader *seghdr)
{
Size offset;
+ int64 hash_nelems;
HASHCTL info;
int hash_flags;
@@ -142,6 +628,16 @@ InitShmemAllocator(PGShmemHeader *seghdr)
#endif
Assert(seghdr != NULL);
+ if (IsUnderPostmaster)
+ {
+ Assert(shmem_request_state == SRS_INITIAL);
+ }
+ else
+ {
+ Assert(shmem_request_state == SRS_REQUESTING);
+ shmem_request_state = SRS_INITIALIZING;
+ }
+
/*
* We assume the pointer and offset are MAXALIGN. Not a hard requirement,
* but it's true today and keeps the math below simpler.
@@ -186,19 +682,21 @@ InitShmemAllocator(PGShmemHeader *seghdr)
* use ShmemInitHash() here because it relies on ShmemIndex being already
* initialized.
*/
+ hash_nelems = list_length(pending_shmem_requests) + SHMEM_INDEX_ADDITIONAL_SIZE;
+
info.keysize = SHMEM_INDEX_KEYSIZE;
info.entrysize = sizeof(ShmemIndexEnt);
hash_flags = HASH_ELEM | HASH_STRINGS | HASH_FIXED_SIZE;
if (!IsUnderPostmaster)
{
- ShmemAllocator->index_size = hash_estimate_size(SHMEM_INDEX_SIZE, info.entrysize);
+ ShmemAllocator->index_size = hash_estimate_size(hash_nelems, info.entrysize);
ShmemAllocator->index = (HASHHDR *) ShmemAlloc(ShmemAllocator->index_size);
}
ShmemIndex = shmem_hash_create(ShmemAllocator->index,
ShmemAllocator->index_size,
IsUnderPostmaster,
- "ShmemIndex", SHMEM_INDEX_SIZE,
+ "ShmemIndex", hash_nelems,
&info, hash_flags);
Assert(ShmemIndex != NULL);
@@ -219,6 +717,23 @@ InitShmemAllocator(PGShmemHeader *seghdr)
}
}
+/*
+ * Reset state on postmaster crash restart.
+ */
+void
+ResetShmemAllocator(void)
+{
+ Assert(!IsUnderPostmaster);
+ shmem_request_state = SRS_INITIAL;
+
+ pending_shmem_requests = NIL;
+
+ /*
+ * Note that we don't clear the registered callbacks. We will need to
+ * call them again as we restart
+ */
+}
+
/*
* ShmemAlloc -- allocate max-aligned chunk from shared memory
*
@@ -316,92 +831,191 @@ ShmemAddrIsValid(const void *addr)
}
/*
- * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ * Register callbacks that define a shared memory area (or multiple areas).
*
- * This is called during initialization to find or allocate
- * a data structure in shared memory. If no other process
- * has created the structure, this routine allocates space
- * for it. If it exists already, a pointer to the existing
- * structure is returned.
+ * The system will call the callbacks at different stages of postmaster or
+ * backend startup, to allocate and initialize the area.
*
- * Returns: pointer to the object. *foundPtr is set true if the object was
- * already in the shmem index (hence, already initialized).
+ * This is normally called early during postmaster startup, but if the
+ * SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP flag is set, this can also be used after
+ * startup, although after startup there's no guarantee that there's enough
+ * shared memory available. When called after startup, this immediately calls
+ * the right callbacks depending on whether another backend had already
+ * initialized the area.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: In EXEC_BACKEND mode, this needs to be called in every backend
+ * process. That's needed because we cannot pass down the callback function
+ * pointers from the postmaster process, because different processes may have
+ * loaded libraries to different addresses.
*/
-void *
-ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+void
+RegisterShmemCallbacks(const ShmemCallbacks *callbacks)
{
- ShmemIndexEnt *result;
- void *structPtr;
+ if (shmem_request_state == SRS_DONE && IsUnderPostmaster)
+ {
+ /*
+ * After-startup initialization or attachment. Call the appropriate
+ * callbacks immediately.
+ */
+ if ((callbacks->flags & SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP) == 0)
+ elog(ERROR, "cannot request shared memory at this time");
- Assert(ShmemIndex != NULL);
+ CallShmemCallbacksAfterStartup(callbacks);
+ }
+ else
+ {
+ /* Remember the callbacks for later */
+ registered_shmem_callbacks = lappend(registered_shmem_callbacks,
+ (void *) callbacks);
+ }
+}
+
+/*
+ * Register a shmem area (or multiple areas) after startup.
+ */
+static void
+CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks)
+{
+ bool found_any;
+ bool notfound_any;
+
+ Assert(shmem_request_state == SRS_DONE);
+ shmem_request_state = SRS_REQUESTING;
+
+ /*
+ * Call the request callback first. The callback makes ShmemRequest*()
+ * calls for each shmem area, adding them to pending_shmem_requests.
+ */
+ Assert(pending_shmem_requests == NIL);
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->request_fn_arg);
+ shmem_request_state = SRS_AFTER_STARTUP_ATTACH_OR_INIT;
+
+ if (pending_shmem_requests == NIL)
+ {
+ shmem_request_state = SRS_DONE;
+ return;
+ }
+ /* Hold ShmemIndexLock while we allocate all the shmem entries */
LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
- /* look it up in the shmem index */
- result = (ShmemIndexEnt *)
- hash_search(ShmemIndex, name, HASH_ENTER_NULL, foundPtr);
+ /*
+ * Check if the requested shared memory areas have already been
+ * initialized. We assume all the areas requested by the request callback
+ * to form a coherent unit such that they're all already initialized or
+ * none. Otherwise it would be ambiguous which callback, init or attach,
+ * to call afterwards.
+ */
+ found_any = notfound_any = false;
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ if (hash_search(ShmemIndex, request->options->name, HASH_FIND, NULL))
+ found_any = true;
+ else
+ notfound_any = true;
+ }
+ if (found_any && notfound_any)
+ elog(ERROR, "found some but not all");
- if (!result)
+ /*
+ * Allocate or attach all the shmem areas requested by the request_fn
+ * callback.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
{
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("could not create ShmemIndex entry for data structure \"%s\"",
- name)));
+ if (found_any)
+ AttachShmemIndexEntry(request, false);
+ else
+ InitShmemIndexEntry(request);
}
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
- if (*foundPtr)
+ /* Finish by calling the appropriate subsystem-specific callback */
+ if (found_any)
{
- /*
- * Structure is in the shmem index so someone else has allocated it
- * already. The size better be the same as the size we are trying to
- * initialize to, or there is a name conflict (or worse).
- */
- if (result->size != size)
- {
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errmsg("ShmemIndex entry size is wrong for data structure"
- " \"%s\": expected %zu, actual %zu",
- name, size, result->size)));
- }
- structPtr = result->location;
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->attach_fn_arg);
}
else
{
- Size allocated_size;
-
- /* It isn't in the table yet. allocate and initialize it */
- structPtr = ShmemAllocRaw(size, &allocated_size);
- if (structPtr == NULL)
- {
- /* out of memory; remove the failed ShmemIndex entry */
- hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
- " \"%s\" (%zu bytes requested)",
- name, size)));
- }
- result->size = size;
- result->allocated_size = allocated_size;
- result->location = structPtr;
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->init_fn_arg);
}
LWLockRelease(ShmemIndexLock);
+ shmem_request_state = SRS_DONE;
+}
- Assert(ShmemAddrIsValid(structPtr));
+/*
+ * Call all shmem request callbacks.
+ */
+void
+ShmemCallRequestCallbacks(void)
+{
+ ListCell *lc;
- Assert(structPtr == (void *) CACHELINEALIGN(structPtr));
+ Assert(shmem_request_state == SRS_INITIAL);
+ shmem_request_state = SRS_REQUESTING;
+
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
- return structPtr;
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->request_fn_arg);
+ }
}
+/*
+ * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ *
+ * This is called during initialization to find or allocate
+ * a data structure in shared memory. If no other process
+ * has created the structure, this routine allocates space
+ * for it. If it exists already, a pointer to the existing
+ * structure is returned.
+ *
+ * Returns: pointer to the object. *foundPtr is set true if the object was
+ * already in the shmem index (hence, already initialized).
+ *
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestStruct() in new code!
+ */
+void *
+ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+{
+ void *ptr = NULL;
+ ShmemStructOpts options = {
+ .name = name,
+ .size = size,
+ .ptr = &ptr,
+ };
+ ShmemRequest request = {&options, SHMEM_KIND_STRUCT};
+
+ Assert(shmem_request_state == SRS_DONE ||
+ shmem_request_state == SRS_INITIALIZING ||
+ shmem_request_state == SRS_REQUESTING);
+
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+
+ /*
+ * In a backend running under postmaster, look up the existing entry if any.
+ */
+ *foundPtr = false;
+ if (IsUnderPostmaster)
+ *foundPtr = AttachShmemIndexEntry(&request, true);
+
+ /* Initialize it if not found */
+ if (!*foundPtr)
+ InitShmemIndexEntry(&request);
+
+ LWLockRelease(ShmemIndexLock);
+
+ Assert(ptr != NULL);
+ return ptr;
+}
/*
* Add two Size values, checking for overflow
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
index 0b05730129e..ab30461f247 100644
--- a/src/backend/storage/ipc/shmem_hash.c
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -21,9 +21,81 @@
#include "postgres.h"
#include "storage/shmem.h"
+#include "storage/shmem_internal.h"
+#include "utils/memutils.h"
+
+/*
+ * A very simple allocator used to carve out different parts of a hash table
+ * from a previously allocated contiguous shared memory area.
+ */
+typedef struct shmem_hash_allocator
+{
+ char *next; /* start of free space in the area */
+ char *end; /* end of the shmem area */
+} shmem_hash_allocator;
static void *ShmemHashAlloc(Size size, void *alloc_arg);
+/*
+ * ShmemRequestHash -- Request a shared memory hash table.
+ *
+ * Similar to ShmemRequestStruct(), but requests a hash table instead of an
+ * opaque area.
+ */
+void
+ShmemRequestHashWithOpts(const ShmemHashOpts *options)
+{
+ ShmemHashOpts *options_copy;
+
+ Assert(options->name != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemHashOpts));
+ memcpy(options_copy, options, sizeof(ShmemHashOpts));
+
+ /* Set options for the fixed-size area holding the hash table */
+ options_copy->base.name = options->name;
+ options_copy->base.size = hash_estimate_size(options_copy->nelems,
+ options_copy->hash_info.entrysize);
+
+ ShmemRequestInternal(&options_copy->base, SHMEM_KIND_HASH);
+}
+
+void
+shmem_hash_init(void *location, ShmemStructOpts *base_options)
+{
+ ShmemHashOpts *options = (ShmemHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+ HTAB *htab;
+
+ options->hash_info.hctl = location;
+ htab = shmem_hash_create(location, options->base.size, false,
+ options->name,
+ options->nelems, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = htab;
+}
+
+void
+shmem_hash_attach(void *location, ShmemStructOpts *base_options)
+{
+ ShmemHashOpts *options = (ShmemHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+ HTAB *htab;
+
+ /* attach to it rather than allocate and initialize new space */
+ hash_flags |= HASH_ATTACH;
+ options->hash_info.hctl = location;
+ Assert(options->hash_info.hctl != NULL);
+ htab = shmem_hash_create(location, options->base.size, true,
+ options->name,
+ options->nelems, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = htab;
+}
+
/*
* ShmemInitHash -- Create and initialize, or attach to, a
* shared memory hash table.
@@ -40,9 +112,8 @@ static void *ShmemHashAlloc(Size size, void *alloc_arg);
* to shared-memory hash tables are added here, except that callers may
* choose to specify HASH_PARTITION.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestHash() in new code!
*/
HTAB *
ShmemInitHash(const char *name, /* table string name for shmem index */
@@ -56,7 +127,14 @@ ShmemInitHash(const char *name, /* table string name for shmem index */
size = hash_estimate_size(nelems, infoP->entrysize);
- /* look it up in the shmem index or allocate */
+ /*
+ * Look it up in the shmem index or allocate.
+ *
+ * NOTE: The area is requested internally as SHMEM_KIND_STRUCT instead of
+ * SHMEM_KIND_HASH. That's correct because we do the hash table
+ * initialization by calling shmem_hash_create() ourselves. (We don't
+ * expose the request kind to users; if we did, that would be confusing.)
+ */
location = ShmemInitStruct(name, size, &found);
return shmem_hash_create(location, size, found,
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 5c47cf13473..9b880a6af65 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -121,6 +121,9 @@ FastPathLockShmemSize(void)
size = add_size(size, mul_size(TotalProcs, (fpLockBitsSize + fpRelIdSize)));
+ Assert(TotalProcs > 0);
+ Assert(size > 0);
+
return size;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 10be60011ad..93851269e43 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -67,6 +67,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procsignal.h"
+#include "storage/shmem_internal.h"
#include "storage/sinval.h"
#include "storage/standby.h"
#include "tcop/backend_startup.h"
@@ -4155,7 +4156,14 @@ PostgresSingleUserMain(int argc, char *argv[],
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Before computing the total size needed, give all subsystems, including
+ * add-ins, a chance to adjust their requested shmem sizes.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might have been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 82f5403c952..147a6915f7e 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -3,6 +3,11 @@
* shmem.h
* shared memory management structures
*
+ * This file contains public functions for other core subsystems and
+ * extensions to allocate shared memory. Internal functions for the shmem
+ * allocator itself and hooking it to the rest of the system are in
+ * shmem_internal.h
+ *
* Historical note:
* A long time ago, Postgres' shared memory region was allowed to be mapped
* at a different address in each process, and shared memory "pointers" were
@@ -23,43 +28,165 @@
#include "utils/hsearch.h"
+/*
+ * Options for ShmemRequestStruct()
+ *
+ * 'name' and 'size' are required. Initialize any optional fields that you
+ * don't use to zeros.
+ *
+ * After registration, the shmem machinery reserves memory for the area, sets
+ * '*ptr' to point to the allocation, and calls the callbacks at the right
+ * moments.
+ */
+typedef struct ShmemStructOpts
+{
+ const char *name;
-/* shmem.c */
-typedef struct PGShmemHeader PGShmemHeader; /* avoid including
- * storage/pg_shmem.h here */
-extern void InitShmemAllocator(PGShmemHeader *seghdr);
-extern void *ShmemAlloc(Size size);
-extern void *ShmemAllocNoError(Size size);
-extern void *ShmemHashAlloc(Size size, void *alloc_arg);
+ /*
+ * Requested size of the shmem allocation.
+ *
+ * When attaching to an existing allocation, the size must match the size
+ * given when the shmem region was allocated. This cross-check can be
+ * disabled by specifying SHMEM_ATTACH_UNKNOWN_SIZE.
+ */
+ ssize_t size;
+
+ /*
+ * When the shmem area is initialized or attached to, a pointer to it is
+ * stored in *ptr. It usually points to a global variable, used to access
+ * the shared memory area later. *ptr is set before the init_fn or
+ * attach_fn callback is called.
+ */
+ void **ptr;
+} ShmemStructOpts;
+
+#define SHMEM_ATTACH_UNKNOWN_SIZE (-1)
+
+/*
+ * Options for ShmemRequestHash()
+ *
+ * Each hash table is backed by an allocated area, but if 'max_size' is
+ * greater than 'init_size', it can also grow beyond the initial allocated
+ * area by allocating more hash entries from the global unreserved space.
+ */
+typedef struct ShmemHashOpts
+{
+ ShmemStructOpts base;
+
+ /*
+ * Name of the shared memory area. Required. Must be unique across the
+ * system.
+ */
+ const char *name;
+
+ /*
+ * 'nelems' is the max number of elements for the hash table.
+ */
+ int64 nelems;
+
+ /*
+ * Hash table options passed to hash_create()
+ *
+ * hash_info and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values
+ * specific to shared-memory hash tables are added implicitly in
+ * ShmemRequestHash(), except that callers may choose to specify
+ * HASH_PARTITION and/or HASH_FIXED_SIZE.
+ */
+ HASHCTL hash_info;
+ int hash_flags;
+
+ /*
+ * When the hash table is initialized or attached to, a pointer to its
+ * backend-private handle is stored in *ptr. It usually points to a
+ * global variable, used to access the hash table later.
+ */
+ HTAB **ptr;
+} ShmemHashOpts;
+
+typedef void (*ShmemRequestCallback) (void *arg);
+typedef void (*ShmemInitCallback) (void *arg);
+typedef void (*ShmemAttachCallback) (void *arg);
+
+/*
+ * Shared memory is reserved and allocated in stages at postmaster startup,
+ * and in EXEC_BACKEND mode, there's some extra work done to "attach" to them
+ * at backend startup. ShmemCallbacks holds callback functions that are
+ * called at different stages.
+ */
+typedef struct ShmemCallbacks
+{
+ /* SHMEM_CALLBACKS_* flags */
+ int flags;
+
+ /*
+ * 'request_fn' is called during postmaster startup, before the shared
+ * memory has been allocated. The function should call
+ * ShmemRequestStruct() and ShmemRequestHash() to register the subsystem's
+ * shared memory needs.
+ */
+ ShmemRequestCallback request_fn;
+ void *request_fn_arg;
+
+ /*
+ * Initialization callback function. This is called when the shared
+ * memory area is allocated, usually at postmaster startup.
+ */
+ ShmemInitCallback init_fn;
+ void *init_fn_arg;
+
+ /*
+ * Attachment callback function. In EXEC_BACKEND mode, this is called at
+ * startup of each backend. In !EXEC_BACKEND mode, this is only called if
+ * the shared memory area is registered after postmaster startup (see
+ * SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP).
+ */
+ ShmemAttachCallback attach_fn;
+ void *attach_fn_arg;
+} ShmemCallbacks;
+
+/*
+ * Flags to control the behavior of RegisterShmemCallbacks().
+ *
+ * ALLOW_AFTER_STARTUP: Allow these shared memory usages to be registered
+ * after postmaster startup. Normally, registering a shared memory system
+ * after postmaster startup is not allowed, e.g. in an add-in library loaded
+ * on-demand in a backend. If a subsystem sets this flag, the callbacks are
+ * called immediately after registration, to initialize or attach to the
+ * requested shared memory areas. This is not used by any built-in
+ * subsystems, but extensions may find it useful.
+ */
+#define SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP 0x00000001
+
+extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
extern bool ShmemAddrIsValid(const void *addr);
+
+/*
+ * These macros provide syntactic sugar for calling the underlying functions
+ * with named-argument-like syntax.
+ */
+#define ShmemRequestStruct(...) \
+ ShmemRequestStructWithOpts(&(ShmemStructOpts){__VA_ARGS__})
+
+#define ShmemRequestHash(...) \
+ ShmemRequestHashWithOpts(&(ShmemHashOpts){__VA_ARGS__})
+
+extern void ShmemRequestStructWithOpts(const ShmemStructOpts *options);
+extern void ShmemRequestHashWithOpts(const ShmemHashOpts *options);
+
+/* legacy shmem allocation functions */
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
+extern HTAB *ShmemInitHash(const char *name, int64 nelems,
+ HASHCTL *infoP, int hash_flags);
+extern void *ShmemAlloc(Size size);
+extern void *ShmemAllocNoError(Size size);
+
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
-/* shmem_hash.c */
-extern HTAB *ShmemInitHash(const char *name, int64 nelems,
- HASHCTL *infoP, int hash_flags);
-extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
-
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
-/* size constants for the shmem index table */
- /* max size of data structure string name */
-#define SHMEM_INDEX_KEYSIZE (48)
- /* max number of named shmem structures and hash tables */
-#define SHMEM_INDEX_SIZE (256)
-
-/* this is a hash bucket in the shmem index table */
-typedef struct
-{
- char key[SHMEM_INDEX_KEYSIZE]; /* string name */
- void *location; /* location in shared mem */
- Size size; /* # bytes requested for the structure */
- Size allocated_size; /* # bytes actually allocated */
-} ShmemIndexEnt;
-
#endif /* SHMEM_H */
diff --git a/src/include/storage/shmem_internal.h b/src/include/storage/shmem_internal.h
new file mode 100644
index 00000000000..fe12bf33439
--- /dev/null
+++ b/src/include/storage/shmem_internal.h
@@ -0,0 +1,52 @@
+/*-------------------------------------------------------------------------
+ *
+ * shmem_internal.h
+ * Internal functions related to shmem allocation
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/shmem_internal.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SHMEM_INTERNAL_H
+#define SHMEM_INTERNAL_H
+
+#include "storage/shmem.h"
+#include "utils/hsearch.h"
+
+/* Different kinds of shmem areas. */
+typedef enum
+{
+ SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
+ SHMEM_KIND_HASH, /* a hash table */
+} ShmemRequestKind;
+
+/* shmem.c */
+typedef struct PGShmemHeader PGShmemHeader; /* avoid including
+ * storage/pg_shmem.h here */
+extern void ShmemCallRequestCallbacks(void);
+extern void InitShmemAllocator(PGShmemHeader *seghdr);
+#ifdef EXEC_BACKEND
+extern void AttachShmemAllocator(PGShmemHeader *seghdr);
+#endif
+extern void ResetShmemAllocator(void);
+
+extern void ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind);
+
+extern size_t ShmemGetRequestedSize(void);
+extern void ShmemInitRequested(void);
+#ifdef EXEC_BACKEND
+extern void ShmemAttachRequested(void);
+#endif
+
+extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
+
+/* shmem_hash.c */
+extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
+extern void shmem_hash_init(void *location, ShmemStructOpts *options);
+extern void shmem_hash_attach(void *location, ShmemStructOpts *options);
+
+#endif /* SHMEM_INTERNAL_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index c72f6c59573..b84167741fb 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2863,9 +2863,16 @@ SharedTypmodTableEntry
Sharedsort
ShellTypeInfo
ShippableCacheEntry
-ShmemAllocatorData
ShippableCacheKey
+ShmemAllocatorData
+ShmemCallbacks
ShmemIndexEnt
+ShmemHashDesc
+ShmemHashOpts
+ShmemRequest
+ShmemRequestKind
+ShmemStructDesc
+ShmemStructOpts
ShutdownForeignScan_function
ShutdownInformation
ShutdownMode
--
2.34.1
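
The ShmemRequestStruct() and ShmemRequestHash() macros in the patch above wrap a variadic macro around a C99 compound literal to get a named-argument call syntax. Below is a minimal standalone sketch of that technique. DemoOpts, demo_request_with_opts() and demo_request() are hypothetical stand-ins invented for illustration, not PostgreSQL APIs. Fields that the caller leaves out are zero-initialized, which matches the header comment's advice to leave unused optional fields at zero.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct, standing in for ShmemStructOpts. */
typedef struct DemoOpts
{
	const char *name;
	long		size;
	void	  **ptr;
} DemoOpts;

/* Remembers the most recent request, so the sketch is observable. */
static DemoOpts last_request;

/* Hypothetical stand-in for a *WithOpts() function. */
static void
demo_request_with_opts(const DemoOpts *options)
{
	/*
	 * Copy the options: a compound literal created inside a function has
	 * automatic storage and goes away when the enclosing block ends.
	 */
	last_request = *options;
}

/* Named-argument sugar: the arguments become designated initializers. */
#define demo_request(...) \
	demo_request_with_opts(&(DemoOpts){__VA_ARGS__})

static void *demo_area;

static void
demo_usage(void)
{
	demo_request(.name = "demo area",
				 .size = 128,
				 .ptr = &demo_area);
}
```

The copy in demo_request_with_opts() reflects the same lifetime concern that presumably motivates the memcpy() into TopMemoryContext in ShmemRequestHashWithOpts(); that reading is an inference from the patch, not something the thread states.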
[text/x-patch] v20260405-0003-Add-test-module-to-test-after-startup-shme.patch (10.1K, 4-v20260405-0003-Add-test-module-to-test-after-startup-shme.patch)
download | inline diff:
From b185ee67f0e9108edb418edeb14d05da21f6a689 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:10:31 +0300
Subject: [PATCH v20260405 03/15] Add test module to test after-startup shmem
allocations
None of the existing modules could make use of the lazy shmem
allocation after postmaster startup:
- pg_stat_statements needs to load and dump stats file on startup and
shutdown, which doesn't really work if the library is not loaded into
postmaster
- test_aio registers injection points, which reference the library
itself, which creates a weird initialization loop if you try to do
that directly from _PG_init() in a backend. The initialization
really needs to happen after _PG_init()
- injection_points would be a candidate, but it already knows to use
DSM when it's not loaded from shared_preload_libraries.
---
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_shmem/Makefile | 24 +++++
src/test/modules/test_shmem/meson.build | 33 ++++++
.../test_shmem/t/001_late_shmem_alloc.pl | 49 +++++++++
.../modules/test_shmem/test_shmem--1.0.sql | 9 ++
src/test/modules/test_shmem/test_shmem.c | 101 ++++++++++++++++++
.../modules/test_shmem/test_shmem.control | 3 +
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 222 insertions(+)
create mode 100644 src/test/modules/test_shmem/Makefile
create mode 100644 src/test/modules/test_shmem/meson.build
create mode 100644 src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
create mode 100644 src/test/modules/test_shmem/test_shmem--1.0.sql
create mode 100644 src/test/modules/test_shmem/test_shmem.c
create mode 100644 src/test/modules/test_shmem/test_shmem.control
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 864b407abcf..f1b04c99969 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -48,6 +48,7 @@ SUBDIRS = \
test_resowner \
test_rls_hooks \
test_saslprep \
+ test_shmem \
test_shm_mq \
test_slru \
test_tidstore \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index e5acacd5083..fc99552d9ab 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -49,6 +49,7 @@ subdir('test_regex')
subdir('test_resowner')
subdir('test_rls_hooks')
subdir('test_saslprep')
+subdir('test_shmem')
subdir('test_shm_mq')
subdir('test_slru')
subdir('test_tidstore')
diff --git a/src/test/modules/test_shmem/Makefile b/src/test/modules/test_shmem/Makefile
new file mode 100644
index 00000000000..2407f7462fe
--- /dev/null
+++ b/src/test/modules/test_shmem/Makefile
@@ -0,0 +1,24 @@
+# src/test/modules/test_shmem/Makefile
+
+PGFILEDESC = "test_shmem - test code for shmem allocations"
+
+MODULE_big = test_shmem
+OBJS = \
+ $(WIN32RES) \
+ test_shmem.o
+
+EXTENSION = test_shmem
+DATA = test_shmem--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_shmem/meson.build b/src/test/modules/test_shmem/meson.build
new file mode 100644
index 00000000000..fb4bf328b8f
--- /dev/null
+++ b/src/test/modules/test_shmem/meson.build
@@ -0,0 +1,33 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+test_shmem_sources = files(
+ 'test_shmem.c',
+)
+
+if host_system == 'windows'
+ test_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_shmem',
+ '--FILEDESC', 'test_shmem - test code for shmem allocations',])
+endif
+
+test_shmem = shared_module('test_shmem',
+ test_shmem_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_shmem
+
+test_install_data += files(
+ 'test_shmem.control',
+ 'test_shmem--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_shmem',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_late_shmem_alloc.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
new file mode 100644
index 00000000000..c154f57682a
--- /dev/null
+++ b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
@@ -0,0 +1,49 @@
+# Copyright (c) 2025-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+###
+# Test allocating memory after startup, i.e. when the library is not
+# in shared_preload_libraries
+###
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+
+$node->safe_psql("postgres", "CREATE EXTENSION test_shmem;");
+
+# Check that the attach counter is incremented on a new connection
+my $attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+my $attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+cmp_ok($attach_count2, '>', $attach_count1, "attach callback is called in each backend");
+$node->stop;
+
+###
+# Test that loading via shared_preload_libraries also works
+###
+$node->append_conf('postgresql.conf', "shared_preload_libraries = 'test_shmem'");
+$node->start;
+
+# When loaded via shared_preload_libraries, the attach callback is
+# called or not, depending on whether this is an EXEC_BACKEND build.
+my $exec_backend = $node->safe_psql("postgres", "SHOW debug_exec_backend;") eq 'on';
+$attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+$attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+
+if ($exec_backend)
+{
+ cmp_ok($attach_count2, '>', $attach_count1, "attach callback is called in each backend when loaded via shared_preload_libraries");
+}
+else
+{
+ ok($attach_count1 == 0 && $attach_count2 == 0, "attach callback is not called when loaded via shared_preload_libraries");
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_shmem/test_shmem--1.0.sql b/src/test/modules/test_shmem/test_shmem--1.0.sql
new file mode 100644
index 00000000000..2d01fd9256c
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem--1.0.sql
@@ -0,0 +1,9 @@
+/* src/test/modules/test_shmem/test_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_shmem" to load this file. \quit
+
+
+CREATE FUNCTION get_test_shmem_attach_count()
+RETURNS pg_catalog.int4 STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
new file mode 100644
index 00000000000..9bd4012b435
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -0,0 +1,101 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_shmem.c
+ * Helpers to test shmem allocation routines
+ *
+ * Test basic memory allocation in an extension module. One notable feature
+ * that is not exercised by any other module in the repository is the
+ * allocating (non-DSM) shared memory after postmaster startup.
+ *
+ * Copyright (c) 2020-2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_shmem/test_shmem.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+
+
+PG_MODULE_MAGIC;
+
+typedef struct TestShmemData
+{
+ int value;
+ bool initialized;
+ int attach_count;
+} TestShmemData;
+
+static TestShmemData *TestShmem;
+
+static bool attached_or_initialized = false;
+
+static void test_shmem_request(void *arg);
+static void test_shmem_init(void *arg);
+static void test_shmem_attach(void *arg);
+
+static const ShmemCallbacks TestShmemCallbacks = {
+ .flags = SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP,
+ .request_fn = test_shmem_request,
+ .init_fn = test_shmem_init,
+ .attach_fn = test_shmem_attach,
+};
+
+static void
+test_shmem_request(void *arg)
+{
+ elog(LOG, "test_shmem_request callback called");
+
+ ShmemRequestStruct(.name = "test_shmem area",
+ .size = sizeof(TestShmemData),
+ .ptr = (void **) &TestShmem);
+}
+
+static void
+test_shmem_init(void *arg)
+{
+ elog(LOG, "init callback called");
+ if (TestShmem->initialized)
+ elog(ERROR, "shmem area already initialized");
+ TestShmem->initialized = true;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+static void
+test_shmem_attach(void *arg)
+{
+ elog(LOG, "test_shmem_attach callback called");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ TestShmem->attach_count++;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+void
+_PG_init(void)
+{
+ elog(LOG, "test_shmem module's _PG_init called");
+ RegisterShmemCallbacks(&TestShmemCallbacks);
+}
+
+PG_FUNCTION_INFO_V1(get_test_shmem_attach_count);
+Datum
+get_test_shmem_attach_count(PG_FUNCTION_ARGS)
+{
+ if (!attached_or_initialized)
+ elog(ERROR, "shmem area not attached or initialized in this process");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ PG_RETURN_INT32(TestShmem->attach_count);
+}
diff --git a/src/test/modules/test_shmem/test_shmem.control b/src/test/modules/test_shmem/test_shmem.control
new file mode 100644
index 00000000000..f2f26f4537a
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.control
@@ -0,0 +1,3 @@
+comment = 'Test code for shmem allocations'
+default_version = '1.0'
+module_pathname = '$libdir/test_shmem'
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b84167741fb..63c0b3a9465 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3146,6 +3146,7 @@ TestDSMRegistryHashEntry
TestDSMRegistryStruct
TestDecodingData
TestDecodingTxnData
+TestShmemData
TestSpec
TestValueType
TextFreq
--
2.34.1
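For readers following the test module above, the request/init/attach lifecycle it checks can be modeled in plain C. This is only a toy sketch under stated assumptions: DemoShared, DemoCallbacks and demo_process_start() are hypothetical names, a static variable stands in for the shared segment, and the order in which the real framework invokes the callbacks is assumed here rather than taken from the patch. It shows the invariant the module asserts: init runs once for the area, and each later process attaches exactly once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct kept in shared memory. */
typedef struct DemoShared
{
    bool initialized;
    int  attach_count;
} DemoShared;

/* Hypothetical callback table; NOT the real ShmemCallbacks definition. */
typedef struct DemoCallbacks
{
    void (*request_fn) (void *arg);
    void (*init_fn) (void *arg);
    void (*attach_fn) (void *arg);
} DemoCallbacks;

/* Stands in for the shared segment: survives across simulated processes. */
static DemoShared demo_segment;
static bool       demo_segment_ready = false;

/* Process-local state: reset whenever a simulated process starts. */
static DemoShared *DemoShmem = NULL;
static int         demo_request_calls = 0;

static void
demo_request(void *arg)
{
    (void) arg;
    /* A real request callback would state a name, a size and a pointer. */
    demo_request_calls++;
}

static void
demo_init(void *arg)
{
    (void) arg;
    assert(!DemoShmem->initialized);    /* init must run only once */
    DemoShmem->initialized = true;
    DemoShmem->attach_count = 0;
}

static void
demo_attach(void *arg)
{
    (void) arg;
    assert(DemoShmem->initialized);     /* attach only after init */
    DemoShmem->attach_count++;
}

static const DemoCallbacks demo_callbacks = {
    .request_fn = demo_request,
    .init_fn = demo_init,
    .attach_fn = demo_attach,
};

/* Simulate one process starting up and running the callbacks. */
static void
demo_process_start(const DemoCallbacks *cb)
{
    DemoShmem = NULL;                   /* fresh process, nothing mapped yet */
    cb->request_fn(NULL);
    DemoShmem = &demo_segment;          /* "framework" resolves the pointer */

    if (!demo_segment_ready)
    {
        cb->init_fn(NULL);              /* first process initializes */
        demo_segment_ready = true;
    }
    else if (cb->attach_fn != NULL)
        cb->attach_fn(NULL);            /* later processes only attach */
}
```

Calling demo_process_start(&demo_callbacks) three times leaves the area initialized once and attached twice, which is the kind of count get_test_shmem_attach_count() exposes in the real module.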
[text/x-patch] v20260405-0005-Introduce-registry-of-built-in-subsystems.patch (7.3K, 5-v20260405-0005-Introduce-registry-of-built-in-subsystems.patch)
From a2f03667d6dee4a12631d122dc96440a0472320e Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:21:02 +0300
Subject: [PATCH v20260405 05/15] Introduce registry of built-in subsystems
To add a new built-in subsystem, add it to subsystemlist.h. That
hooks up its callbacks so that they get called at the right times
during postmaster startup. For now this is unused, but will replace
the current SubsystemShmemSize() and SubsystemShmemInit() calls in
the next commits.
---
src/backend/bootstrap/bootstrap.c | 2 ++
src/backend/postmaster/launch_backend.c | 2 ++
src/backend/postmaster/postmaster.c | 5 +++++
src/backend/storage/ipc/ipci.c | 21 +++++++++++++++++
src/backend/tcop/postgres.c | 3 +++
src/include/storage/ipc.h | 1 +
src/include/storage/subsystemlist.h | 23 +++++++++++++++++++
src/include/storage/subsystems.h | 30 +++++++++++++++++++++++++
src/tools/pginclude/headerscheck | 1 +
9 files changed, 88 insertions(+)
create mode 100644 src/include/storage/subsystemlist.h
create mode 100644 src/include/storage/subsystems.h
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index c707ccfa563..49d88a1b6dd 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -363,6 +363,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
SetProcessingMode(BootstrapProcessing);
IgnoreSystemIndexes = true;
+ RegisterBuiltinShmemCallbacks();
+
InitializeMaxBackends();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 0973010b7dc..ed0f4f2d234 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -664,6 +664,8 @@ SubPostmasterMain(int argc, char *argv[])
*/
LocalProcessControlFile(false);
+ RegisterBuiltinShmemCallbacks();
+
/*
* Reload any libraries that were preloaded by the postmaster. Since we
* exec'd this process, those libraries didn't come along with us; but we
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 693475014fe..b2010bce186 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -922,6 +922,11 @@ PostmasterMain(int argc, char *argv[])
*/
ApplyLauncherRegister();
+ /*
+ * Register the shared memory needs of all core subsystems.
+ */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 24422a80ab3..e4a6a52f12d 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -52,6 +52,7 @@
#include "storage/procsignal.h"
#include "storage/shmem_internal.h"
#include "storage/sinvaladt.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
#include "utils/wait_event.h"
@@ -252,6 +253,26 @@ CreateSharedMemoryAndSemaphores(void)
shmem_startup_hook();
}
+/*
+ * Early initialization of various subsystems, giving them a chance to
+ * register their shared memory needs before the shared memory segment is
+ * allocated.
+ */
+void
+RegisterBuiltinShmemCallbacks(void)
+{
+ /*
+ * Call RegisterShmemCallbacks(...) on each subsystem listed in
+ * subsystemlist.h
+ */
+#define PG_SHMEM_SUBSYSTEM(subsystem_callbacks) \
+ RegisterShmemCallbacks(&(subsystem_callbacks));
+
+#include "storage/subsystemlist.h"
+
+#undef PG_SHMEM_SUBSYSTEM
+}
+
/*
* Initialize various subsystems, setting up their data structures in
* shared memory.
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 93851269e43..6a9ff3ad225 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4138,6 +4138,9 @@ PostgresSingleUserMain(int argc, char *argv[],
/* read control file (error checking and contains config ) */
LocalProcessControlFile(false);
+ /* Register the shared memory needs of all core subsystems. */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index da32787ab51..b205b00e7a1 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
/* ipci.c */
extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
+extern void RegisterBuiltinShmemCallbacks(void);
extern Size CalculateShmemSize(void);
extern void CreateSharedMemoryAndSemaphores(void);
#ifdef EXEC_BACKEND
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
new file mode 100644
index 00000000000..ed43c90bcc3
--- /dev/null
+++ b/src/include/storage/subsystemlist.h
@@ -0,0 +1,23 @@
+/*---------------------------------------------------------------------------
+ * subsystemlist.h
+ *
+ * List of initialization callbacks of built-in subsystems. This is kept in
+ * its own source file for possible use by automatic tools.
+ * PG_SHMEM_SUBSYSTEM is defined in the callers depending on how the list is
+ * used.
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystemlist.h
+ *---------------------------------------------------------------------------
+ */
+
+/* there is deliberately not an #ifndef SUBSYSTEMLIST_H here */
+
+/*
+ * Note: there are some inter-dependencies between these, so the order of some
+ * of these matters.
+ */
+
+/* TODO: empty for now */
diff --git a/src/include/storage/subsystems.h b/src/include/storage/subsystems.h
new file mode 100644
index 00000000000..38b735bec67
--- /dev/null
+++ b/src/include/storage/subsystems.h
@@ -0,0 +1,30 @@
+/*-------------------------------------------------------------------------
+ *
+ * subsystems.h
+ * Provide extern declarations for all the built-in subsystem callbacks
+ *
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystems.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SUBSYSTEMS_H
+#define SUBSYSTEMS_H
+
+#include "storage/shmem.h"
+
+/*
+ * Extern declarations of all the built-in subsystem callbacks
+ *
+ * The actual list is in subsystemlist.h, so that the same list can be used
+ * for other purposes.
+ */
+#define PG_SHMEM_SUBSYSTEM(callbacks) \
+ extern const ShmemCallbacks callbacks;
+#include "storage/subsystemlist.h"
+#undef PG_SHMEM_SUBSYSTEM
+
+#endif /* SUBSYSTEMS_H */
diff --git a/src/tools/pginclude/headerscheck b/src/tools/pginclude/headerscheck
index 14c466cc237..24f7416185e 100755
--- a/src/tools/pginclude/headerscheck
+++ b/src/tools/pginclude/headerscheck
@@ -131,6 +131,7 @@ do
test "$f" = src/include/postmaster/proctypelist.h && continue
test "$f" = src/include/regex/regerrs.h && continue
test "$f" = src/include/storage/lwlocklist.h && continue
+ test "$f" = src/include/storage/subsystemlist.h && continue
test "$f" = src/include/tcop/cmdtaglist.h && continue
test "$f" = src/interfaces/ecpg/preproc/c_kwlist.h && continue
test "$f" = src/interfaces/ecpg/preproc/ecpg_kwlist.h && continue
--
2.34.1
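The list-header trick used by subsystemlist.h in the patch above is the usual "X-macro" pattern: one list, expanded several times with different definitions of the per-entry macro. A minimal single-file sketch follows. It assumes a list macro that takes the per-entry macro as a parameter, in place of the repeatedly included header the patch uses, and every Demo* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ShmemCallbacks; only what the sketch needs. */
typedef struct DemoCallbacks
{
    const char *name;
} DemoCallbacks;

/*
 * The list itself.  The patch keeps it in a header with no include guard
 * and includes that header repeatedly; here a parameterized macro plays
 * the same role so that the sketch fits in one file.
 */
#define DEMO_SUBSYSTEM_LIST(X) \
    X(DemoLockCallbacks) \
    X(DemoBufferCallbacks)

/* Expansion 1: extern declarations for every listed subsystem. */
#define DEMO_DECLARE(callbacks) extern const DemoCallbacks callbacks;
DEMO_SUBSYSTEM_LIST(DEMO_DECLARE)
#undef DEMO_DECLARE

/* The definitions would normally live in each subsystem's source file. */
const DemoCallbacks DemoLockCallbacks = {.name = "locks"};
const DemoCallbacks DemoBufferCallbacks = {.name = "buffers"};

/* A tiny registry standing in for the real registration machinery. */
#define DEMO_MAX_REGISTERED 8
static const DemoCallbacks *demo_registered[DEMO_MAX_REGISTERED];
static int demo_num_registered = 0;

static void
demo_register(const DemoCallbacks *callbacks)
{
    assert(demo_num_registered < DEMO_MAX_REGISTERED);
    demo_registered[demo_num_registered++] = callbacks;
}

/* Expansion 2: one registration call per listed subsystem, in list order. */
void
demo_register_builtin(void)
{
#define DEMO_REGISTER(callbacks) demo_register(&(callbacks));
    DEMO_SUBSYSTEM_LIST(DEMO_REGISTER)
#undef DEMO_REGISTER
}
```

Because both expansions come from the same list, adding a subsystem is a one-line change, and registration order follows list order. That is why the ordering note in the header matters.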
[text/x-patch] v20260405-0004-Convert-pg_stat_statements-to-use-the-new-.patch (11.3K, 6-v20260405-0004-Convert-pg_stat_statements-to-use-the-new-.patch)
From 36ec5df07fc74329a6cfc16252245b88e47b094c Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:21:24 +0300
Subject: [PATCH v20260405 04/15] Convert pg_stat_statements to use the new
interface
As part of this, embed the LWLock it needs in the shared memory struct
itself, so that we don't need to use RequestNamedLWLockTranche()
anymore. LWLockNewTrancheId+LWLockInitialize is more convenient to use
in extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
.../pg_stat_statements/pg_stat_statements.c | 173 ++++++++----------
1 file changed, 77 insertions(+), 96 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 5494d41dca1..f078b4fe71b 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -249,7 +249,7 @@ typedef struct pgssEntry
*/
typedef struct pgssSharedState
{
- LWLock *lock; /* protects hashtable search/modification */
+ LWLockPadded lock; /* protects hashtable search/modification */
double cur_median_usage; /* current median usage in hashtable */
Size mean_query_len; /* current mean entry text length */
slock_t mutex; /* protects following fields only: */
@@ -259,14 +259,24 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+/* Links to shared memory state */
+static pgssSharedState *pgss;
+static HTAB *pgss_hash;
+
+static void pgss_shmem_request(void *arg);
+static void pgss_shmem_init(void *arg);
+
+static const ShmemCallbacks pgss_shmem_callbacks = {
+ .request_fn = pgss_shmem_request,
+ .init_fn = pgss_shmem_init,
+};
+
/*---- Local variables ----*/
/* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
static int nesting_level = 0;
/* Saved hook values */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
static planner_hook_type prev_planner_hook = NULL;
static ExecutorStart_hook_type prev_ExecutorStart = NULL;
@@ -275,10 +285,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static ProcessUtility_hook_type prev_ProcessUtility = NULL;
-/* Links to shared memory state */
-static pgssSharedState *pgss = NULL;
-static HTAB *pgss_hash = NULL;
-
/*---- GUC variables ----*/
typedef enum
@@ -331,8 +337,6 @@ PG_FUNCTION_INFO_V1(pg_stat_statements_1_13);
PG_FUNCTION_INFO_V1(pg_stat_statements);
PG_FUNCTION_INFO_V1(pg_stat_statements_info);
-static void pgss_shmem_request(void);
-static void pgss_shmem_startup(void);
static void pgss_shmem_shutdown(int code, Datum arg);
static void pgss_post_parse_analyze(ParseState *pstate, Query *query,
JumbleState *jstate);
@@ -366,7 +370,6 @@ static void pgss_store(const char *query, int64 queryId,
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
bool showtext);
-static Size pgss_memsize(void);
static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
int encoding, bool sticky);
static void entry_dealloc(void);
@@ -471,13 +474,14 @@ _PG_init(void)
MarkGUCPrefixReserved("pg_stat_statements");
+ /*
+ * Register our shared memory needs.
+ */
+ RegisterShmemCallbacks(&pgss_shmem_callbacks);
+
/*
* Install hooks.
*/
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = pgss_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = pgss_shmem_startup;
prev_post_parse_analyze_hook = post_parse_analyze_hook;
post_parse_analyze_hook = pgss_post_parse_analyze;
prev_planner_hook = planner_hook;
@@ -495,30 +499,42 @@ _PG_init(void)
}
/*
- * shmem_request hook: request additional shared resources. We'll allocate or
- * attach to the shared resources in pgss_shmem_startup().
+ * shmem request callback: Request shared memory resources.
+ *
+ * This is called at postmaster startup. Note that the shared memory isn't
+ * allocated here yet, this merely registers our needs.
+ *
+ * In EXEC_BACKEND mode, this is also called in each backend, to re-attach to
+ * the shared memory area that was already initialized.
*/
static void
-pgss_shmem_request(void)
+pgss_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- RequestAddinShmemSpace(pgss_memsize());
- RequestNamedLWLockTranche("pg_stat_statements", 1);
+ ShmemRequestHash(.name = "pg_stat_statements hash",
+ .nelems = pgss_max,
+ .hash_info.keysize = sizeof(pgssHashKey),
+ .hash_info.entrysize = sizeof(pgssEntry),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ .ptr = &pgss_hash,
+ );
+ ShmemRequestStruct(.name = "pg_stat_statements",
+ .size = sizeof(pgssSharedState),
+ .ptr = (void **) &pgss,
+ );
}
/*
- * shmem_startup hook: allocate or attach to shared memory,
- * then load any pre-existing statistics from file.
- * Also create and load the query-texts file, which is expected to exist
- * (even if empty) while the module is enabled.
+ * shmem init callback: Initialize our shared memory data structures at
+ * postmaster startup.
+ *
+ * Load any pre-existing statistics from file. Also create and load the
+ * query-texts file, which is expected to exist (even if empty) while the
+ * module is enabled.
*/
static void
-pgss_shmem_startup(void)
+pgss_shmem_init(void *arg)
{
- bool found;
- HASHCTL info;
+ int tranche_id;
FILE *file = NULL;
FILE *qfile = NULL;
uint32 header;
@@ -528,59 +544,38 @@ pgss_shmem_startup(void)
int buffer_size;
char *buffer = NULL;
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* reset in case this is a restart within the postmaster */
- pgss = NULL;
- pgss_hash = NULL;
-
/*
- * Create or attach to the shared memory state, including hash table
+ * We already checked that we're loaded from shared_preload_libraries in
+ * _PG_init(), so we should not get here after postmaster startup.
*/
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- pgss = ShmemInitStruct("pg_stat_statements",
- sizeof(pgssSharedState),
- &found);
-
- if (!found)
- {
- /* First time through ... */
- pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
- pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
- SpinLockInit(&pgss->mutex);
- pgss->extent = 0;
- pgss->n_writers = 0;
- pgss->gc_count = 0;
- pgss->stats.dealloc = 0;
- pgss->stats.stats_reset = GetCurrentTimestamp();
- }
-
- info.keysize = sizeof(pgssHashKey);
- info.entrysize = sizeof(pgssEntry);
- pgss_hash = ShmemInitHash("pg_stat_statements hash",
- pgss_max,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- LWLockRelease(AddinShmemInitLock);
+ Assert(!IsUnderPostmaster);
/*
- * If we're in the postmaster (or a standalone backend...), set up a shmem
- * exit hook to dump the statistics to disk.
+ * Initialize the shmem area with no statistics.
*/
- if (!IsUnderPostmaster)
- on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
+ tranche_id = LWLockNewTrancheId("pg_stat_statements");
+ LWLockInitialize(&pgss->lock.lock, tranche_id);
+ pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
+ pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ SpinLockInit(&pgss->mutex);
+ pgss->extent = 0;
+ pgss->n_writers = 0;
+ pgss->gc_count = 0;
+ pgss->stats.dealloc = 0;
+ pgss->stats.stats_reset = GetCurrentTimestamp();
+
+ /* The hash table must've also been initialized by now */
+ Assert(pgss_hash != NULL);
/*
- * Done if some other process already completed our initialization.
+ * Set up a shmem exit hook to dump the statistics to disk on postmaster
+ * (or standalone backend) exit.
*/
- if (found)
- return;
+ on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
/*
+ * Load any pre-existing statistics from file.
+ *
* Note: we don't bother with locks here, because there should be no other
* processes running when this code is reached.
*/
@@ -1339,7 +1334,7 @@ pgss_store(const char *query, int64 queryId,
key.toplevel = (nesting_level == 0);
/* Lookup the hash table entry with shared lock. */
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
@@ -1360,11 +1355,11 @@ pgss_store(const char *query, int64 queryId,
*/
if (jstate)
{
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
norm_query = generate_normalized_query(jstate, query,
query_location,
&query_len);
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
}
/* Append new query text to file with only shared lock held */
@@ -1379,8 +1374,8 @@ pgss_store(const char *query, int64 queryId,
do_gc = need_gc_qtexts();
/* Need exclusive lock to make a new hashtable entry - promote */
- LWLockRelease(pgss->lock);
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockRelease(&pgss->lock.lock);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
/*
* A garbage collection may have occurred while we weren't holding the
@@ -1519,7 +1514,7 @@ pgss_store(const char *query, int64 queryId,
}
done:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
/* We postpone this clean-up until we're out of the lock */
if (norm_query)
@@ -1808,7 +1803,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
* we need to partition the hash table to limit the time spent holding any
* one lock.
*/
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
if (showtext)
{
@@ -2046,7 +2041,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
}
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
if (qbuffer)
pfree(qbuffer);
@@ -2086,20 +2081,6 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
-/*
- * Estimate shared memory space needed.
- */
-static Size
-pgss_memsize(void)
-{
- Size size;
-
- size = MAXALIGN(sizeof(pgssSharedState));
- size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
-
- return size;
-}
-
/*
* Allocate a new hashtable entry.
* caller must hold an exclusive lock on pgss->lock
@@ -2730,7 +2711,7 @@ entry_reset(Oid userid, Oid dbid, int64 queryid, bool minmax_only)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("pg_stat_statements must be loaded via \"shared_preload_libraries\"")));
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
num_entries = hash_get_num_entries(pgss_hash);
stats_reset = GetCurrentTimestamp();
@@ -2824,7 +2805,7 @@ done:
record_gc_qtexts();
release_lock:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
return stats_reset;
}
--
2.34.1
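The mechanical part of the pg_stat_statements patch above, where every pgss->lock becomes &pgss->lock.lock, follows from embedding a padded lock in the shared struct instead of storing a pointer to one. A rough sketch of that layout is below. DemoLock, DemoLockPadded and the 64-byte padding are illustrative assumptions, not the real LWLock definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical padding size; the real constant is platform dependent. */
#define DEMO_LOCK_PADDED_SIZE 64

/* Hypothetical lock with just enough state for the sketch. */
typedef struct DemoLock
{
    int      tranche;
    unsigned state;
} DemoLock;

/* Padding the lock out to a fixed size, using a union. */
typedef union DemoLockPadded
{
    DemoLock lock;
    char     pad[DEMO_LOCK_PADDED_SIZE];
} DemoLockPadded;

/* The lock is now part of the shared struct rather than pointed to. */
typedef struct DemoSharedState
{
    DemoLockPadded lock;        /* embedded, so no separate allocation */
    double         cur_median_usage;
} DemoSharedState;

static void
demo_lock_initialize(DemoLock *lock, int tranche_id)
{
    lock->tranche = tranche_id;
    lock->state = 0;
}

/* Call sites take the address of the inner member: &state->lock.lock */
static DemoLock *
demo_state_lock(DemoSharedState *state)
{
    return &state->lock.lock;
}
```

With this layout the struct's own allocation covers the lock, which is what lets the patch drop the separate named-tranche request.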
[text/x-patch] v20260405-0008-refactor-predicate.c-inline-SerialInit-to-.patch (3.6K, 7-v20260405-0008-refactor-predicate.c-inline-SerialInit-to-.patch)
From ebbbd773993ed6c04d1be6fa7750abac3b118f6d Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 19 Mar 2026 17:21:30 +0200
Subject: [PATCH v20260405 08/15] refactor predicate.c: inline SerialInit to
the caller
The ShmemInit function is very complicated currently. These
refactorings move it in a direction that is more natural with the new
shmem callbacks.
---
src/backend/storage/lmgr/predicate.c | 73 +++++++++++-----------------
1 file changed, 29 insertions(+), 44 deletions(-)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index e003fa5b107..13a6a4b93a6 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -444,7 +444,6 @@ static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
static int serial_errdetail_for_io_error(const void *opaque_data);
-static void SerialInit(void);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
@@ -809,48 +808,6 @@ SerialPagePrecedesLogicallyUnitTests(void)
}
#endif
-/*
- * Initialize for the tracking of old serializable committed xids.
- */
-static void
-SerialInit(void)
-{
- bool found;
-
- /*
- * Set up SLRU management of the pg_serial data.
- */
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
-#ifdef USE_ASSERT_CHECKING
- SerialPagePrecedesLogicallyUnitTests();
-#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
-
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
-
- Assert(found == IsUnderPostmaster);
- if (!found)
- {
- /*
- * Set control information to reflect empty SLRU.
- */
- LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
- serialControl->headPage = -1;
- serialControl->headXid = InvalidTransactionId;
- serialControl->tailXid = InvalidTransactionId;
- LWLockRelease(SerialControlLock);
- }
-}
-
/*
* GUC check_hook for serializable_buffers
*/
@@ -1355,7 +1312,35 @@ PredicateLockShmemInit(void)
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialInit();
+ SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
+ SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
+ SimpleLruInit(SerialSlruCtl, "serializable",
+ serializable_buffers, 0, "pg_serial",
+ LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
+ SYNC_HANDLER_NONE, false);
+#ifdef USE_ASSERT_CHECKING
+ SerialPagePrecedesLogicallyUnitTests();
+#endif
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
+
+ /*
+ * Create or attach to the SerialControl structure.
+ */
+ serialControl = (SerialControl)
+ ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
+
+ Assert(found == IsUnderPostmaster);
+ if (!found)
+ {
+ /*
+ * Set control information to reflect empty SLRU.
+ */
+ LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
+ serialControl->headPage = -1;
+ serialControl->headXid = InvalidTransactionId;
+ serialControl->tailXid = InvalidTransactionId;
+ LWLockRelease(SerialControlLock);
+ }
}
/*
--
2.34.1
[text/x-patch] v20260405-0006-Convert-lwlock.c-to-use-the-new-interface.patch (6.4K, 8-v20260405-0006-Convert-lwlock.c-to-use-the-new-interface.patch)
From ea55eca0d105597b08d9a77f1055bbb088972515 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:18:05 +0300
Subject: [PATCH v20260405 06/15] Convert lwlock.c to use the new interface
It seems like a good candidate to convert first because it needs to
be initialized before any other subsystem, but other than that it's
nothing special.
---
src/backend/storage/ipc/ipci.c | 13 ------
src/backend/storage/lmgr/lwlock.c | 71 +++++++++++++++--------------
src/include/storage/lwlock.h | 2 -
src/include/storage/subsystemlist.h | 9 +++-
4 files changed, 45 insertions(+), 50 deletions(-)
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index e4a6a52f12d..de65a9ef33c 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -121,7 +121,6 @@ CalculateShmemSize(void)
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
- size = add_size(size, LWLockShmemSize());
size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
size = add_size(size, SharedInvalShmemSize());
@@ -179,11 +178,6 @@ AttachSharedMemoryStructs(void)
*/
InitializeFastPathLocks();
- /*
- * Attach to LWLocks first. They are needed by most other subsystems.
- */
- LWLockShmemInit();
-
/* Establish pointers to all shared memory areas in this backend */
ShmemAttachRequested();
CreateOrAttachShmemStructs();
@@ -230,13 +224,6 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
- /*
- * Initialize LWLocks first, in case any of the shmem init function use
- * LWLocks. (Nothing else can be running during startup, so they don't
- * need to do any locking yet, but we nevertheless allow it.)
- */
- LWLockShmemInit();
-
/* Initialize all shmem areas */
ShmemInitRequested();
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 5cb696490d6..30b715ab051 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -84,6 +84,7 @@
#include "storage/proclist.h"
#include "storage/procnumber.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -212,6 +213,15 @@ typedef struct NamedLWLockTrancheRequest
static List *NamedLWLockTrancheRequests = NIL;
+static void LWLockShmemRequest(void *arg);
+static void LWLockShmemInit(void *arg);
+
+const ShmemCallbacks LWLockCallbacks = {
+ .request_fn = LWLockShmemRequest,
+ .init_fn = LWLockShmemInit,
+};
+
+
static void InitializeLWLocks(int numLocks);
static inline void LWLockReportWaitStart(LWLock *lock);
static inline void LWLockReportWaitEnd(void);
@@ -401,58 +411,51 @@ NumLWLocksForNamedTranches(void)
}
/*
- * Compute shmem space needed for user-defined tranches and the main LWLock
- * array.
+ * Request shmem space for user-defined tranches and the main LWLock array.
*/
-Size
-LWLockShmemSize(void)
+static void
+LWLockShmemRequest(void *arg)
{
- Size size;
int numLocks;
+ Size size;
+
+ numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
/* Space for user-defined tranches */
size = sizeof(LWLockTrancheShmemData);
-
- /* Space for the LWLock array */
- numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
size = add_size(size, mul_size(numLocks, sizeof(LWLockPadded)));
+ ShmemRequestStruct(.name = "LWLock tranches",
+ .size = size,
+ .ptr = (void **) &LWLockTranches,
+ );
- return size;
+ /* Space for the LWLock array */
+ ShmemRequestStruct(.name = "Main LWLock array",
+ .size = numLocks * sizeof(LWLockPadded),
+ .ptr = (void **) &MainLWLockArray,
+ );
}
/*
- * Allocate shmem space for user-defined tranches and the main LWLock array,
- * and initialize it.
+ * Initialize shmem space for user-defined tranches and the main LWLock array.
*/
-void
-LWLockShmemInit(void)
+static void
+LWLockShmemInit(void *arg)
{
int numLocks;
- bool found;
- LWLockTranches = (LWLockTrancheShmemData *)
- ShmemInitStruct("LWLock tranches", sizeof(LWLockTrancheShmemData), &found);
- if (!found)
- {
- /* Calculate total number of locks needed in the main array */
- LWLockTranches->num_main_array_locks =
- NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
+ numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
- /* Initialize the dynamic-allocation counter for tranches */
- LWLockTranches->num_user_defined = 0;
+ /* Remember total number of locks needed in the main array */
+ LWLockTranches->num_main_array_locks = numLocks;
- SpinLockInit(&LWLockTranches->lock);
- }
+ /* Initialize the dynamic-allocation counter for tranches */
+ LWLockTranches->num_user_defined = 0;
- /* Allocate and initialize the main array */
- numLocks = LWLockTranches->num_main_array_locks;
- MainLWLockArray = (LWLockPadded *)
- ShmemInitStruct("Main LWLock array", numLocks * sizeof(LWLockPadded), &found);
- if (!found)
- {
- /* Initialize all LWLocks */
- InitializeLWLocks(numLocks);
- }
+ SpinLockInit(&LWLockTranches->lock);
+
+ /* Allocate and initialize all LWLocks in the main array */
+ InitializeLWLocks(numLocks);
}
/*
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index 61f0dbe749a..efa5b427e9f 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -126,8 +126,6 @@ extern bool LWLockHeldByMeInMode(LWLock *lock, LWLockMode mode);
extern bool LWLockWaitForVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 oldval, uint64 *newval);
extern void LWLockUpdateVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val);
-extern Size LWLockShmemSize(void);
-extern void LWLockShmemInit(void);
extern void InitLWLockAccess(void);
extern const char *GetLWLockIdentifier(uint32 classId, uint16 eventId);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index ed43c90bcc3..f0cf01f5a85 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -20,4 +20,11 @@
* of these matter.
*/
-/* TODO: empty for now */
+/*
+ * LWLocks first, in case any of the other shmem init functions use LWLocks.
+ * (Nothing else can be running during startup, so they don't need to do any
+ * locking yet, but we nevertheless allow it.)
+ */
+PG_SHMEM_SUBSYSTEM(LWLockCallbacks)
+
+/* TODO: nothing else for now */
--
2.34.1
[text/x-patch] v20260405-0009-refactor-predicate.c-Move-all-the-initiali.patch (8.3K, 9-v20260405-0009-refactor-predicate.c-Move-all-the-initiali.patch)
From 51ffe05a437c83f0a019372eaa85f9d863996ec3 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 20 Mar 2026 20:27:50 +0200
Subject: [PATCH v20260405 09/15] refactor predicate.c: Move all the
initialization together
The ShmemInit function is very complicated currently. These
refactorings move it in a direction that is more natural with the new
shmem callbacks.
---
src/backend/storage/lmgr/predicate.c | 164 +++++++++++++--------------
1 file changed, 79 insertions(+), 85 deletions(-)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 13a6a4b93a6..af03071a71f 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1144,19 +1144,6 @@ PredicateLockShmemInit(void)
HASH_ELEM | HASH_BLOBS |
HASH_PARTITION | HASH_FIXED_SIZE);
- /*
- * Reserve a dummy entry in the hash table; we use it to make sure there's
- * always one entry available when we need to split or combine a page,
- * because running out of space there could mean aborting a
- * non-serializable transaction.
- */
- if (!IsUnderPostmaster)
- {
- (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
- HASH_ENTER, &found);
- Assert(!found);
- }
-
/* Pre-calculate the hash and partition lock of the scratch entry */
ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
@@ -1200,49 +1187,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, both the header and the element */
- memset(PredXact, 0, requestSize);
-
- dlist_init(&PredXact->availableList);
- dlist_init(&PredXact->activeList);
- PredXact->SxactGlobalXmin = InvalidTransactionId;
- PredXact->SxactGlobalXminCount = 0;
- PredXact->WritableSxactCount = 0;
- PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
- PredXact->CanPartialClearThrough = 0;
- PredXact->HavePartialClearedThrough = 0;
- PredXact->element
- = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_serializable_xacts; i++)
- {
- LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
- LWTRANCHE_PER_XACT_PREDICATE_LIST);
- dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
- }
- PredXact->OldCommittedSxact = CreatePredXact();
- SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
- PredXact->OldCommittedSxact->prepareSeqNo = 0;
- PredXact->OldCommittedSxact->commitSeqNo = 0;
- PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
- dlist_init(&PredXact->OldCommittedSxact->outConflicts);
- dlist_init(&PredXact->OldCommittedSxact->inConflicts);
- dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
- dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
- dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
- PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
- PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
- PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
- PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
- PredXact->OldCommittedSxact->pid = 0;
- PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
- }
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
/*
* Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
@@ -1278,23 +1222,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, including the elements */
- memset(RWConflictPool, 0, requestSize);
-
- dlist_init(&RWConflictPool->availableList);
- RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
- RWConflictPoolHeaderDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_rw_conflicts; i++)
- {
- dlist_push_tail(&RWConflictPool->availableList,
- &RWConflictPool->element[i].outLink);
- }
- }
/*
* Create or attach to the header for the list of finished serializable
@@ -1305,8 +1232,6 @@ PredicateLockShmemInit(void)
sizeof(dlist_head),
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- dlist_init(FinishedSerializableTransactions);
/*
* Initialize the SLRU storage for old committed serializable
@@ -1328,19 +1253,88 @@ PredicateLockShmemInit(void)
*/
serialControl = (SerialControl)
ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
-
Assert(found == IsUnderPostmaster);
- if (!found)
+
+ /*
+ * If we just attached to existing shared memory (EXEC_BACKEND), we're all
+ * done. Otherwise, during postmaster startup proceed to initialize the
+ * shared memory.
+ */
+ if (IsUnderPostmaster)
{
- /*
- * Set control information to reflect empty SLRU.
- */
- LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
- serialControl->headPage = -1;
- serialControl->headXid = InvalidTransactionId;
- serialControl->tailXid = InvalidTransactionId;
- LWLockRelease(SerialControlLock);
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+ return;
+ }
+
+ /*
+ * Reserve a dummy entry in the hash table; we use it to make sure there's
+ * always one entry available when we need to split or combine a page,
+ * because running out of space there could mean aborting a
+ * non-serializable transaction.
+ */
+ (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
+ HASH_ENTER, &found);
+ Assert(!found);
+
+ /* Initialize PredXact list */
+ dlist_init(&PredXact->availableList);
+ dlist_init(&PredXact->activeList);
+ PredXact->SxactGlobalXmin = InvalidTransactionId;
+ PredXact->SxactGlobalXminCount = 0;
+ PredXact->WritableSxactCount = 0;
+ PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
+ PredXact->CanPartialClearThrough = 0;
+ PredXact->HavePartialClearedThrough = 0;
+ PredXact->element
+ = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_serializable_xacts; i++)
+ {
+ LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
+ LWTRANCHE_PER_XACT_PREDICATE_LIST);
+ dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
}
+ PredXact->OldCommittedSxact = CreatePredXact();
+ SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
+ PredXact->OldCommittedSxact->prepareSeqNo = 0;
+ PredXact->OldCommittedSxact->commitSeqNo = 0;
+ PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
+ dlist_init(&PredXact->OldCommittedSxact->outConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->inConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
+ dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
+ dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
+ PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
+ PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
+ PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
+ PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
+ PredXact->OldCommittedSxact->pid = 0;
+ PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
+
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+
+ /* Initialize the rw-conflict pool */
+ dlist_init(&RWConflictPool->availableList);
+ RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
+ RWConflictPoolHeaderDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_rw_conflicts; i++)
+ {
+ dlist_push_tail(&RWConflictPool->availableList,
+ &RWConflictPool->element[i].outLink);
+ }
+
+ /* Initialize the list of finished serializable transactions */
+ dlist_init(FinishedSerializableTransactions);
+
+ /* Initialize SerialControl to reflect empty SLRU. */
+ LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
+ serialControl->headPage = -1;
+ serialControl->headXid = InvalidTransactionId;
+ serialControl->tailXid = InvalidTransactionId;
+ LWLockRelease(SerialControlLock);
}
/*
--
2.34.1
[text/x-patch] v20260405-0007-Use-the-new-mechanism-in-a-few-core-subsys.patch (46.4K, 10-v20260405-0007-Use-the-new-mechanism-in-a-few-core-subsys.patch)
download | inline diff:
From 335d0842ca9f456649f0464a45ebd185f3c8ca4a Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:21:17 +0300
Subject: [PATCH v20260405 07/15] Use the new mechanism in a few core
subsystems
I chose these subsystems specifically because they have some
complicating properties, making them slightly harder to convert than
most:
- The initialization callbacks of some of these subsystems have
dependencies, i.e. they need to be initialized in the right order.
- The ProcGlobal pointer still needs to be inherited by the
BackendParameters mechanism on EXEC_BACKEND builds, because
ProcGlobal is required by InitProcess() to get a PGPROC entry, and
the PGPROC entry is required to use LWLocks, and usually attaching
to shared memory areas requires the use of LWLocks.
- Similarly, the ProcSignal pointer still needs to be handled by
BackendParameters, because query cancellation connections access it
without calling InitProcess.
I believe converting the rest of the subsystems after this will be
pretty mechanical.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/backend/access/transam/twophase.c | 2 +-
src/backend/access/transam/varsup.c | 35 ++---
src/backend/port/posix_sema.c | 22 ++-
src/backend/port/sysv_sema.c | 21 ++-
src/backend/port/win32_sema.c | 11 +-
src/backend/storage/ipc/dsm.c | 64 +++++----
src/backend/storage/ipc/dsm_registry.c | 36 ++---
src/backend/storage/ipc/ipci.c | 28 ----
src/backend/storage/ipc/latch.c | 8 +-
src/backend/storage/ipc/pmsignal.c | 51 ++++---
src/backend/storage/ipc/procarray.c | 110 +++++++-------
src/backend/storage/ipc/procsignal.c | 64 ++++-----
src/backend/storage/ipc/sinvaladt.c | 38 ++---
src/backend/storage/lmgr/proc.c | 191 +++++++++++++------------
src/backend/utils/hash/dynahash.c | 3 +-
src/include/access/transam.h | 2 -
src/include/storage/dsm.h | 3 -
src/include/storage/dsm_registry.h | 2 -
src/include/storage/pg_sema.h | 6 +-
src/include/storage/pmsignal.h | 2 -
src/include/storage/proc.h | 2 -
src/include/storage/procarray.h | 2 -
src/include/storage/procsignal.h | 3 -
src/include/storage/sinvaladt.h | 2 -
src/include/storage/subsystemlist.h | 17 ++-
25 files changed, 344 insertions(+), 381 deletions(-)
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index d468c9774b3..ab1cbd67bac 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -282,7 +282,7 @@ TwoPhaseShmemInit(void)
gxacts[i].next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = &gxacts[i];
- /* associate it with a PGPROC assigned by InitProcGlobal */
+ /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
}
}
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 1441a051773..dc5e32d86f3 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -23,6 +23,7 @@
#include "postmaster/autovacuum.h"
#include "storage/pmsignal.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
@@ -30,35 +31,25 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+static void VarsupShmemRequest(void *arg);
+
/* pointer to variables struct in shared memory */
TransamVariablesData *TransamVariables = NULL;
+const ShmemCallbacks VarsupShmemCallbacks = {
+ .request_fn = VarsupShmemRequest,
+};
/*
- * Initialization of shared memory for TransamVariables.
+ * Request shared memory for TransamVariables.
*/
-Size
-VarsupShmemSize(void)
-{
- return sizeof(TransamVariablesData);
-}
-
-void
-VarsupShmemInit(void)
+static void
+VarsupShmemRequest(void *arg)
{
- bool found;
-
- /* Initialize our shared state struct */
- TransamVariables = ShmemInitStruct("TransamVariables",
- sizeof(TransamVariablesData),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- memset(TransamVariables, 0, sizeof(TransamVariablesData));
- }
- else
- Assert(found);
+ ShmemRequestStruct(.name = "TransamVariables",
+ .size = sizeof(TransamVariablesData),
+ .ptr = (void **) &TransamVariables,
+ );
}
/*
diff --git a/src/backend/port/posix_sema.c b/src/backend/port/posix_sema.c
index 40205b7d400..53e4a7a5c38 100644
--- a/src/backend/port/posix_sema.c
+++ b/src/backend/port/posix_sema.c
@@ -159,22 +159,24 @@ PosixSemaphoreKill(sem_t *sem)
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
#ifdef USE_NAMED_POSIX_SEMAPHORES
/* No shared memory needed in this case */
- return 0;
#else
/* Need a PGSemaphoreData per semaphore */
- return mul_size(maxSemas, sizeof(PGSemaphoreData));
+ ShmemRequestStruct(.name = "Semaphores",
+ .size = mul_size(maxSemas, sizeof(PGSemaphoreData)),
+ .ptr = (void **) &sharedSemas,
+ );
#endif
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* This is called during postmaster start or shared memory reinitialization.
* It should do whatever is needed to be able to support up to maxSemas
@@ -193,10 +195,9 @@ PGSemaphoreShmemSize(int maxSemas)
* we don't have to expose the counters to other processes.)
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
struct stat statbuf;
- bool found;
/*
* We use the data directory's inode number to seed the search for free
@@ -214,11 +215,6 @@ PGReserveSemaphores(int maxSemas)
mySemPointers = (sem_t **) malloc(maxSemas * sizeof(sem_t *));
if (mySemPointers == NULL)
elog(PANIC, "out of memory");
-#else
-
- sharedSemas = (PGSemaphore)
- ShmemInitStruct("Semaphores", PGSemaphoreShmemSize(maxSemas), &found);
- Assert(!found);
#endif
numSems = 0;
diff --git a/src/backend/port/sysv_sema.c b/src/backend/port/sysv_sema.c
index 4b2bf84072f..98d99515043 100644
--- a/src/backend/port/sysv_sema.c
+++ b/src/backend/port/sysv_sema.c
@@ -301,16 +301,20 @@ IpcSemaphoreCreate(int numSems)
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
- return mul_size(maxSemas, sizeof(PGSemaphoreData));
+ /* Need a PGSemaphoreData per semaphore */
+ ShmemRequestStruct(.name = "Semaphores",
+ .size = mul_size(maxSemas, sizeof(PGSemaphoreData)),
+ .ptr = (void **) &sharedSemas,
+ );
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* This is called during postmaster start or shared memory reinitialization.
* It should do whatever is needed to be able to support up to maxSemas
@@ -327,10 +331,9 @@ PGSemaphoreShmemSize(int maxSemas)
* have clobbered.)
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
struct stat statbuf;
- bool found;
/*
* We use the data directory's inode number to seed the search for free
@@ -344,10 +347,6 @@ PGReserveSemaphores(int maxSemas)
errmsg("could not stat data directory \"%s\": %m",
DataDir)));
- sharedSemas = (PGSemaphore)
- ShmemInitStruct("Semaphores", PGSemaphoreShmemSize(maxSemas), &found);
- Assert(!found);
-
numSharedSemas = 0;
maxSharedSemas = maxSemas;
diff --git a/src/backend/port/win32_sema.c b/src/backend/port/win32_sema.c
index ba97c9b2d64..a3202554769 100644
--- a/src/backend/port/win32_sema.c
+++ b/src/backend/port/win32_sema.c
@@ -25,17 +25,16 @@ static void ReleaseSemaphores(int code, Datum arg);
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
/* No shared memory needed on Windows */
- return 0;
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* In the Win32 implementation, we acquire semaphores on-demand; the
* maxSemas parameter is just used to size the array that keeps track of
@@ -44,7 +43,7 @@ PGSemaphoreShmemSize(int maxSemas)
* process exits.
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
mySemSet = (HANDLE *) malloc(maxSemas * sizeof(HANDLE));
if (mySemSet == NULL)
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 6a5b16392f7..8b69df4ff26 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -43,6 +43,7 @@
#include "storage/lwlock.h"
#include "storage/pg_shmem.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/freepage.h"
#include "utils/memutils.h"
#include "utils/resowner.h"
@@ -109,6 +110,15 @@ static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
static void *dsm_main_space_begin = NULL;
+static size_t dsm_main_space_size;
+
+static void dsm_main_space_request(void *arg);
+static void dsm_main_space_init(void *arg);
+
+const ShmemCallbacks dsm_shmem_callbacks = {
+ .request_fn = dsm_main_space_request,
+ .init_fn = dsm_main_space_init,
+};
/*
* List of dynamic shared memory segments used by this backend.
@@ -464,42 +474,40 @@ dsm_set_control_handle(dsm_handle h)
#endif
/*
- * Reserve some space in the main shared memory segment for DSM segments.
+ * Reserve space in the main shared memory segment for DSM segments.
*/
-size_t
-dsm_estimate_size(void)
+static void
+dsm_main_space_request(void *arg)
{
- return 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+ dsm_main_space_size = 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+
+ if (dsm_main_space_size == 0)
+ return;
+
+ ShmemRequestStruct(.name = "Preallocated DSM",
+ .size = dsm_main_space_size,
+ .ptr = &dsm_main_space_begin,
+ );
}
-/*
- * Initialize space in the main shared memory segment for DSM segments.
- */
-void
-dsm_shmem_init(void)
+static void
+dsm_main_space_init(void *arg)
{
- size_t size = dsm_estimate_size();
- bool found;
+ FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
+ size_t first_page = 0;
+ size_t pages;
- if (size == 0)
+ if (dsm_main_space_size == 0)
return;
- dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, &found);
- if (!found)
- {
- FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
- size_t first_page = 0;
- size_t pages;
-
- /* Reserve space for the FreePageManager. */
- while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
- ++first_page;
-
- /* Initialize it and give it all the rest of the space. */
- FreePageManagerInitialize(fpm, dsm_main_space_begin);
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ /* Reserve space for the FreePageManager. */
+ while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
+ ++first_page;
+
+ /* Initialize it and give it all the rest of the space. */
+ FreePageManagerInitialize(fpm, dsm_main_space_begin);
+ pages = (dsm_main_space_size / FPM_PAGE_SIZE) - first_page;
+ FreePageManagerPut(fpm, first_page, pages);
}
/*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 9bfcd616827..2b56977659b 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -45,6 +45,7 @@
#include "storage/dsm_registry.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/tuplestore.h"
@@ -57,6 +58,14 @@ typedef struct DSMRegistryCtxStruct
static DSMRegistryCtxStruct *DSMRegistryCtx;
+static void DSMRegistryShmemRequest(void *arg);
+static void DSMRegistryShmemInit(void *arg);
+
+const ShmemCallbacks DSMRegistryShmemCallbacks = {
+ .request_fn = DSMRegistryShmemRequest,
+ .init_fn = DSMRegistryShmemInit,
+};
+
typedef struct NamedDSMState
{
dsm_handle handle;
@@ -114,27 +123,20 @@ static const dshash_parameters dsh_params = {
static dsa_area *dsm_registry_dsa;
static dshash_table *dsm_registry_table;
-Size
-DSMRegistryShmemSize(void)
+static void
+DSMRegistryShmemRequest(void *arg)
{
- return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+ ShmemRequestStruct(.name = "DSM Registry Data",
+ .size = sizeof(DSMRegistryCtxStruct),
+ .ptr = (void **) &DSMRegistryCtx,
+ );
}
-void
-DSMRegistryShmemInit(void)
+static void
+DSMRegistryShmemInit(void *arg)
{
- bool found;
-
- DSMRegistryCtx = (DSMRegistryCtxStruct *)
- ShmemInitStruct("DSM Registry Data",
- DSMRegistryShmemSize(),
- &found);
-
- if (!found)
- {
- DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
- DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
- }
+ DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+ DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index de65a9ef33c..4f707158303 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -20,7 +20,6 @@
#include "access/nbtree.h"
#include "access/subtrans.h"
#include "access/syncscan.h"
-#include "access/transam.h"
#include "access/twophase.h"
#include "access/xlogprefetcher.h"
#include "access/xlogrecovery.h"
@@ -42,16 +41,11 @@
#include "storage/aio_subsys.h"
#include "storage/bufmgr.h"
#include "storage/dsm.h"
-#include "storage/dsm_registry.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
-#include "storage/pmsignal.h"
#include "storage/predicate.h"
#include "storage/proc.h"
-#include "storage/procarray.h"
-#include "storage/procsignal.h"
#include "storage/shmem_internal.h"
-#include "storage/sinvaladt.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
@@ -105,14 +99,10 @@ CalculateShmemSize(void)
size = add_size(size, ShmemGetRequestedSize());
/* legacy subsystems */
- size = add_size(size, dsm_estimate_size());
- size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
- size = add_size(size, ProcGlobalShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, VarsupShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
size = add_size(size, CLOGShmemSize());
@@ -121,11 +111,7 @@ CalculateShmemSize(void)
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
- size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, SharedInvalShmemSize());
- size = add_size(size, PMSignalShmemSize());
- size = add_size(size, ProcSignalShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
size = add_size(size, ReplicationSlotsShmemSize());
@@ -278,13 +264,9 @@ RegisterBuiltinShmemCallbacks(void)
static void
CreateOrAttachShmemStructs(void)
{
- dsm_shmem_init();
- DSMRegistryShmemInit();
-
/*
* Set up xlog, clog, and buffers
*/
- VarsupShmemInit();
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
@@ -307,23 +289,13 @@ CreateOrAttachShmemStructs(void)
/*
* Set up process table
*/
- if (!IsUnderPostmaster)
- InitProcGlobal();
- ProcArrayShmemInit();
BackendStatusShmemInit();
TwoPhaseShmemInit();
BackgroundWorkerShmemInit();
- /*
- * Set up shared-inval messaging
- */
- SharedInvalShmemInit();
-
/*
* Set up interprocess signaling mechanisms
*/
- PMSignalShmemInit();
- ProcSignalShmemInit();
CheckpointerShmemInit();
AutoVacuumShmemInit();
ReplicationSlotsShmemInit();
diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 8537e9fef2d..7d4f4cf32bb 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -80,10 +80,10 @@ InitLatch(Latch *latch)
* current process.
*
* InitSharedLatch needs to be called in postmaster before forking child
- * processes, usually right after allocating the shared memory block
- * containing the latch with ShmemInitStruct. (The Unix implementation
- * doesn't actually require that, but the Windows one does.) Because of
- * this restriction, we have no concurrency issues to worry about here.
+ * processes, usually right after initializing the shared memory block
+ * containing the latch. (The Unix implementation doesn't actually require
+ * that, but the Windows one does.) Because of this restriction, we have no
+ * concurrency issues to worry about here.
*
* Note that other handles created in this module are never marked as
* inheritable. Thus we do not need to worry about cleaning up child
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 4618820b337..bdad5fdd043 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -27,6 +27,7 @@
#include "storage/ipc.h"
#include "storage/pmsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
@@ -83,6 +84,14 @@ struct PMSignalData
/* PMSignalState pointer is valid in both postmaster and child processes */
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+static void PMSignalShmemRequest(void *);
+static void PMSignalShmemInit(void *);
+
+const ShmemCallbacks PMSignalShmemCallbacks = {
+ .request_fn = PMSignalShmemRequest,
+ .init_fn = PMSignalShmemInit,
+};
+
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
* postmaster. Postmaster keeps a local copy so that it doesn't need to
@@ -123,39 +132,29 @@ postmaster_death_handler(SIGNAL_ARGS)
static void MarkPostmasterChildInactive(int code, Datum arg);
/*
- * PMSignalShmemSize
- * Compute space needed for pmsignal.c's shared memory
+ * PMSignalShmemRequest - Register pmsignal.c's shared memory needs
*/
-Size
-PMSignalShmemSize(void)
+static void
+PMSignalShmemRequest(void *arg)
{
- Size size;
+ size_t size;
- size = offsetof(PMSignalData, PMChildFlags);
- size = add_size(size, mul_size(MaxLivePostmasterChildren(),
- sizeof(sig_atomic_t)));
+ num_child_flags = MaxLivePostmasterChildren();
- return size;
+ size = add_size(offsetof(PMSignalData, PMChildFlags),
+ mul_size(num_child_flags, sizeof(sig_atomic_t)));
+ ShmemRequestStruct(.name = "PMSignalState",
+ .size = size,
+ .ptr = (void **) &PMSignalState,
+ );
}
-/*
- * PMSignalShmemInit - initialize during shared-memory creation
- */
-void
-PMSignalShmemInit(void)
+static void
+PMSignalShmemInit(void *arg)
{
- bool found;
-
- PMSignalState = (PMSignalData *)
- ShmemInitStruct("PMSignalState", PMSignalShmemSize(), &found);
-
- if (!found)
- {
- /* initialize all flags to zeroes */
- MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemSize());
- num_child_flags = MaxLivePostmasterChildren();
- PMSignalState->num_child_flags = num_child_flags;
- }
+ Assert(PMSignalState);
+ Assert(num_child_flags > 0);
+ PMSignalState->num_child_flags = num_child_flags;
}
/*
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index cc207cb56e3..f540bb6b23f 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -61,6 +61,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
@@ -103,6 +104,18 @@ typedef struct ProcArrayStruct
int pgprocnos[FLEXIBLE_ARRAY_MEMBER];
} ProcArrayStruct;
+static void ProcArrayShmemRequest(void *arg);
+static void ProcArrayShmemInit(void *arg);
+static void ProcArrayShmemAttach(void *arg);
+
+static ProcArrayStruct *procArray;
+
+const struct ShmemCallbacks ProcArrayShmemCallbacks = {
+ .request_fn = ProcArrayShmemRequest,
+ .init_fn = ProcArrayShmemInit,
+ .attach_fn = ProcArrayShmemAttach,
+};
+
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -269,9 +282,6 @@ typedef enum KAXCompressReason
KAX_STARTUP_PROCESS_IDLE, /* startup process is about to sleep */
} KAXCompressReason;
-
-static ProcArrayStruct *procArray;
-
static PGPROC *allProcs;
/*
@@ -282,8 +292,11 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
+
static TransactionId *KnownAssignedXids;
+
static bool *KnownAssignedXidsValid;
+
static TransactionId latestObservedXid = InvalidTransactionId;
/*
@@ -374,19 +387,13 @@ static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
static void GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons);
/*
- * Report shared-memory space needed by ProcArrayShmemInit
+ * Register the shared PGPROC array during postmaster startup.
*/
-Size
-ProcArrayShmemSize(void)
+static void
+ProcArrayShmemRequest(void *arg)
{
- Size size;
-
- /* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, pgprocnos);
- size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
-
/*
* During Hot Standby processing we have a data structure called
* KnownAssignedXids, created in shared memory. Local data structures are
@@ -405,64 +412,49 @@ ProcArrayShmemSize(void)
if (EnableHotStandby)
{
- size = add_size(size,
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS));
- size = add_size(size,
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS));
+ ShmemRequestStruct(.name = "KnownAssignedXids",
+ .size = mul_size(sizeof(TransactionId), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXids,
+ );
+
+ ShmemRequestStruct(.name = "KnownAssignedXidsValid",
+ .size = mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXidsValid,
+ );
}
- return size;
+ /* Register the ProcArray shared structure */
+ ShmemRequestStruct(.name = "Proc Array",
+ .size = add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int), PROCARRAY_MAXPROCS)),
+ .ptr = (void **) &procArray,
+ );
}
/*
* Initialize the shared PGPROC array during postmaster startup.
*/
-void
-ProcArrayShmemInit(void)
+static void
+ProcArrayShmemInit(void *arg)
{
- bool found;
-
- /* Create or attach to the ProcArray shared structure */
- procArray = (ProcArrayStruct *)
- ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, pgprocnos),
- mul_size(sizeof(int),
- PROCARRAY_MAXPROCS)),
- &found);
-
- if (!found)
- {
- /*
- * We're the first - initialize.
- */
- procArray->numProcs = 0;
- procArray->maxProcs = PROCARRAY_MAXPROCS;
- procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
- procArray->numKnownAssignedXids = 0;
- procArray->tailKnownAssignedXids = 0;
- procArray->headKnownAssignedXids = 0;
- procArray->lastOverflowedXid = InvalidTransactionId;
- procArray->replication_slot_xmin = InvalidTransactionId;
- procArray->replication_slot_catalog_xmin = InvalidTransactionId;
- TransamVariables->xactCompletionCount = 1;
- }
+ procArray->numProcs = 0;
+ procArray->maxProcs = PROCARRAY_MAXPROCS;
+ procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
+ procArray->numKnownAssignedXids = 0;
+ procArray->tailKnownAssignedXids = 0;
+ procArray->headKnownAssignedXids = 0;
+ procArray->lastOverflowedXid = InvalidTransactionId;
+ procArray->replication_slot_xmin = InvalidTransactionId;
+ procArray->replication_slot_catalog_xmin = InvalidTransactionId;
+ TransamVariables->xactCompletionCount = 1;
allProcs = ProcGlobal->allProcs;
+}
- /* Create or attach to the KnownAssignedXids arrays too, if needed */
- if (EnableHotStandby)
- {
- KnownAssignedXids = (TransactionId *)
- ShmemInitStruct("KnownAssignedXids",
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- KnownAssignedXidsValid = (bool *)
- ShmemInitStruct("KnownAssignedXidsValid",
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- }
+static void
+ProcArrayShmemAttach(void *arg)
+{
+ allProcs = ProcGlobal->allProcs;
}
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index f1ab3aa3fe0..adebf0e7898 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -33,6 +33,7 @@
#include "storage/shmem.h"
#include "storage/sinval.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -106,7 +107,16 @@ struct ProcSignalHeader
#define BARRIER_CLEAR_BIT(flags, type) \
((flags) &= ~(((uint32) 1) << (uint32) (type)))
+static void ProcSignalShmemRequest(void *arg);
+static void ProcSignalShmemInit(void *arg);
+
+const ShmemCallbacks ProcSignalShmemCallbacks = {
+ .request_fn = ProcSignalShmemRequest,
+ .init_fn = ProcSignalShmemInit,
+};
+
NON_EXEC_STATIC ProcSignalHeader *ProcSignal = NULL;
+
static ProcSignalSlot *MyProcSignalSlot = NULL;
static bool CheckProcSignal(ProcSignalReason reason);
@@ -114,51 +124,39 @@ static void CleanupProcSignalState(int status, Datum arg);
static void ResetProcSignalBarrierBits(uint32 flags);
/*
- * ProcSignalShmemSize
- * Compute space needed for ProcSignal's shared memory
+ * ProcSignalShmemRequest
+ * Register ProcSignal's shared memory needs at postmaster startup
*/
-Size
-ProcSignalShmemSize(void)
+static void
+ProcSignalShmemRequest(void *arg)
{
Size size;
size = mul_size(NumProcSignalSlots, sizeof(ProcSignalSlot));
size = add_size(size, offsetof(ProcSignalHeader, psh_slot));
- return size;
+
+ ShmemRequestStruct(.name = "ProcSignal",
+ .size = size,
+ .ptr = (void **) &ProcSignal,
+ );
}
-/*
- * ProcSignalShmemInit
- * Allocate and initialize ProcSignal's shared memory
- */
-void
-ProcSignalShmemInit(void)
+static void
+ProcSignalShmemInit(void *arg)
{
- Size size = ProcSignalShmemSize();
- bool found;
+ pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
- ProcSignal = (ProcSignalHeader *)
- ShmemInitStruct("ProcSignal", size, &found);
-
- /* If we're first, initialize. */
- if (!found)
+ for (int i = 0; i < NumProcSignalSlots; ++i)
{
- int i;
-
- pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
+ ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
- for (i = 0; i < NumProcSignalSlots; ++i)
- {
- ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
-
- SpinLockInit(&slot->pss_mutex);
- pg_atomic_init_u32(&slot->pss_pid, 0);
- slot->pss_cancel_key_len = 0;
- MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
- pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
- pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
- ConditionVariableInit(&slot->pss_barrierCV);
- }
+ SpinLockInit(&slot->pss_mutex);
+ pg_atomic_init_u32(&slot->pss_pid, 0);
+ slot->pss_cancel_key_len = 0;
+ MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
+ pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
+ pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
+ ConditionVariableInit(&slot->pss_barrierCV);
}
}
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index a7a7cc4f0a9..37a21ffaf1a 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -25,6 +25,7 @@
#include "storage/shmem.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
/*
* Conceptually, the shared cache invalidation messages are stored in an
@@ -205,6 +206,14 @@ typedef struct SISeg
static SISeg *shmInvalBuffer; /* pointer to the shared inval buffer */
+static void SharedInvalShmemRequest(void *arg);
+static void SharedInvalShmemInit(void *arg);
+
+const ShmemCallbacks SharedInvalShmemCallbacks = {
+ .request_fn = SharedInvalShmemRequest,
+ .init_fn = SharedInvalShmemInit,
+};
+
static LocalTransactionId nextLocalTransactionId;
@@ -212,10 +221,11 @@ static void CleanupInvalidationState(int status, Datum arg);
/*
- * SharedInvalShmemSize --- return shared-memory space needed
+ * SharedInvalShmemRequest
+ * Register shared memory needs for the SI message buffer
*/
-Size
-SharedInvalShmemSize(void)
+static void
+SharedInvalShmemRequest(void *arg)
{
Size size;
@@ -223,26 +233,18 @@ SharedInvalShmemSize(void)
size = add_size(size, mul_size(sizeof(ProcState), NumProcStateSlots)); /* procState */
size = add_size(size, mul_size(sizeof(int), NumProcStateSlots)); /* pgprocnos */
- return size;
+ ShmemRequestStruct(.name = "shmInvalBuffer",
+ .size = size,
+ .ptr = (void **) &shmInvalBuffer,
+ );
}
-/*
- * SharedInvalShmemInit
- * Create and initialize the SI message buffer
- */
-void
-SharedInvalShmemInit(void)
+static void
+SharedInvalShmemInit(void *arg)
{
int i;
- bool found;
-
- /* Allocate space in shared memory */
- shmInvalBuffer = (SISeg *)
- ShmemInitStruct("shmInvalBuffer", SharedInvalShmemSize(), &found);
- if (found)
- return;
- /* Clear message counters, save size of procState array, init spinlock */
+ /* Clear message counters, init spinlock */
shmInvalBuffer->minMsgNum = 0;
shmInvalBuffer->maxMsgNum = 0;
shmInvalBuffer->nextThreshold = CLEANUP_MIN;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 9b880a6af65..a05c55b534e 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -52,6 +52,7 @@
#include "storage/procsignal.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
@@ -70,9 +71,23 @@ PGPROC *MyProc = NULL;
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
+static void *AllProcsShmemPtr;
+static void *FastPathLockArrayShmemPtr;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
PGPROC *PreparedXactProcs = NULL;
+static void ProcGlobalShmemRequest(void *arg);
+static void ProcGlobalShmemInit(void *arg);
+
+const ShmemCallbacks ProcGlobalShmemCallbacks = {
+ .request_fn = ProcGlobalShmemRequest,
+ .init_fn = ProcGlobalShmemInit,
+};
+
+static uint32 TotalProcs;
+static size_t ProcGlobalAllProcsShmemSize;
+static size_t FastPathLockArrayShmemSize;
+
/* Is a deadlock check pending? */
static volatile sig_atomic_t got_deadlock_timeout;
@@ -83,32 +98,12 @@ static DeadLockState CheckDeadLock(void);
/*
- * Report shared-memory space needed by PGPROC.
+ * Calculate shared-memory space needed by Fast-Path locks.
*/
static Size
-PGProcShmemSize(void)
+CalculateFastPathLockShmemSize(void)
{
Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
-
- size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
-
- return size;
-}
-
-/*
- * Report shared-memory space needed by Fast-Path locks.
- */
-static Size
-FastPathLockShmemSize(void)
-{
- Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
Size fpLockBitsSize,
fpRelIdSize;
@@ -128,26 +123,7 @@ FastPathLockShmemSize(void)
}
/*
- * Report shared-memory space needed by InitProcGlobal.
- */
-Size
-ProcGlobalShmemSize(void)
-{
- Size size = 0;
-
- /* ProcGlobal */
- size = add_size(size, sizeof(PROC_HDR));
- size = add_size(size, sizeof(slock_t));
-
- size = add_size(size, PGSemaphoreShmemSize(ProcGlobalSemas()));
- size = add_size(size, PGProcShmemSize());
- size = add_size(size, FastPathLockShmemSize());
-
- return size;
-}
-
-/*
- * Report number of semaphores needed by InitProcGlobal.
+ * Report number of semaphores needed by ProcGlobalShmemInit.
*/
int
ProcGlobalSemas(void)
@@ -160,7 +136,67 @@ ProcGlobalSemas(void)
}
/*
- * InitProcGlobal -
+ * ProcGlobalShmemRequest
+ * Register shared memory needs.
+ *
+ * This is called during postmaster or standalone backend startup, and also
+ * during backend startup in EXEC_BACKEND mode.
+ */
+static void
+ProcGlobalShmemRequest(void *arg)
+{
+ Size size;
+
+ /*
+ * Reserve all the PGPROC structures we'll need. There are six separate
+ * consumers: (1) normal backends, (2) autovacuum workers and special
+ * workers, (3) background workers, (4) walsenders, (5) auxiliary
+ * processes, and (6) prepared transactions. (For largely-historical
+ * reasons, we combine autovacuum and special workers into one category
+ * with a single freelist.) Each PGPROC structure is dedicated to exactly
+ * one of these purposes, and they do not move between groups.
+ */
+ TotalProcs =
+ add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
+
+ size = 0;
+ size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
+ ProcGlobalAllProcsShmemSize = size;
+ ShmemRequestStruct(.name = "PGPROC structures",
+ .size = ProcGlobalAllProcsShmemSize,
+ .ptr = &AllProcsShmemPtr,
+ );
+
+ if (!IsUnderPostmaster)
+ size = FastPathLockArrayShmemSize = CalculateFastPathLockShmemSize();
+ else
+ size = SHMEM_ATTACH_UNKNOWN_SIZE;
+ ShmemRequestStruct(.name = "Fast-Path Lock Array",
+ .size = size,
+ .ptr = &FastPathLockArrayShmemPtr,
+ );
+
+ /*
+ * ProcGlobal is registered here in .ptr as usual, but it needs to be
+ * propagated specially in EXEC_BACKEND mode, because ProcGlobal needs to
+ * be accessed early at backend startup, before ShmemAttachRequested() has
+ * been called.
+ */
+ ShmemRequestStruct(.name = "Proc Header",
+ .size = sizeof(PROC_HDR),
+ .ptr = (void **) &ProcGlobal,
+ );
+
+ /* Let the semaphore implementation register its shared memory needs */
+ PGSemaphoreShmemRequest(ProcGlobalSemas());
+}
+
+
+/*
+ * ProcGlobalShmemInit -
* Initialize the global process table during postmaster or standalone
* backend startup.
*
@@ -179,36 +215,23 @@ ProcGlobalSemas(void)
* Another reason for creating semaphores here is that the semaphore
* implementation typically requires us to create semaphores in the
* postmaster, not in backends.
- *
- * Note: this is NOT called by individual backends under a postmaster,
- * not even in the EXEC_BACKEND case. The ProcGlobal and AuxiliaryProcs
- * pointers must be propagated specially for EXEC_BACKEND operation.
*/
-void
-InitProcGlobal(void)
+static void
+ProcGlobalShmemInit(void *arg)
{
+ char *ptr;
+ size_t requestSize;
PGPROC *procs;
int i,
j;
- bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
/* Used for setup of per-backend fast-path slots. */
char *fpPtr,
*fpEndPtr PG_USED_FOR_ASSERTS_ONLY;
Size fpLockBitsSize,
fpRelIdSize;
- Size requestSize;
- char *ptr;
- /* Create the ProcGlobal shared structure */
- ProcGlobal = (PROC_HDR *)
- ShmemInitStruct("Proc Header", sizeof(PROC_HDR), &found);
- Assert(!found);
-
- /*
- * Initialize the data structures.
- */
+ Assert(ProcGlobal);
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
SpinLockInit(&ProcGlobal->freeProcsLock);
dlist_init(&ProcGlobal->freeProcs);
@@ -221,23 +244,11 @@ InitProcGlobal(void)
pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PROC_NUMBER);
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
- /*
- * Create and initialize all the PGPROC structures we'll need. There are
- * six separate consumers: (1) normal backends, (2) autovacuum workers and
- * special workers, (3) background workers, (4) walsenders, (5) auxiliary
- * processes, and (6) prepared transactions. (For largely-historical
- * reasons, we combine autovacuum and special workers into one category
- * with a single freelist.) Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
- */
- requestSize = PGProcShmemSize();
-
- ptr = ShmemInitStruct("PGPROC structures",
- requestSize,
- &found);
-
+ ptr = AllProcsShmemPtr;
+ requestSize = ProcGlobalAllProcsShmemSize;
MemSet(ptr, 0, requestSize);
+ /* Carve out the allProcs array from the shared memory area */
procs = (PGPROC *) ptr;
ptr = ptr + TotalProcs * sizeof(PGPROC);
@@ -246,7 +257,7 @@ InitProcGlobal(void)
ProcGlobal->allProcCount = MaxBackends + NUM_AUXILIARY_PROCS;
/*
- * Allocate arrays mirroring PGPROC fields in a dense manner. See
+ * Carve out arrays mirroring PGPROC fields in a dense manner. See
* PROC_HDR.
*
* XXX: It might make sense to increase padding for these arrays, given
@@ -261,30 +272,26 @@ InitProcGlobal(void)
ProcGlobal->statusFlags = (uint8 *) ptr;
ptr = ptr + (TotalProcs * sizeof(*ProcGlobal->statusFlags));
- /* make sure wer didn't overflow */
+ /* make sure we didn't overflow */
Assert((ptr > (char *) procs) && (ptr <= (char *) procs + requestSize));
/*
- * Allocate arrays for fast-path locks. Those are variable-length, so
+ * Initialize arrays for fast-path locks. Those are variable-length, so
* can't be included in PGPROC directly. We allocate a separate piece of
* shared memory and then divide that between backends.
*/
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- requestSize = FastPathLockShmemSize();
-
- fpPtr = ShmemInitStruct("Fast-Path Lock Array",
- requestSize,
- &found);
-
- MemSet(fpPtr, 0, requestSize);
+ fpPtr = FastPathLockArrayShmemPtr;
+ requestSize = FastPathLockArrayShmemSize;
+ memset(fpPtr, 0, requestSize);
/* For asserts checking we did not overflow. */
fpEndPtr = fpPtr + requestSize;
- /* Reserve space for semaphores. */
- PGReserveSemaphores(ProcGlobalSemas());
+ /* Initialize semaphores */
+ PGSemaphoreInit(ProcGlobalSemas());
for (i = 0; i < TotalProcs; i++)
{
@@ -405,7 +412,7 @@ InitProcess(void)
/*
* Decide which list should supply our PGPROC. This logic must match the
- * way the freelists were constructed in InitProcGlobal().
+ * way the freelists were constructed in ProcGlobalShmemInit().
*/
if (AmAutoVacuumWorkerProcess() || AmSpecialWorkerProcess())
procgloballist = &ProcGlobal->autovacFreeProcs;
@@ -460,7 +467,7 @@ InitProcess(void)
/*
* Initialize all fields of MyProc, except for those previously
- * initialized by InitProcGlobal.
+ * initialized by ProcGlobalShmemInit.
*/
dlist_node_init(&MyProc->freeProcsLink);
MyProc->waitStatus = PROC_WAIT_STATUS_OK;
@@ -593,7 +600,7 @@ InitProcessPhase2(void)
* This is called by bgwriter and similar processes so that they will have a
* MyProc value that's real enough to let them wait for LWLocks. The PGPROC
* and sema that are assigned are one of the extra ones created during
- * InitProcGlobal.
+ * ProcGlobalShmemInit.
*
* Auxiliary processes are presently not expected to wait for real (lockmgr)
* locks, so we need not set up the deadlock checker. They are never added
@@ -662,7 +669,7 @@ InitAuxiliaryProcess(void)
/*
* Initialize all fields of MyProc, except for those previously
- * initialized by InitProcGlobal.
+ * initialized by ProcGlobalShmemInit.
*/
dlist_node_init(&MyProc->freeProcsLink);
MyProc->waitStatus = PROC_WAIT_STATUS_OK;
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index d49a7a92c64..81199edca86 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -338,7 +338,8 @@ string_compare(const char *key1, const char *key2, Size keysize)
* under info->hcxt rather than under TopMemoryContext; the default
* behavior is only suitable for session-lifespan hash tables.
* Other flags bits are special-purpose and seldom used, except for those
- * associated with shared-memory hash tables, for which see ShmemInitHash().
+ * associated with shared-memory hash tables, for which see
+ * ShmemRequestHash().
*
* Fields in *info are read only when the associated flags bit is set.
* It is not necessary to initialize other fields of *info.
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6fa91bfcdc0..55a4ab26b34 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -345,8 +345,6 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
-extern Size VarsupShmemSize(void);
-extern void VarsupShmemInit(void);
extern FullTransactionId GetNewTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
diff --git a/src/include/storage/dsm.h b/src/include/storage/dsm.h
index 407657df3ff..1bde71b4406 100644
--- a/src/include/storage/dsm.h
+++ b/src/include/storage/dsm.h
@@ -26,9 +26,6 @@ extern void dsm_postmaster_startup(PGShmemHeader *);
extern void dsm_backend_shutdown(void);
extern void dsm_detach_all(void);
-extern size_t dsm_estimate_size(void);
-extern void dsm_shmem_init(void);
-
#ifdef EXEC_BACKEND
extern void dsm_set_control_handle(dsm_handle h);
#endif
diff --git a/src/include/storage/dsm_registry.h b/src/include/storage/dsm_registry.h
index 506fae2c9ca..a2269c89f01 100644
--- a/src/include/storage/dsm_registry.h
+++ b/src/include/storage/dsm_registry.h
@@ -22,7 +22,5 @@ extern dsa_area *GetNamedDSA(const char *name, bool *found);
extern dshash_table *GetNamedDSHash(const char *name,
const dshash_parameters *params,
bool *found);
-extern Size DSMRegistryShmemSize(void);
-extern void DSMRegistryShmemInit(void);
#endif /* DSM_REGISTRY_H */
diff --git a/src/include/storage/pg_sema.h b/src/include/storage/pg_sema.h
index 66facc6907a..fe50ee505ba 100644
--- a/src/include/storage/pg_sema.h
+++ b/src/include/storage/pg_sema.h
@@ -37,11 +37,11 @@ typedef HANDLE PGSemaphore;
#endif
-/* Report amount of shared memory needed */
-extern Size PGSemaphoreShmemSize(int maxSemas);
+/* Request shared memory needed for semaphores */
+extern void PGSemaphoreShmemRequest(int maxSemas);
/* Module initialization (called during postmaster start or shmem reinit) */
-extern void PGReserveSemaphores(int maxSemas);
+extern void PGSemaphoreInit(int maxSemas);
/* Allocate a PGSemaphore structure with initial count 1 */
extern PGSemaphore PGSemaphoreCreate(void);
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index 206fb78f8a5..001e6eea61c 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -66,8 +66,6 @@ extern PGDLLIMPORT volatile PMSignalData *PMSignalState;
/*
* prototypes for functions in pmsignal.c
*/
-extern Size PMSignalShmemSize(void);
-extern void PMSignalShmemInit(void);
extern void SendPostmasterSignal(PMSignalReason reason);
extern bool CheckPostmasterSignal(PMSignalReason reason);
extern void SetQuitSignalReason(QuitSignalReason reason);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 22822fc68d7..3e1d1fad5f9 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -552,8 +552,6 @@ extern PGDLLIMPORT PGPROC *AuxiliaryProcs;
* Function Prototypes
*/
extern int ProcGlobalSemas(void);
-extern Size ProcGlobalShmemSize(void);
-extern void InitProcGlobal(void);
extern void InitProcess(void);
extern void InitProcessPhase2(void);
extern void InitAuxiliaryProcess(void);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index abdf021e66e..d718a5b542f 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -19,8 +19,6 @@
#include "utils/snapshot.h"
-extern Size ProcArrayShmemSize(void);
-extern void ProcArrayShmemInit(void);
extern void ProcArrayAdd(PGPROC *proc);
extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index cc4f26aa33d..7f855971b5a 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -67,9 +67,6 @@ typedef enum
/*
* prototypes for functions in procsignal.c
*/
-extern Size ProcSignalShmemSize(void);
-extern void ProcSignalShmemInit(void);
-
extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
ProcNumber procNumber);
diff --git a/src/include/storage/sinvaladt.h b/src/include/storage/sinvaladt.h
index 122dbcdf19f..208ea9d051e 100644
--- a/src/include/storage/sinvaladt.h
+++ b/src/include/storage/sinvaladt.h
@@ -27,8 +27,6 @@
/*
* prototypes for functions in sinvaladt.c
*/
-extern Size SharedInvalShmemSize(void);
-extern void SharedInvalShmemInit(void);
extern void SharedInvalBackendInit(bool sendOnly);
extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index f0cf01f5a85..d62c29f1361 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -27,4 +27,19 @@
*/
PG_SHMEM_SUBSYSTEM(LWLockCallbacks)
-/* TODO: nothing else for now */
+PG_SHMEM_SUBSYSTEM(dsm_shmem_callbacks)
+PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
+
+/* xlog, clog, and buffers */
+PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+
+/* process table */
+PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+
+/* shared-inval messaging */
+PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
+
+/* interprocess signaling mechanisms */
+PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
--
2.34.1
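
For readers skimming the patch above: the conversion follows one pattern throughout. Each subsystem exposes a ShmemCallbacks struct whose request_fn only registers a name, a size and a pointer, and whose init_fn runs later, once that pointer has been filled in. The following is a minimal, self-contained sketch of that two-phase flow in plain C. It is not the PostgreSQL implementation: every Demo*/demo_* name, the fixed-size request table, the 16-byte alignment and the use of calloc() in place of a shared memory segment are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the PostgreSQL definitions. */
typedef struct DemoCallbacks
{
	void		(*request_fn) (void *arg);
	void		(*init_fn) (void *arg);
} DemoCallbacks;

typedef struct DemoRequest
{
	const char *name;
	size_t		size;
	void	  **ptr;			/* filled in once memory is allocated */
} DemoRequest;

#define DEMO_MAX_REQUESTS 8

static DemoRequest demo_requests[DEMO_MAX_REQUESTS];
static int	demo_nrequests = 0;
static char *demo_segment = NULL;	/* calloc'd stand-in for shared memory */

/* Phase 1 helper: record the request, allocate nothing yet. */
static void
demo_request_struct(const char *name, size_t size, void **ptr)
{
	assert(demo_nrequests < DEMO_MAX_REQUESTS);
	demo_requests[demo_nrequests].name = name;
	demo_requests[demo_nrequests].size = size;
	demo_requests[demo_nrequests].ptr = ptr;
	demo_nrequests++;
}

/* An example subsystem with one shared struct. */
typedef struct DemoCounter
{
	int			generation;
	int			nslots;
} DemoCounter;

static void *demo_counter_ptr = NULL;

static void
demo_counter_request(void *arg)
{
	(void) arg;
	demo_request_struct("DemoCounter", sizeof(DemoCounter), &demo_counter_ptr);
}

static void
demo_counter_init(void *arg)
{
	DemoCounter *counter = demo_counter_ptr;

	(void) arg;
	counter->generation = 1;
	counter->nslots = 4;
}

static const DemoCallbacks demo_subsystems[] = {
	{.request_fn = demo_counter_request, .init_fn = demo_counter_init},
};

/* Round up to a multiple of 16 bytes (arbitrary choice for the demo). */
static size_t
demo_align(size_t n)
{
	return (n + 15) & ~(size_t) 15;
}

DemoCounter *
demo_get_counter(void)
{
	return demo_counter_ptr;
}

/* Run both phases; returns the total number of bytes allocated. */
size_t
demo_startup(void)
{
	size_t		nsub = sizeof(demo_subsystems) / sizeof(demo_subsystems[0]);
	size_t		total = 0;
	size_t		off = 0;

	demo_nrequests = 0;
	free(demo_segment);

	/* Phase 1: every subsystem states what it needs. */
	for (size_t i = 0; i < nsub; i++)
		demo_subsystems[i].request_fn(NULL);

	/* Size and allocate one zero-filled area, then hand out pointers. */
	for (int i = 0; i < demo_nrequests; i++)
		total += demo_align(demo_requests[i].size);

	demo_segment = calloc(1, total);
	assert(demo_segment != NULL);

	for (int i = 0; i < demo_nrequests; i++)
	{
		*demo_requests[i].ptr = demo_segment + off;
		off += demo_align(demo_requests[i].size);
	}

	/* Phase 2: initialize contents now that the pointers are valid. */
	for (size_t i = 0; i < nsub; i++)
		demo_subsystems[i].init_fn(NULL);

	return total;
}
```

Calling demo_startup() from a small test program leaves demo_get_counter()->generation set to 1. Because all sizes are collected before any memory exists, the init callback can assume it runs exactly once on fresh memory, which appears to be why the converted init functions in the patch no longer carry a "found" flag.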
[text/x-patch] v20260405-0010-Convert-SLRUs-to-use-the-new-interface.patch (84.8K, 11-v20260405-0010-Convert-SLRUs-to-use-the-new-interface.patch)
From f0ed619253b094e5ffbde7cd62d3dd060784a445 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:32:45 +0300
Subject: [PATCH v20260405 10/15] Convert SLRUs to use the new interface
I replaced the old SimpleLruInit() function with SimpleLruRequest(),
without a backwards-compatibility wrapper, because few extensions define
their own SLRUs.
---
src/backend/access/transam/clog.c | 55 ++--
src/backend/access/transam/commit_ts.c | 85 +++---
src/backend/access/transam/multixact.c | 138 +++++----
src/backend/access/transam/slru.c | 366 ++++++++++++-----------
src/backend/access/transam/subtrans.c | 57 ++--
src/backend/commands/async.c | 115 ++++---
src/backend/storage/ipc/ipci.c | 16 -
src/backend/storage/ipc/shmem.c | 7 +
src/backend/storage/lmgr/predicate.c | 266 +++++++---------
src/backend/utils/activity/pgstat_slru.c | 1 +
src/include/access/clog.h | 2 -
src/include/access/commit_ts.h | 2 -
src/include/access/multixact.h | 2 -
src/include/access/slru.h | 112 ++++---
src/include/access/subtrans.h | 2 -
src/include/commands/async.h | 3 -
src/include/storage/predicate.h | 5 -
src/include/storage/shmem_internal.h | 1 +
src/include/storage/subsystemlist.h | 10 +
src/test/modules/test_slru/test_slru.c | 106 +++----
src/tools/pgindent/typedefs.list | 4 +-
21 files changed, 691 insertions(+), 664 deletions(-)
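
The SimpleLruRequest(.desc = ..., .name = ..., ) calls in the diff below read like named arguments, including a trailing comma. One way to get that call syntax in C99 is a variadic macro that wraps its arguments in a compound literal with designated initializers; fields that are not mentioned are zero-initialized. The sketch below shows the idiom in isolation. Whether the patch set implements SimpleLruRequest() and ShmemRequestStruct() this way is an assumption (their definitions are not part of this excerpt), and DemoSlruOpts, DemoSlruRequest and the demo_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct; field names loosely follow the calls in the diff. */
typedef struct DemoSlruOpts
{
	const char *name;
	const char *Dir;
	bool		long_segment_names;
	int			nslots;
	int			nlsns;
} DemoSlruOpts;

/* Copy of the most recent request, kept so the demo can inspect it. */
DemoSlruOpts demo_last_request;

/* The real worker function takes a pointer to a fully formed struct. */
void
demo_slru_request_with_opts(const DemoSlruOpts *opts)
{
	demo_last_request = *opts;
}

/*
 * The macro turns "named arguments" into a compound literal.  A trailing
 * comma in the call becomes a trailing comma in the initializer list,
 * which C allows.
 */
#define DemoSlruRequest(...) \
	demo_slru_request_with_opts(&(DemoSlruOpts){__VA_ARGS__})

/* Issue one request, leaving .nlsns and .long_segment_names unset. */
int
demo_request_example(void)
{
	DemoSlruRequest(.name = "transaction",
					.Dir = "pg_xact",
					.nslots = 16,
		);

	/* Unmentioned fields are zero-initialized. */
	return demo_last_request.nlsns;
}
```

Under this idiom, omitting .nlsns, as the commit_ts.c hunk does, leaves the field at zero, which matches the 0 that the old SimpleLruInit() call passed explicitly.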
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index c654e0929b3..7cd1a56201f 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -43,6 +43,7 @@
#include "pg_trace.h"
#include "pgstat.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/wait_event.h"
@@ -106,13 +107,21 @@ TransactionIdToPage(TransactionId xid)
/*
* Link to shared-memory data structures for CLOG control
*/
-static SlruCtlData XactCtlData;
+static void CLOGShmemRequest(void *arg);
+static void CLOGShmemInit(void *arg);
+static bool CLOGPagePrecedes(int64 page1, int64 page2);
+static int clog_errdetail_for_io_error(const void *opaque_data);
-#define XactCtl (&XactCtlData)
+const ShmemCallbacks CLOGShmemCallbacks = {
+ .request_fn = CLOGShmemRequest,
+ .init_fn = CLOGShmemInit,
+};
+
+static SlruDesc XactSlruDesc;
+
+#define XactCtl (&XactSlruDesc)
-static bool CLOGPagePrecedes(int64 page1, int64 page2);
-static int clog_errdetail_for_io_error(const void *opaque_data);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXact,
Oid oldestXactDb);
static void TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
@@ -775,16 +784,10 @@ CLOGShmemBuffers(void)
}
/*
- * Initialization of shared memory for CLOG
+ * Register shared memory for CLOG
*/
-Size
-CLOGShmemSize(void)
-{
- return SimpleLruShmemSize(CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE);
-}
-
-void
-CLOGShmemInit(void)
+static void
+CLOGShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (transaction_buffers == 0)
@@ -806,12 +809,26 @@ CLOGShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(transaction_buffers != 0);
+ SimpleLruRequest(.desc = &XactSlruDesc,
+ .name = "transaction",
+ .Dir = "pg_xact",
+ .long_segment_names = false,
+
+ .nslots = CLOGShmemBuffers(),
+ .nlsns = CLOG_LSNS_PER_PAGE,
+
+ .sync_handler = SYNC_HANDLER_CLOG,
+ .PagePrecedes = CLOGPagePrecedes,
+ .errdetail_for_io_error = clog_errdetail_for_io_error,
- XactCtl->PagePrecedes = CLOGPagePrecedes;
- XactCtl->errdetail_for_io_error = clog_errdetail_for_io_error;
- SimpleLruInit(XactCtl, "transaction", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
- "pg_xact", LWTRANCHE_XACT_BUFFER,
- LWTRANCHE_XACT_SLRU, SYNC_HANDLER_CLOG, false);
+ .buffer_tranche_id = LWTRANCHE_XACT_BUFFER,
+ .bank_tranche_id = LWTRANCHE_XACT_SLRU,
+ );
+}
+
+static void
+CLOGShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(XactCtl, CLOG_XACTS_PER_PAGE);
}
@@ -827,7 +844,7 @@ check_transaction_buffers(int *newval, void **extra, GucSource source)
/*
* This func must be called ONCE on system install. It creates
* the initial CLOG segment. (The CLOG directory is assumed to
- * have been created by initdb, and CLOGShmemInit must have been
+ * have been created by initdb, and CLOGShmemInit must have been XXX
* called already.)
*/
void
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 36219dd13cc..2625cbf93bf 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -30,6 +30,7 @@
#include "funcapi.h"
#include "miscadmin.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/timestamp.h"
@@ -80,9 +81,19 @@ TransactionIdToCTsPage(TransactionId xid)
/*
* Link to shared-memory data structures for CommitTs control
*/
-static SlruCtlData CommitTsCtlData;
+static void CommitTsShmemRequest(void *arg);
+static void CommitTsShmemInit(void *arg);
+static bool CommitTsPagePrecedes(int64 page1, int64 page2);
+static int commit_ts_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks CommitTsShmemCallbacks = {
+ .request_fn = CommitTsShmemRequest,
+ .init_fn = CommitTsShmemInit,
+};
+
+static SlruDesc CommitTsSlruDesc;
-#define CommitTsCtl (&CommitTsCtlData)
+#define CommitTsCtl (&CommitTsSlruDesc)
/*
* We keep a cache of the last value set in shared memory.
@@ -104,6 +115,7 @@ typedef struct CommitTimestampShared
static CommitTimestampShared *commitTsShared;
+static void CommitTsShmemInit(void *arg);
/* GUC variable */
bool track_commit_timestamp;
@@ -114,8 +126,6 @@ static void SetXidCommitTsInPage(TransactionId xid, int nsubxids,
static void TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
ReplOriginId nodeid, int slotno);
static void error_commit_ts_disabled(void);
-static bool CommitTsPagePrecedes(int64 page1, int64 page2);
-static int commit_ts_errdetail_for_io_error(const void *opaque_data);
static void ActivateCommitTs(void);
static void DeactivateCommitTs(void);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXid);
@@ -512,24 +522,12 @@ CommitTsShmemBuffers(void)
}
/*
- * Shared memory sizing for CommitTs
+ * Register CommitTs shared memory needs at system startup (postmaster start
+ * or standalone backend)
*/
-Size
-CommitTsShmemSize(void)
-{
- return SimpleLruShmemSize(CommitTsShmemBuffers(), 0) +
- sizeof(CommitTimestampShared);
-}
-
-/*
- * Initialize CommitTs at system startup (postmaster start or standalone
- * backend)
- */
-void
-CommitTsShmemInit(void)
+static void
+CommitTsShmemRequest(void *arg)
{
- bool found;
-
/* If auto-tuning is requested, now is the time to do it */
if (commit_timestamp_buffers == 0)
{
@@ -550,31 +548,36 @@ CommitTsShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(commit_timestamp_buffers != 0);
+ SimpleLruRequest(.desc = &CommitTsSlruDesc,
+ .name = "commit_timestamp",
+ .Dir = "pg_commit_ts",
+ .long_segment_names = false,
- CommitTsCtl->PagePrecedes = CommitTsPagePrecedes;
- CommitTsCtl->errdetail_for_io_error = commit_ts_errdetail_for_io_error;
- SimpleLruInit(CommitTsCtl, "commit_timestamp", CommitTsShmemBuffers(), 0,
- "pg_commit_ts", LWTRANCHE_COMMITTS_BUFFER,
- LWTRANCHE_COMMITTS_SLRU,
- SYNC_HANDLER_COMMIT_TS,
- false);
- SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
+ .nslots = CommitTsShmemBuffers(),
- commitTsShared = ShmemInitStruct("CommitTs shared",
- sizeof(CommitTimestampShared),
- &found);
+ .PagePrecedes = CommitTsPagePrecedes,
+ .errdetail_for_io_error = commit_ts_errdetail_for_io_error,
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+ .sync_handler = SYNC_HANDLER_COMMIT_TS,
+ .buffer_tranche_id = LWTRANCHE_COMMITTS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_COMMITTS_SLRU,
+ );
- commitTsShared->xidLastCommit = InvalidTransactionId;
- TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
- commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
- commitTsShared->commitTsActive = false;
- }
- else
- Assert(found);
+ ShmemRequestStruct(.name = "CommitTs shared",
+ .size = sizeof(CommitTimestampShared),
+ .ptr = (void **) &commitTsShared,
+ );
+}
+
+static void
+CommitTsShmemInit(void *arg)
+{
+ commitTsShared->xidLastCommit = InvalidTransactionId;
+ TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
+ commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
+ commitTsShared->commitTsActive = false;
+
+ SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
}
/*
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 9f8d542c098..62d58da4abc 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -83,6 +83,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
#include "utils/lsyscache.h"
@@ -113,11 +114,16 @@ PreviousMultiXactId(MultiXactId multi)
/*
* Links to shared-memory data structures for MultiXact control
*/
-static SlruCtlData MultiXactOffsetCtlData;
-static SlruCtlData MultiXactMemberCtlData;
+static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
+static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
+static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
+static int MultiXactMemberIoErrorDetail(const void *opaque_data);
+
+static SlruDesc MultiXactOffsetSlruDesc;
+static SlruDesc MultiXactMemberSlruDesc;
-#define MultiXactOffsetCtl (&MultiXactOffsetCtlData)
-#define MultiXactMemberCtl (&MultiXactMemberCtlData)
+#define MultiXactOffsetCtl (&MultiXactOffsetSlruDesc)
+#define MultiXactMemberCtl (&MultiXactMemberSlruDesc)
/*
* MultiXact state shared across all backends. All this state is protected
@@ -220,6 +226,15 @@ static MultiXactStateData *MultiXactState;
static MultiXactId *OldestMemberMXactId;
static MultiXactId *OldestVisibleMXactId;
+static void MultiXactShmemRequest(void *arg);
+static void MultiXactShmemInit(void *arg);
+static void MultiXactShmemAttach(void *arg);
+
+const ShmemCallbacks MultiXactShmemCallbacks = {
+ .request_fn = MultiXactShmemRequest,
+ .init_fn = MultiXactShmemInit,
+ .attach_fn = MultiXactShmemAttach,
+};
static inline MultiXactId *
MyOldestMemberMXactIdSlot(void)
@@ -321,10 +336,6 @@ typedef struct MultiXactMemberSlruReadContext
MultiXactOffset offset;
} MultiXactMemberSlruReadContext;
-static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
-static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
-static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
-static int MultiXactMemberIoErrorDetail(const void *opaque_data);
static void ExtendMultiXactOffset(MultiXactId multi);
static void ExtendMultiXactMember(MultiXactOffset offset, int nmembers);
static void SetOldestOffset(void);
@@ -1747,80 +1758,81 @@ multixact_twophase_postabort(FullTransactionId fxid, uint16 info,
multixact_twophase_postcommit(fxid, info, recdata, len);
}
+
/*
- * Initialization of shared memory for MultiXact.
- *
- * MultiXactSharedStateShmemSize() calculates the size of the MultiXactState
- * struct, and the two per-backend MultiXactId arrays. They are carved out of
- * the same allocation. MultiXactShmemSize() additionally includes the memory
- * needed for the two SLRU areas.
+ * Register shared memory needs for MultiXact.
*/
-static Size
-MultiXactSharedStateShmemSize(void)
+static void
+MultiXactShmemRequest(void *arg)
{
Size size;
+ /*
+ * Calculate the size of the MultiXactState struct, and the two
+ * per-backend MultiXactId arrays. They are carved out of the same
+ * allocation.
+ */
size = offsetof(MultiXactStateData, perBackendXactIds);
size = add_size(size,
mul_size(sizeof(MultiXactId), NumMemberSlots));
size = add_size(size,
mul_size(sizeof(MultiXactId), NumVisibleSlots));
- return size;
-}
+ ShmemRequestStruct(.name = "Shared MultiXact State",
+ .size = size,
+ .ptr = (void **) &MultiXactState,
+ );
-Size
-MultiXactShmemSize(void)
-{
- Size size;
+ SimpleLruRequest(.desc = &MultiXactOffsetSlruDesc,
+ .name = "multixact_offset",
+ .Dir = "pg_multixact/offsets",
+ .long_segment_names = false,
- size = MultiXactSharedStateShmemSize();
- size = add_size(size, SimpleLruShmemSize(multixact_offset_buffers, 0));
- size = add_size(size, SimpleLruShmemSize(multixact_member_buffers, 0));
+ .nslots = multixact_offset_buffers,
- return size;
-}
+ .sync_handler = SYNC_HANDLER_MULTIXACT_OFFSET,
+ .PagePrecedes = MultiXactOffsetPagePrecedes,
+ .errdetail_for_io_error = MultiXactOffsetIoErrorDetail,
-void
-MultiXactShmemInit(void)
-{
- bool found;
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTOFFSET_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTOFFSET_SLRU,
+ );
- debug_elog2(DEBUG2, "Shared Memory Init for MultiXact");
+ SimpleLruRequest(.desc = &MultiXactMemberSlruDesc,
+ .name = "multixact_member",
+ .Dir = "pg_multixact/members",
+ .long_segment_names = true,
- MultiXactOffsetCtl->PagePrecedes = MultiXactOffsetPagePrecedes;
- MultiXactMemberCtl->PagePrecedes = MultiXactMemberPagePrecedes;
- MultiXactOffsetCtl->errdetail_for_io_error = MultiXactOffsetIoErrorDetail;
- MultiXactMemberCtl->errdetail_for_io_error = MultiXactMemberIoErrorDetail;
+ .nslots = multixact_member_buffers,
- SimpleLruInit(MultiXactOffsetCtl,
- "multixact_offset", multixact_offset_buffers, 0,
- "pg_multixact/offsets", LWTRANCHE_MULTIXACTOFFSET_BUFFER,
- LWTRANCHE_MULTIXACTOFFSET_SLRU,
- SYNC_HANDLER_MULTIXACT_OFFSET,
- false);
- SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
- SimpleLruInit(MultiXactMemberCtl,
- "multixact_member", multixact_member_buffers, 0,
- "pg_multixact/members", LWTRANCHE_MULTIXACTMEMBER_BUFFER,
- LWTRANCHE_MULTIXACTMEMBER_SLRU,
- SYNC_HANDLER_MULTIXACT_MEMBER,
- true);
- /* doesn't call SimpleLruTruncate() or meet criteria for unit tests */
-
- /* Initialize our shared state struct */
- MultiXactState = ShmemInitStruct("Shared MultiXact State",
- MultiXactSharedStateShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+ .sync_handler = SYNC_HANDLER_MULTIXACT_MEMBER,
+ .PagePrecedes = MultiXactMemberPagePrecedes,
+ .errdetail_for_io_error = MultiXactMemberIoErrorDetail,
- /* Make sure we zero out the per-backend state */
- MemSet(MultiXactState, 0, MultiXactSharedStateShmemSize());
- }
- else
- Assert(found);
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTMEMBER_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTMEMBER_SLRU,
+ );
+ /*
+ * members SLRU doesn't call SimpleLruTruncate() or meet criteria for unit
+ * tests
+ */
+}
+
+static void
+MultiXactShmemInit(void *arg)
+{
+ /*
+ * Set up array pointers.
+ */
+ OldestMemberMXactId = MultiXactState->perBackendXactIds;
+ OldestVisibleMXactId = OldestMemberMXactId + NumMemberSlots;
+
+ SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
+}
+
+static void
+MultiXactShmemAttach(void *arg)
+{
/*
* Set up array pointers.
*/
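
Reviewer's sketch, not part of the patch: MultiXactShmemRequest() sizes one allocation that holds the state struct header plus both per-backend arrays, and the init/attach callbacks then point OldestMemberMXactId and OldestVisibleMXactId into it. The standalone C below illustrates that pattern under stated assumptions. DemoState, DemoId, checked_add(), checked_mul(), demo_state_size() and demo_state_attach() are hypothetical names invented for this illustration. PostgreSQL's add_size() and mul_size() report an error on overflow, whereas this sketch returns false instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t DemoId;			/* hypothetical stand-in for MultiXactId */

/* Header followed by one flexible array that holds both per-backend arrays. */
typedef struct DemoState
{
	uint32_t	next_id;
	DemoId		per_backend_ids[];	/* member slots, then visible slots */
} DemoState;

/* Overflow-checked addition: returns false instead of wrapping around. */
static bool
checked_add(size_t a, size_t b, size_t *out)
{
	if (a > SIZE_MAX - b)
		return false;
	*out = a + b;
	return true;
}

/* Overflow-checked multiplication: returns false instead of wrapping around. */
static bool
checked_mul(size_t a, size_t b, size_t *out)
{
	if (a != 0 && b > SIZE_MAX / a)
		return false;
	*out = a * b;
	return true;
}

/* Compute header size plus both arrays; returns false on overflow. */
static bool
demo_state_size(size_t n_member, size_t n_visible, size_t *size)
{
	size_t		part;

	*size = offsetof(DemoState, per_backend_ids);
	if (!checked_mul(sizeof(DemoId), n_member, &part) ||
		!checked_add(*size, part, size))
		return false;
	if (!checked_mul(sizeof(DemoId), n_visible, &part) ||
		!checked_add(*size, part, size))
		return false;
	return true;
}

/* Point the two array views into the single allocation. */
static void
demo_state_attach(DemoState *state, size_t n_member,
				  DemoId **member_ids, DemoId **visible_ids)
{
	*member_ids = state->per_backend_ids;
	*visible_ids = *member_ids + n_member;
}
```

Because the array pointers are process-local, every process has to repeat the pointer setup after it learns the struct's address, which appears to be why the patch sets them up in both the init and the attach callback.
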
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index a2bb8fa8033..47dd52d6749 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -70,7 +70,9 @@
#include "pgstat.h"
#include "storage/fd.h"
#include "storage/shmem.h"
+#include "storage/shmem_internal.h"
#include "utils/guc.h"
+#include "utils/memutils.h"
#include "utils/wait_event.h"
/*
@@ -89,9 +91,9 @@
* dir/123456 for [2^20, 2^24-1]
*/
static inline int
-SlruFileName(SlruCtl ctl, char *path, int64 segno)
+SlruFileName(SlruDesc *ctl, char *path, int64 segno)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
{
/*
* We could use 16 characters here but the disadvantage would be that
@@ -101,7 +103,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* that in the future we can't decrease SLRU_PAGES_PER_SEGMENT easily.
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFFFFFFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->Dir, segno);
+ return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->options.Dir, segno);
}
else
{
@@ -110,7 +112,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* integers are allowed. See SlruCorrectSegmentFilenameLength()
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
+ return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->options.Dir,
(unsigned int) segno);
}
}
@@ -176,19 +178,19 @@ static SlruErrorCause slru_errcause;
static int slru_errno;
-static void SimpleLruZeroLSNs(SlruCtl ctl, int slotno);
-static void SimpleLruWaitIO(SlruCtl ctl, int slotno);
-static void SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata);
-static bool SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno);
-static bool SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno,
+static void SimpleLruZeroLSNs(SlruDesc *ctl, int slotno);
+static void SimpleLruWaitIO(SlruDesc *ctl, int slotno);
+static void SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata);
+static bool SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno);
+static bool SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno,
SlruWriteAll fdata);
-static void SlruReportIOError(SlruCtl ctl, int64 pageno,
+static void SlruReportIOError(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-static int SlruSelectLRUPage(SlruCtl ctl, int64 pageno);
+static int SlruSelectLRUPage(SlruDesc *ctl, int64 pageno);
-static bool SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename,
+static bool SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-static void SlruInternalDeleteSegment(SlruCtl ctl, int64 segno);
+static void SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno);
static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
@@ -196,7 +198,7 @@ static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
* Initialization of shared memory
*/
-Size
+static Size
SimpleLruShmemSize(int nslots, int nlsns)
{
int nbanks = nslots / SLRU_BANK_SIZE;
@@ -238,120 +240,135 @@ SimpleLruAutotuneBuffers(int divisor, int max)
}
/*
- * Initialize, or attach to, a simple LRU cache in shared memory.
- *
- * ctl: address of local (unshared) control structure.
- * name: name of SLRU. (This is user-visible, pick with care!)
- * nslots: number of page slots to use.
- * nlsns: number of LSN groups per page (set to zero if not relevant).
- * subdir: PGDATA-relative subdirectory that will contain the files.
- * buffer_tranche_id: tranche ID to use for the SLRU's per-buffer LWLocks.
- * bank_tranche_id: tranche ID to use for the bank LWLocks.
- * sync_handler: which set of functions to use to handle sync requests
- * long_segment_names: use short or long segment names
+ * Register a simple LRU cache in shared memory.
*/
void
-SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id, int bank_tranche_id,
- SyncRequestHandler sync_handler, bool long_segment_names)
+SimpleLruRequestWithOpts(const SlruOpts *options)
{
+ SlruOpts *options_copy;
+
+ Assert(options->name != NULL);
+ Assert(options->nslots > 0);
+ Assert(options->PagePrecedes != NULL);
+ Assert(options->errdetail_for_io_error != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(SlruOpts));
+ memcpy(options_copy, options, sizeof(SlruOpts));
+
+ options_copy->base.name = options->name;
+ options_copy->base.size = SimpleLruShmemSize(options_copy->nslots, options_copy->nlsns);
+
+ ShmemRequestInternal(&options_copy->base, SHMEM_KIND_SLRU);
+}
+
+/* Initialize locks and shared memory area */
+void
+shmem_slru_init(void *location, ShmemStructOpts *base_options)
+{
+ SlruOpts *options = (SlruOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) options->desc;
+ char namebuf[NAMEDATALEN];
SlruShared shared;
- bool found;
+ int nslots = options->nslots;
int nbanks = nslots / SLRU_BANK_SIZE;
+ int nlsns = options->nlsns;
+ char *ptr;
+ Size offset;
+
+ shared = (SlruShared) location;
+ desc->shared = shared;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruOpts));
+
+ /* assign new tranche IDs, if not given */
+ if (desc->options.buffer_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s buffer", desc->options.name);
+ desc->options.buffer_tranche_id = LWLockNewTrancheId(namebuf);
+ }
+ if (desc->options.bank_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s bank", desc->options.name);
+ desc->options.bank_tranche_id = LWLockNewTrancheId(namebuf);
+ }
Assert(nslots <= SLRU_MAX_ALLOWED_BUFFERS);
- Assert(ctl->PagePrecedes != NULL);
- Assert(ctl->errdetail_for_io_error != NULL);
+ memset(shared, 0, sizeof(SlruSharedData));
- shared = (SlruShared) ShmemInitStruct(name,
- SimpleLruShmemSize(nslots, nlsns),
- &found);
+ shared->num_slots = nslots;
+ shared->lsn_groups_per_page = nlsns;
- if (!IsUnderPostmaster)
- {
- /* Initialize locks and shared memory area */
- char *ptr;
- Size offset;
-
- Assert(!found);
-
- memset(shared, 0, sizeof(SlruSharedData));
-
- shared->num_slots = nslots;
- shared->lsn_groups_per_page = nlsns;
-
- pg_atomic_init_u64(&shared->latest_page_number, 0);
-
- shared->slru_stats_idx = pgstat_get_slru_index(name);
-
- ptr = (char *) shared;
- offset = MAXALIGN(sizeof(SlruSharedData));
- shared->page_buffer = (char **) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(char *));
- shared->page_status = (SlruPageStatus *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
- shared->page_dirty = (bool *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(bool));
- shared->page_number = (int64 *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int64));
- shared->page_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int));
-
- /* Initialize LWLocks */
- shared->buffer_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(LWLockPadded));
- shared->bank_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
- shared->bank_cur_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(int));
-
- if (nlsns > 0)
- {
- shared->group_lsn = (XLogRecPtr *) (ptr + offset);
- offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
- }
+ pg_atomic_init_u64(&shared->latest_page_number, 0);
- ptr += BUFFERALIGN(offset);
- for (int slotno = 0; slotno < nslots; slotno++)
- {
- LWLockInitialize(&shared->buffer_locks[slotno].lock,
- buffer_tranche_id);
+ shared->slru_stats_idx = pgstat_get_slru_index(desc->options.name);
- shared->page_buffer[slotno] = ptr;
- shared->page_status[slotno] = SLRU_PAGE_EMPTY;
- shared->page_dirty[slotno] = false;
- shared->page_lru_count[slotno] = 0;
- ptr += BLCKSZ;
- }
+ ptr = (char *) shared;
+ offset = MAXALIGN(sizeof(SlruSharedData));
+ shared->page_buffer = (char **) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(char *));
+ shared->page_status = (SlruPageStatus *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
+ shared->page_dirty = (bool *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(bool));
+ shared->page_number = (int64 *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int64));
+ shared->page_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int));
- /* Initialize the slot banks. */
- for (int bankno = 0; bankno < nbanks; bankno++)
- {
- LWLockInitialize(&shared->bank_locks[bankno].lock, bank_tranche_id);
- shared->bank_cur_lru_count[bankno] = 0;
- }
+ /* Initialize LWLocks */
+ shared->buffer_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(LWLockPadded));
+ shared->bank_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
+ shared->bank_cur_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(int));
- /* Should fit to estimated shmem size */
- Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+ if (nlsns > 0)
+ {
+ shared->group_lsn = (XLogRecPtr *) (ptr + offset);
+ offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
}
- else
+
+ ptr += BUFFERALIGN(offset);
+ for (int slotno = 0; slotno < nslots; slotno++)
{
- Assert(found);
- Assert(shared->num_slots == nslots);
+ LWLockInitialize(&shared->buffer_locks[slotno].lock,
+ desc->options.buffer_tranche_id);
+
+ shared->page_buffer[slotno] = ptr;
+ shared->page_status[slotno] = SLRU_PAGE_EMPTY;
+ shared->page_dirty[slotno] = false;
+ shared->page_lru_count[slotno] = 0;
+ ptr += BLCKSZ;
}
- /*
- * Initialize the unshared control struct, including directory path. We
- * assume caller set PagePrecedes.
- */
- ctl->shared = shared;
- ctl->sync_handler = sync_handler;
- ctl->long_segment_names = long_segment_names;
- ctl->nbanks = nbanks;
- strlcpy(ctl->Dir, subdir, sizeof(ctl->Dir));
+ /* Initialize the slot banks. */
+ for (int bankno = 0; bankno < nbanks; bankno++)
+ {
+ LWLockInitialize(&shared->bank_locks[bankno].lock, desc->options.bank_tranche_id);
+ shared->bank_cur_lru_count[bankno] = 0;
+ }
+
+ /* Should fit to estimated shmem size */
+ Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+}
+
+void
+shmem_slru_attach(void *location, ShmemStructOpts *base_options)
+{
+ SlruOpts *options = (SlruOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) options->desc;
+ int nslots = options->nslots;
+ int nbanks = nslots / SLRU_BANK_SIZE;
+
+ desc->shared = (SlruShared) location;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruOpts));
}
+
/*
* Helper function for GUC check_hook to check whether slru buffers are in
* multiples of SLRU_BANK_SIZE.
@@ -377,7 +394,7 @@ check_slru_buffers(const char *name, int *newval)
* Bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
int slotno;
@@ -430,7 +447,7 @@ SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
* This assumes that InvalidXLogRecPtr is bitwise-all-0.
*/
static void
-SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
+SimpleLruZeroLSNs(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
@@ -446,7 +463,7 @@ SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
* SLRU bank lock is acquired and released here.
*/
void
-SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno)
{
int slotno;
LWLock *lock;
@@ -472,7 +489,7 @@ SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SimpleLruWaitIO(SlruCtl ctl, int slotno)
+SimpleLruWaitIO(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
int bankno = SlotGetBankNumber(slotno);
@@ -530,7 +547,7 @@ SimpleLruWaitIO(SlruCtl ctl, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data)
{
SlruShared shared = ctl->shared;
@@ -634,7 +651,7 @@ SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
* It is unspecified whether the lock will be shared or exclusive.
*/
int
-SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
SlruShared shared = ctl->shared;
LWLock *banklock = SimpleLruGetBankLock(ctl, pageno);
@@ -681,7 +698,7 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
+SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 pageno = shared->page_number[slotno];
@@ -761,7 +778,7 @@ SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
* fdata is always passed a NULL here.
*/
void
-SimpleLruWritePage(SlruCtl ctl, int slotno)
+SimpleLruWritePage(SlruDesc *ctl, int slotno)
{
Assert(ctl->shared->page_status[slotno] != SLRU_PAGE_EMPTY);
@@ -775,7 +792,7 @@ SimpleLruWritePage(SlruCtl ctl, int slotno)
* large enough to contain the given page.
*/
bool
-SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
+SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -833,7 +850,7 @@ SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
* read/write operations. We could cache one virtual file pointer ...
*/
static bool
-SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
+SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -905,7 +922,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
* SimpleLruWriteAll.
*/
static bool
-SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
+SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -1037,11 +1054,11 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
pgstat_report_wait_end();
/* Queue up a sync request for the checkpointer. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
if (!RegisterSyncRequest(&tag, SYNC_REQUEST, false))
{
/* No space to enqueue sync request. Do it synchronously. */
@@ -1077,7 +1094,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
* SlruPhysicalWritePage. Call this after cleaning up shared-memory state.
*/
static void
-SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SlruReportIOError(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -1092,14 +1109,14 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open file \"%s\": %m", path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_SEEK_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not seek in file \"%s\" to offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_READ_FAILED:
if (errno)
@@ -1107,12 +1124,12 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("could not read from file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("could not read from file \"%s\" at offset %d: read too few bytes",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_WRITE_FAILED:
if (errno)
@@ -1120,26 +1137,26 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("Could not write to file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("Could not write to file \"%s\" at offset %d: wrote too few bytes.",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_FSYNC_FAILED:
ereport(data_sync_elevel(ERROR),
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_CLOSE_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not close file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
default:
/* can't get here, we trust */
@@ -1199,7 +1216,7 @@ SlruRecentlyUsed(SlruShared shared, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
static int
-SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
+SlruSelectLRUPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
@@ -1291,8 +1308,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_valid_delta ||
(this_delta == best_valid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_valid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_valid_page_number)))
{
bestvalidslot = slotno;
best_valid_delta = this_delta;
@@ -1303,8 +1320,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_invalid_delta ||
(this_delta == best_invalid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_invalid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_invalid_page_number)))
{
bestinvalidslot = slotno;
best_invalid_delta = this_delta;
@@ -1352,7 +1369,7 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
* entries are on disk.
*/
void
-SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
+SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied)
{
SlruShared shared = ctl->shared;
SlruWriteAllData fdata;
@@ -1422,8 +1439,8 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
SlruReportIOError(ctl, pageno, NULL);
/* Ensure that directory entries for new files are on disk. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
- fsync_fname(ctl->Dir, true);
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
+ fsync_fname(ctl->options.Dir, true);
}
/*
@@ -1438,7 +1455,7 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
* after it has accrued freshly-written data.
*/
void
-SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage)
+SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage)
{
SlruShared shared = ctl->shared;
int prevbank;
@@ -1460,12 +1477,12 @@ restart:
* bugs elsewhere in SLRU handling, so we don't care if we read a slightly
* outdated value; therefore we don't add a memory barrier.
*/
- if (ctl->PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
- cutoffPage))
+ if (ctl->options.PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
+ cutoffPage))
{
ereport(LOG,
(errmsg("could not truncate directory \"%s\": apparent wraparound",
- ctl->Dir)));
+ ctl->options.Dir)));
return;
}
@@ -1488,7 +1505,7 @@ restart:
if (shared->page_status[slotno] == SLRU_PAGE_EMPTY)
continue;
- if (!ctl->PagePrecedes(shared->page_number[slotno], cutoffPage))
+ if (!ctl->options.PagePrecedes(shared->page_number[slotno], cutoffPage))
continue;
/*
@@ -1533,16 +1550,16 @@ restart:
* they either can't yet contain anything, or have already been cleaned out.
*/
static void
-SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
+SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno)
{
char path[MAXPGPATH];
/* Forget any fsync requests queued for this segment. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true);
}
@@ -1556,7 +1573,7 @@ SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
* Delete an individual SLRU segment, identified by the segment number.
*/
void
-SlruDeleteSegment(SlruCtl ctl, int64 segno)
+SlruDeleteSegment(SlruDesc *ctl, int64 segno)
{
SlruShared shared = ctl->shared;
int prevbank = SlotGetBankNumber(0);
@@ -1633,19 +1650,19 @@ restart:
* first>=cutoff && last>=cutoff: no; every page of this segment is too young
*/
static bool
-SlruMayDeleteSegment(SlruCtl ctl, int64 segpage, int64 cutoffPage)
+SlruMayDeleteSegment(SlruDesc *ctl, int64 segpage, int64 cutoffPage)
{
int64 seg_last_page = segpage + SLRU_PAGES_PER_SEGMENT - 1;
Assert(segpage % SLRU_PAGES_PER_SEGMENT == 0);
- return (ctl->PagePrecedes(segpage, cutoffPage) &&
- ctl->PagePrecedes(seg_last_page, cutoffPage));
+ return (ctl->options.PagePrecedes(segpage, cutoffPage) &&
+ ctl->options.PagePrecedes(seg_last_page, cutoffPage));
}
#ifdef USE_ASSERT_CHECKING
static void
-SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
+SlruPagePrecedesTestOffset(SlruDesc *ctl, int per_page, uint32 offset)
{
TransactionId lhs,
rhs;
@@ -1654,6 +1671,9 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
TransactionId newestXact,
oldestXact;
+ /* This must be called after the Slru has been initialized */
+ Assert(ctl->options.PagePrecedes);
+
/*
* Compare an XID pair having undefined order (see RFC 1982), a pair at
* "opposite ends" of the XID space. TransactionIdPrecedes() treats each
@@ -1670,19 +1690,19 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
Assert(!TransactionIdPrecedes(rhs, lhs + 1));
Assert(!TransactionIdFollowsOrEquals(lhs, rhs));
Assert(!TransactionIdFollowsOrEquals(rhs, lhs));
- Assert(!ctl->PagePrecedes(lhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes(lhs / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
|| (1U << 31) % per_page != 0); /* See CommitTsPagePrecedes() */
- Assert(ctl->PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
+ Assert(ctl->options.PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
|| (1U << 31) % per_page != 0);
- Assert(ctl->PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
/*
* GetNewTransactionId() has assigned the last XID it can safely use, and
@@ -1727,7 +1747,7 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
* do not apply to them.)
*/
void
-SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
+SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page)
{
/* Test first, middle and last entries of a page. */
SlruPagePrecedesTestOffset(ctl, per_page, 0);
@@ -1742,7 +1762,7 @@ SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
* one containing the page passed as "data".
*/
bool
-SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1758,7 +1778,7 @@ SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes segments prior to the one passed in as "data".
*/
static bool
-SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1774,7 +1794,7 @@ SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes all segments.
*/
bool
-SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
+SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
SlruInternalDeleteSegment(ctl, segpage / SLRU_PAGES_PER_SEGMENT);
@@ -1788,9 +1808,9 @@ SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
* SLRU segment.
*/
static inline bool
-SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
+SlruCorrectSegmentFilenameLength(SlruDesc *ctl, size_t len)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
return (len == 15); /* see SlruFileName() */
else
@@ -1821,7 +1841,7 @@ SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
* Note that no locking is applied.
*/
bool
-SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
+SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data)
{
bool retval = false;
DIR *cldir;
@@ -1829,8 +1849,8 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
int64 segno;
int64 segpage;
- cldir = AllocateDir(ctl->Dir);
- while ((clde = ReadDir(cldir, ctl->Dir)) != NULL)
+ cldir = AllocateDir(ctl->options.Dir);
+ while ((clde = ReadDir(cldir, ctl->options.Dir)) != NULL)
{
size_t len;
@@ -1843,7 +1863,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
segpage = segno * SLRU_PAGES_PER_SEGMENT;
elog(DEBUG2, "SlruScanDirectory invoking callback on %s/%s",
- ctl->Dir, clde->d_name);
+ ctl->options.Dir, clde->d_name);
retval = callback(ctl, clde->d_name, segpage, data);
if (retval)
break;
@@ -1861,7 +1881,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
* performs the fsync.
*/
int
-SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path)
+SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path)
{
int fd;
int save_errno;
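
Reviewer's sketch, not part of the patch: SlruFileName() keeps its two segment-name formats and only reads long_segment_names and Dir through ctl->options now. The standalone C below shows what the two formats produce. demo_segment_path() is a hypothetical helper written for this illustration; it takes the directory and flag as plain arguments rather than an SlruDesc, and casts the segment number to the unsigned type that PRIX64 expects.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build "dir/SEGNO" using either the long (15 hex digit) or the short
 * (at least 4 hex digit) segment name format.  Returns what snprintf()
 * returns, i.e. the length the full path needs.
 */
static int
demo_segment_path(char *path, size_t pathlen, const char *dir,
				  bool long_names, int64_t segno)
{
	if (long_names)
		return snprintf(path, pathlen, "%s/%015" PRIX64, dir,
						(uint64_t) segno);

	return snprintf(path, pathlen, "%s/%04X", dir, (unsigned int) segno);
}
```

With long_names set, segment 0x1A under "pg_multixact/members" gets a 15-character name ending in "1A"; with it unset, the same segment under "pg_multixact/offsets" is named "001A".
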
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index c6ce71fc703..b79e648b899 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -33,6 +33,7 @@
#include "access/transam.h"
#include "miscadmin.h"
#include "pg_trace.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/snapmgr.h"
@@ -66,16 +67,22 @@ TransactionIdToPage(TransactionId xid)
#define TransactionIdToEntry(xid) ((xid) % (TransactionId) SUBTRANS_XACTS_PER_PAGE)
+static void SUBTRANSShmemRequest(void *arg);
+static void SUBTRANSShmemInit(void *arg);
+static bool SubTransPagePrecedes(int64 page1, int64 page2);
+static int subtrans_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks SUBTRANSShmemCallbacks = {
+ .request_fn = SUBTRANSShmemRequest,
+ .init_fn = SUBTRANSShmemInit,
+};
+
/*
* Link to shared-memory data structures for SUBTRANS control
*/
-static SlruCtlData SubTransCtlData;
-
-#define SubTransCtl (&SubTransCtlData)
+static SlruDesc SubTransSlruDesc;
-
-static bool SubTransPagePrecedes(int64 page1, int64 page2);
-static int subtrans_errdetail_for_io_error(const void *opaque_data);
+#define SubTransCtl (&SubTransSlruDesc)
/*
@@ -207,17 +214,13 @@ SUBTRANSShmemBuffers(void)
return Min(Max(16, subtransaction_buffers), SLRU_MAX_ALLOWED_BUFFERS);
}
+
+
/*
- * Initialization of shared memory for SUBTRANS
+ * Register shared memory for SUBTRANS
*/
-Size
-SUBTRANSShmemSize(void)
-{
- return SimpleLruShmemSize(SUBTRANSShmemBuffers(), 0);
-}
-
-void
-SUBTRANSShmemInit(void)
+static void
+SUBTRANSShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (subtransaction_buffers == 0)
@@ -240,11 +243,25 @@ SUBTRANSShmemInit(void)
}
Assert(subtransaction_buffers != 0);
- SubTransCtl->PagePrecedes = SubTransPagePrecedes;
- SubTransCtl->errdetail_for_io_error = subtrans_errdetail_for_io_error;
- SimpleLruInit(SubTransCtl, "subtransaction", SUBTRANSShmemBuffers(), 0,
- "pg_subtrans", LWTRANCHE_SUBTRANS_BUFFER,
- LWTRANCHE_SUBTRANS_SLRU, SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(.desc = &SubTransSlruDesc,
+ .name = "subtransaction",
+ .Dir = "pg_subtrans",
+ .long_segment_names = false,
+
+ .nslots = SUBTRANSShmemBuffers(),
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SubTransPagePrecedes,
+ .errdetail_for_io_error = subtrans_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SUBTRANS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SUBTRANS_SLRU,
+ );
+}
+
+static void
+SUBTRANSShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(SubTransCtl, SUBTRANS_XACTS_PER_PAGE);
}
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index e91a62ff42a..db6a9a6561b 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -179,6 +179,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/dsa.h"
@@ -345,6 +346,15 @@ typedef struct AsyncQueueControl
static AsyncQueueControl *asyncQueueControl;
+static void AsyncShmemRequest(void *arg);
+static void AsyncShmemInit(void *arg);
+
+const ShmemCallbacks AsyncShmemCallbacks = {
+ .request_fn = AsyncShmemRequest,
+ .init_fn = AsyncShmemInit,
+};
+
+
#define QUEUE_HEAD (asyncQueueControl->head)
#define QUEUE_TAIL (asyncQueueControl->tail)
#define QUEUE_STOP_PAGE (asyncQueueControl->stopPage)
@@ -359,9 +369,13 @@ static AsyncQueueControl *asyncQueueControl;
/*
* The SLRU buffer area through which we access the notification queue
*/
-static SlruCtlData NotifyCtlData;
+static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
+static int asyncQueueErrdetailForIoError(const void *opaque_data);
+
+static SlruDesc NotifySlruDesc;
-#define NotifyCtl (&NotifyCtlData)
+
+#define NotifyCtl (&NotifySlruDesc)
#define QUEUE_PAGESIZE BLCKSZ
#define QUEUE_FULL_WARN_INTERVAL 5000 /* warn at most once every 5s */
@@ -570,9 +584,7 @@ bool Trace_notify = false;
int max_notify_queue_pages = 1048576;
/* local function prototypes */
-static int asyncQueueErrdetailForIoError(const void *opaque_data);
static inline int64 asyncQueuePageDiff(int64 p, int64 q);
-static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
static inline void GlobalChannelKeyInit(GlobalChannelKey *key, Oid dboid,
const char *channel);
static dshash_hash globalChannelTableHash(const void *key, size_t size,
@@ -780,78 +792,63 @@ initPendingListenActions(void)
}
/*
- * Report space needed for our shared memory area
+ * Register our shared memory needs
*/
-Size
-AsyncShmemSize(void)
+static void
+AsyncShmemRequest(void *arg)
{
Size size;
- /* This had better match AsyncShmemInit */
size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
size = add_size(size, offsetof(AsyncQueueControl, backend));
- size = add_size(size, SimpleLruShmemSize(notify_buffers, 0));
+ ShmemRequestStruct(.name = "Async Queue Control",
+ .size = size,
+ .ptr = (void **) &asyncQueueControl,
+ );
- return size;
-}
+ SimpleLruRequest(.desc = &NotifySlruDesc,
+ .name = "notify",
+ .Dir = "pg_notify",
-/*
- * Initialize our shared memory area
- */
-void
-AsyncShmemInit(void)
-{
- bool found;
- Size size;
+ /* long segment names are used in order to avoid wraparound */
+ .long_segment_names = true,
- /*
- * Create or attach to the AsyncQueueControl structure.
- */
- size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
- size = add_size(size, offsetof(AsyncQueueControl, backend));
+ .nslots = notify_buffers,
- asyncQueueControl = (AsyncQueueControl *)
- ShmemInitStruct("Async Queue Control", size, &found);
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = asyncQueuePagePrecedes,
+ .errdetail_for_io_error = asyncQueueErrdetailForIoError,
- if (!found)
+ .buffer_tranche_id = LWTRANCHE_NOTIFY_BUFFER,
+ .bank_tranche_id = LWTRANCHE_NOTIFY_SLRU,
+ );
+}
+
+static void
+AsyncShmemInit(void *arg)
+{
+ SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
+ SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
+ QUEUE_STOP_PAGE = 0;
+ QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
+ asyncQueueControl->lastQueueFillWarn = 0;
+ asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
+ asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
+ for (int i = 0; i < MaxBackends; i++)
{
- /* First time through, so initialize it */
- SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
- SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
- QUEUE_STOP_PAGE = 0;
- QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
- asyncQueueControl->lastQueueFillWarn = 0;
- asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
- asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
- for (int i = 0; i < MaxBackends; i++)
- {
- QUEUE_BACKEND_PID(i) = InvalidPid;
- QUEUE_BACKEND_DBOID(i) = InvalidOid;
- QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
- SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
- QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
- QUEUE_BACKEND_IS_ADVANCING(i) = false;
- }
+ QUEUE_BACKEND_PID(i) = InvalidPid;
+ QUEUE_BACKEND_DBOID(i) = InvalidOid;
+ QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
+ SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
+ QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
+ QUEUE_BACKEND_IS_ADVANCING(i) = false;
}
/*
- * Set up SLRU management of the pg_notify data. Note that long segment
- * names are used in order to avoid wraparound.
+ * During start or reboot, clean out the pg_notify directory.
*/
- NotifyCtl->PagePrecedes = asyncQueuePagePrecedes;
- NotifyCtl->errdetail_for_io_error = asyncQueueErrdetailForIoError;
- SimpleLruInit(NotifyCtl, "notify", notify_buffers, 0,
- "pg_notify", LWTRANCHE_NOTIFY_BUFFER, LWTRANCHE_NOTIFY_SLRU,
- SYNC_HANDLER_NONE, true);
-
- if (!found)
- {
- /*
- * During start or reboot, clean out the pg_notify directory.
- */
- (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
- }
+ (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 4f707158303..7a8c69de802 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -101,16 +101,11 @@ CalculateShmemSize(void)
/* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
- size = add_size(size, PredicateLockShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, CLOGShmemSize());
- size = add_size(size, CommitTsShmemSize());
- size = add_size(size, SUBTRANSShmemSize());
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, MultiXactShmemSize());
size = add_size(size, BackendStatusShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
@@ -123,7 +118,6 @@ CalculateShmemSize(void)
size = add_size(size, ApplyLauncherShmemSize());
size = add_size(size, BTreeShmemSize());
size = add_size(size, SyncScanShmemSize());
- size = add_size(size, AsyncShmemSize());
size = add_size(size, StatsShmemSize());
size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, InjectionPointShmemSize());
@@ -270,10 +264,6 @@ CreateOrAttachShmemStructs(void)
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
- CLOGShmemInit();
- CommitTsShmemInit();
- SUBTRANSShmemInit();
- MultiXactShmemInit();
BufferManagerShmemInit();
/*
@@ -281,11 +271,6 @@ CreateOrAttachShmemStructs(void)
*/
LockManagerShmemInit();
- /*
- * Set up predicate lock manager
- */
- PredicateLockShmemInit();
-
/*
* Set up process table
*/
@@ -313,7 +298,6 @@ CreateOrAttachShmemStructs(void)
*/
BTreeShmemInit();
SyncScanShmemInit();
- AsyncShmemInit();
StatsShmemInit();
WaitEventCustomShmemInit();
InjectionPointShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 29ff6065dda..bc186d6ea17 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -134,6 +134,7 @@
#include <unistd.h>
+#include "access/slru.h"
#include "common/int.h"
#include "fmgr.h"
#include "funcapi.h"
@@ -549,6 +550,9 @@ InitShmemIndexEntry(ShmemRequest *request)
case SHMEM_KIND_HASH:
shmem_hash_init(structPtr, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_init(structPtr, request->options);
+ break;
}
}
@@ -602,6 +606,9 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
case SHMEM_KIND_HASH:
shmem_hash_attach(index_entry->location, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_attach(index_entry->location, request->options);
+ break;
}
return true;
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index af03071a71f..9c389b23506 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -152,10 +152,6 @@
/*
* INTERFACE ROUTINES
*
- * housekeeping for setting up shared memory predicate lock structures
- * PredicateLockShmemInit(void)
- * PredicateLockShmemSize(void)
- *
* predicate lock reporting
* GetPredicateLockStatusData(void)
* PageIsPredicateLocked(Relation relation, BlockNumber blkno)
@@ -211,6 +207,8 @@
#include "storage/predicate_internals.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -322,9 +320,12 @@
/*
* The SLRU buffer area through which we access the old xids.
*/
-static SlruCtlData SerialSlruCtlData;
+static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
+static int serial_errdetail_for_io_error(const void *opaque_data);
-#define SerialSlruCtl (&SerialSlruCtlData)
+static SlruDesc SerialSlruDesc;
+
+#define SerialSlruCtl (&SerialSlruDesc)
#define SERIAL_PAGESIZE BLCKSZ
#define SERIAL_ENTRYSIZE sizeof(SerCommitSeqNo)
@@ -384,6 +385,17 @@ int max_predicate_locks_per_page; /* in guc_tables.c */
*/
static PredXactList PredXact;
+static void PredicateLockShmemRequest(void *arg);
+static void PredicateLockShmemInit(void *arg);
+static void PredicateLockShmemAttach(void *arg);
+
+const ShmemCallbacks PredicateLockShmemCallbacks = {
+ .request_fn = PredicateLockShmemRequest,
+ .init_fn = PredicateLockShmemInit,
+ .attach_fn = PredicateLockShmemAttach,
+};
+
+
/*
* This provides a pool of RWConflict data elements to use in conflict lists
* between transactions.
@@ -431,6 +443,8 @@ static bool MyXactDidWrite = false;
*/
static SERIALIZABLEXACT *SavedSerializableXact = InvalidSerializableXact;
+static int64 max_serializable_xacts;
+
/* local functions */
static SERIALIZABLEXACT *CreatePredXact(void);
@@ -442,13 +456,12 @@ static void SetPossibleUnsafeConflict(SERIALIZABLEXACT *roXact, SERIALIZABLEXACT
static void ReleaseRWConflict(RWConflict conflict);
static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
-static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
-static int serial_errdetail_for_io_error(const void *opaque_data);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
static uint32 predicatelock_hash(const void *key, Size keysize);
+
static void SummarizeOldestCommittedSxact(void);
static Snapshot GetSafeSnapshot(Snapshot origSnapshot);
static Snapshot GetSerializableTransactionSnapshotInt(Snapshot snapshot,
@@ -1100,71 +1113,53 @@ CheckPointPredicate(void)
/*------------------------------------------------------------------------*/
/*
- * PredicateLockShmemInit -- Initialize the predicate locking data structures.
- *
- * This is called from CreateSharedMemoryAndSemaphores(), which see for
- * more comments. In the normal postmaster case, the shared hash tables
- * are created here. Backends inherit the pointers
- * to the shared tables via fork(). In the EXEC_BACKEND case, each
- * backend re-executes this code to obtain pointers to the already existing
- * shared hash tables.
+ * PredicateLockShmemRequest -- Register the predicate locking data structures.
*/
-void
-PredicateLockShmemInit(void)
+static void
+PredicateLockShmemRequest(void *arg)
{
- HASHCTL info;
int64 max_predicate_lock_targets;
int64 max_predicate_locks;
- int64 max_serializable_xacts;
int64 max_rw_conflicts;
- Size requestSize;
- bool found;
-
-#ifndef EXEC_BACKEND
- Assert(!IsUnderPostmaster);
-#endif
/*
- * Compute size of predicate lock target hashtable. Note these
- * calculations must agree with PredicateLockShmemSize!
+ * The hash tables and structs requested below are allocated, or attached
+ * to, by the shared memory machinery. Any initialization of their contents
+ * is done later, in PredicateLockShmemInit() and
+ * PredicateLockShmemAttach().
*/
max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
/*
- * Allocate hash table for PREDICATELOCKTARGET structs. This stores
+ * Register hash table for PREDICATELOCKTARGET structs. This stores
* per-predicate-lock-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTARGETTAG);
- info.entrysize = sizeof(PREDICATELOCKTARGET);
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
-
- PredicateLockTargetHash = ShmemInitHash("PREDICATELOCKTARGET hash",
- max_predicate_lock_targets,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_PARTITION | HASH_FIXED_SIZE);
-
- /* Pre-calculate the hash and partition lock of the scratch entry */
- ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
- ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
+ ShmemRequestHash(.name = "PREDICATELOCKTARGET hash",
+ .nelems = max_predicate_lock_targets,
+ .ptr = &PredicateLockTargetHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTARGETTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCKTARGET),
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Allocate hash table for PREDICATELOCK structs. This stores per
* xact-lock-of-a-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTAG);
- info.entrysize = sizeof(PREDICATELOCK);
- info.hash = predicatelock_hash;
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
/* Assume an average of 2 xacts per target */
max_predicate_locks = max_predicate_lock_targets * 2;
- PredicateLockHash = ShmemInitHash("PREDICATELOCK hash",
- max_predicate_locks,
- &info,
- HASH_ELEM | HASH_FUNCTION |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ ShmemRequestHash(.name = "PREDICATELOCK hash",
+ .nelems = max_predicate_locks,
+ .ptr = &PredicateLockHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCK),
+ .hash_info.hash = predicatelock_hash,
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Compute size for serializable transaction hashtable. Note these
@@ -1177,29 +1172,27 @@ PredicateLockShmemInit(void)
max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
/*
- * Allocate a list to hold information on transactions participating in
+ * Register a list to hold information on transactions participating in
* predicate locking.
*/
- requestSize = add_size(PredXactListDataSize,
- (mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT))));
- PredXact = ShmemInitStruct("PredXactList",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(.name = "PredXactList",
+ .size = add_size(PredXactListDataSize,
+ (mul_size((Size) max_serializable_xacts,
+ sizeof(SERIALIZABLEXACT)))),
+ .ptr = (void **) &PredXact,
+ );
/*
- * Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
+ * Register hash table for SERIALIZABLEXID structs. This stores per-xid
* information for serializable transactions which have accessed data.
*/
- info.keysize = sizeof(SERIALIZABLEXIDTAG);
- info.entrysize = sizeof(SERIALIZABLEXID);
-
- SerializableXidHash = ShmemInitHash("SERIALIZABLEXID hash",
- max_serializable_xacts,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_FIXED_SIZE);
+ ShmemRequestHash(.name = "SERIALIZABLEXID hash",
+ .nelems = max_serializable_xacts,
+ .ptr = &SerializableXidHash,
+ .hash_info.keysize = sizeof(SERIALIZABLEXIDTAG),
+ .hash_info.entrysize = sizeof(SERIALIZABLEXID),
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_FIXED_SIZE,
+ );
/*
* Allocate space for tracking rw-conflicts in lists attached to the
@@ -1214,58 +1207,50 @@ PredicateLockShmemInit(void)
*/
max_rw_conflicts = max_serializable_xacts * 5;
- requestSize = RWConflictPoolHeaderDataSize +
- mul_size((Size) max_rw_conflicts,
- RWConflictDataSize);
+ ShmemRequestStruct(.name = "RWConflictPool",
+ .size = RWConflictPoolHeaderDataSize + mul_size((Size) max_rw_conflicts,
+ RWConflictDataSize),
+ .ptr = (void **) &RWConflictPool,
+ );
- RWConflictPool = ShmemInitStruct("RWConflictPool",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
-
- /*
- * Create or attach to the header for the list of finished serializable
- * transactions.
- */
- FinishedSerializableTransactions = (dlist_head *)
- ShmemInitStruct("FinishedSerializableTransactions",
- sizeof(dlist_head),
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(.name = "FinishedSerializableTransactions",
+ .size = sizeof(dlist_head),
+ .ptr = (void **) &FinishedSerializableTransactions,
+ );
/*
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(.desc = &SerialSlruDesc,
+ .name = "serializable",
+ .Dir = "pg_serial",
+ .long_segment_names = false,
+
+ .nslots = serializable_buffers,
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SerialPagePrecedesLogically,
+ .errdetail_for_io_error = serial_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SERIAL_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SERIAL_SLRU,
+ );
#ifdef USE_ASSERT_CHECKING
SerialPagePrecedesLogicallyUnitTests();
#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(.name = "SerialControlData",
+ .size = sizeof(SerialControlData),
+ .ptr = (void **) &serialControl,
+ );
+}
- /*
- * If we just attached to existing shared memory (EXEC_BACKEND), we're all
- * done. Otherwise, during postmaster startup proceed to initialize the
- * shared memory.
- */
- if (IsUnderPostmaster)
- {
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
- return;
- }
+static void
+PredicateLockShmemInit(void *arg)
+{
+ int64 max_rw_conflicts;
+ bool found;
/*
* Reserve a dummy entry in the hash table; we use it to make sure there's
@@ -1277,7 +1262,6 @@ PredicateLockShmemInit(void)
HASH_ENTER, &found);
Assert(!found);
- /* Initialize PredXact list */
dlist_init(&PredXact->availableList);
dlist_init(&PredXact->activeList);
PredXact->SxactGlobalXmin = InvalidTransactionId;
@@ -1319,6 +1303,9 @@ PredicateLockShmemInit(void)
dlist_init(&RWConflictPool->availableList);
RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
RWConflictPoolHeaderDataSize);
+
+ max_rw_conflicts = max_serializable_xacts * 5;
+
/* Add all elements to available list, clean. */
for (int i = 0; i < max_rw_conflicts; i++)
{
@@ -1335,57 +1322,28 @@ PredicateLockShmemInit(void)
serialControl->headXid = InvalidTransactionId;
serialControl->tailXid = InvalidTransactionId;
LWLockRelease(SerialControlLock);
-}
-
-/*
- * Estimate shared-memory space used for predicate lock table
- */
-Size
-PredicateLockShmemSize(void)
-{
- Size size = 0;
- int64 max_predicate_lock_targets;
- int64 max_predicate_locks;
- int64 max_serializable_xacts;
- int64 max_rw_conflicts;
-
- /* predicate lock target hash table */
- max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
- size = add_size(size, hash_estimate_size(max_predicate_lock_targets,
- sizeof(PREDICATELOCKTARGET)));
-
- /* predicate lock hash table */
- max_predicate_locks = max_predicate_lock_targets * 2;
- size = add_size(size, hash_estimate_size(max_predicate_locks,
- sizeof(PREDICATELOCK)));
-
- /* transaction list */
- max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
- size = add_size(size, PredXactListDataSize);
- size = add_size(size, mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT)));
- /* transaction xid table */
- size = add_size(size, hash_estimate_size(max_serializable_xacts,
- sizeof(SERIALIZABLEXID)));
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- /* rw-conflict pool */
- max_rw_conflicts = max_serializable_xacts * 5;
- size = add_size(size, RWConflictPoolHeaderDataSize);
- size = add_size(size, mul_size((Size) max_rw_conflicts,
- RWConflictDataSize));
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
- /* Head for list of finished serializable transactions. */
- size = add_size(size, sizeof(dlist_head));
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
+}
- /* Shared memory structures for SLRU tracking of old committed xids. */
- size = add_size(size, sizeof(SerialControlData));
- size = add_size(size, SimpleLruShmemSize(serializable_buffers, 0));
+static void
+PredicateLockShmemAttach(void *arg)
+{
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- return size;
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
}
-
/*
* Compute the hash code associated with a PREDICATELOCKTAG.
*
diff --git a/src/backend/utils/activity/pgstat_slru.c b/src/backend/utils/activity/pgstat_slru.c
index 2190f388eae..f4dfe8697d7 100644
--- a/src/backend/utils/activity/pgstat_slru.c
+++ b/src/backend/utils/activity/pgstat_slru.c
@@ -119,6 +119,7 @@ pgstat_get_slru_index(const char *name)
{
int i;
+ Assert(name);
for (i = 0; i < SLRU_NUM_ELEMENTS; i++)
{
if (strcmp(slru_names[i], name) == 0)
diff --git a/src/include/access/clog.h b/src/include/access/clog.h
index a1cfed5f43c..7894998c763 100644
--- a/src/include/access/clog.h
+++ b/src/include/access/clog.h
@@ -40,8 +40,6 @@ extern void TransactionIdSetTreeStatus(TransactionId xid, int nsubxids,
TransactionId *subxids, XidStatus status, XLogRecPtr lsn);
extern XidStatus TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn);
-extern Size CLOGShmemSize(void);
-extern void CLOGShmemInit(void);
extern void BootStrapCLOG(void);
extern void StartupCLOG(void);
extern void TrimCLOG(void);
diff --git a/src/include/access/commit_ts.h b/src/include/access/commit_ts.h
index 49ee21cd5d2..825ccda90ed 100644
--- a/src/include/access/commit_ts.h
+++ b/src/include/access/commit_ts.h
@@ -27,8 +27,6 @@ extern bool TransactionIdGetCommitTsData(TransactionId xid,
extern TransactionId GetLatestCommitTsData(TimestampTz *ts,
ReplOriginId *nodeid);
-extern Size CommitTsShmemSize(void);
-extern void CommitTsShmemInit(void);
extern void BootStrapCommitTs(void);
extern void StartupCommitTs(void);
extern void CommitTsParameterChange(bool newvalue, bool oldvalue);
diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h
index 2ae8b571dcc..6be5299ab68 100644
--- a/src/include/access/multixact.h
+++ b/src/include/access/multixact.h
@@ -121,8 +121,6 @@ extern void AtEOXact_MultiXact(void);
extern void AtPrepare_MultiXact(void);
extern void PostPrepare_MultiXact(FullTransactionId fxid);
-extern Size MultiXactShmemSize(void);
-extern void MultiXactShmemInit(void);
extern void BootStrapMultiXact(void);
extern void StartupMultiXact(void);
extern void TrimMultiXact(void);
diff --git a/src/include/access/slru.h b/src/include/access/slru.h
index f966d0d9fe7..36a7514d7a0 100644
--- a/src/include/access/slru.h
+++ b/src/include/access/slru.h
@@ -16,6 +16,7 @@
#include "access/transam.h"
#include "access/xlogdefs.h"
#include "storage/lwlock.h"
+#include "storage/shmem.h"
#include "storage/sync.h"
/*
@@ -106,23 +107,28 @@ typedef struct SlruSharedData
typedef SlruSharedData *SlruShared;
-/*
- * SlruCtlData is an unshared structure that points to the active information
- * in shared memory.
- */
-typedef struct SlruCtlData
+typedef struct SlruDesc SlruDesc;
+
+typedef struct SlruOpts
{
- SlruShared shared;
+ ShmemStructOpts base;
- /* Number of banks in this SLRU. */
- uint16 nbanks;
+ /*
+ * name of SLRU. (This is user-visible, pick with care!)
+ */
+ const char *name;
/*
- * If true, use long segment file names. Otherwise, use short file names.
- *
- * For details about the file name format, see SlruFileName().
+ * Pointer to a backend-private handle for the SLRU. It is filled in when
+ * the SLRU is initialized or attached to.
*/
- bool long_segment_names;
+ SlruDesc *desc;
+
+ /* number of page slots to use. */
+ int nslots;
+
+ /* number of LSN groups per page (set to zero if not relevant). */
+ int nlsns;
/*
* Which sync handler function to use when handing sync requests over to
@@ -130,6 +136,19 @@ typedef struct SlruCtlData
*/
SyncRequestHandler sync_handler;
+ /*
+ * PGDATA-relative subdirectory that will contain the files.
+ */
+ const char *Dir;
+
+ /*
+ * If true, use long segment file names. Otherwise, use short file names.
+ *
+ * For details about the file name format, see SlruFileName().
+ */
+ bool long_segment_names;
+
+
/*
* Decide whether a page is "older" for truncation and as a hint for
* evicting pages in LRU order. Return true if every entry of the first
@@ -153,13 +172,26 @@ typedef struct SlruCtlData
int (*errdetail_for_io_error) (const void *opaque_data);
/*
- * Dir is set during SimpleLruInit and does not change thereafter. Since
- * it's always the same, it doesn't need to be in shared memory.
+ * Tranche IDs to use for the SLRU's per-buffer and per-bank LWLocks. If
+ * these are left as zeros, new tranches will be assigned dynamically.
*/
- char Dir[64];
-} SlruCtlData;
+ int buffer_tranche_id;
+ int bank_tranche_id;
+} SlruOpts;
-typedef SlruCtlData *SlruCtl;
+/*
+ * SlruDesc is an unshared structure that points to the active information
+ * in shared memory.
+ */
+typedef struct SlruDesc
+{
+ SlruOpts options;
+
+ SlruShared shared;
+
+ /* Number of banks in this SLRU. */
+ uint16 nbanks;
+} SlruDesc;
/*
* Get the SLRU bank lock for given SlruCtl and the pageno.
@@ -168,48 +200,52 @@ typedef SlruCtlData *SlruCtl;
* respective bank.
*/
static inline LWLock *
-SimpleLruGetBankLock(SlruCtl ctl, int64 pageno)
+SimpleLruGetBankLock(SlruDesc *ctl, int64 pageno)
{
int bankno;
+ Assert(ctl->nbanks != 0);
bankno = pageno % ctl->nbanks;
return &(ctl->shared->bank_locks[bankno].lock);
}
-extern Size SimpleLruShmemSize(int nslots, int nlsns);
+extern void SimpleLruRequestWithOpts(const SlruOpts *options);
+
+#define SimpleLruRequest(...) \
+ SimpleLruRequestWithOpts(&(SlruOpts){__VA_ARGS__})
+
extern int SimpleLruAutotuneBuffers(int divisor, int max);
-extern void SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id,
- int bank_tranche_id, SyncRequestHandler sync_handler,
- bool long_segment_names);
-extern int SimpleLruZeroPage(SlruCtl ctl, int64 pageno);
-extern void SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno);
-extern int SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+extern int SimpleLruZeroPage(SlruDesc *ctl, int64 pageno);
+extern void SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno);
+extern int SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data);
-extern int SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno,
+extern int SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-extern void SimpleLruWritePage(SlruCtl ctl, int slotno);
-extern void SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied);
+extern void SimpleLruWritePage(SlruDesc *ctl, int slotno);
+extern void SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied);
#ifdef USE_ASSERT_CHECKING
-extern void SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page);
+extern void SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page);
#else
#define SlruPagePrecedesUnitTests(ctl, per_page) do {} while (0)
#endif
-extern void SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage);
-extern bool SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno);
+extern void SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage);
+extern bool SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno);
-typedef bool (*SlruScanCallback) (SlruCtl ctl, char *filename, int64 segpage,
+typedef bool (*SlruScanCallback) (SlruDesc *ctl, char *filename, int64 segpage,
void *data);
-extern bool SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data);
-extern void SlruDeleteSegment(SlruCtl ctl, int64 segno);
+extern bool SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data);
+extern void SlruDeleteSegment(SlruDesc *ctl, int64 segno);
-extern int SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path);
+extern int SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path);
/* SlruScanDirectory public callbacks */
-extern bool SlruScanDirCbReportPresence(SlruCtl ctl, char *filename,
+extern bool SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-extern bool SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage,
+extern bool SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage,
void *data);
extern bool check_slru_buffers(const char *name, int *newval);
+extern void shmem_slru_init(void *location, ShmemStructOpts *options);
+extern void shmem_slru_attach(void *location, ShmemStructOpts *options);
+
#endif /* SLRU_H */
diff --git a/src/include/access/subtrans.h b/src/include/access/subtrans.h
index 11b7355dbdf..d986cd9e802 100644
--- a/src/include/access/subtrans.h
+++ b/src/include/access/subtrans.h
@@ -15,8 +15,6 @@ extern void SubTransSetParent(TransactionId xid, TransactionId parent);
extern TransactionId SubTransGetParent(TransactionId xid);
extern TransactionId SubTransGetTopmostTransaction(TransactionId xid);
-extern Size SUBTRANSShmemSize(void);
-extern void SUBTRANSShmemInit(void);
extern void BootStrapSUBTRANS(void);
extern void StartupSUBTRANS(TransactionId oldestActiveXID);
extern void CheckPointSUBTRANS(void);
diff --git a/src/include/commands/async.h b/src/include/commands/async.h
index 3baae7cb8dc..202e4aa5e74 100644
--- a/src/include/commands/async.h
+++ b/src/include/commands/async.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT bool Trace_notify;
extern PGDLLIMPORT int max_notify_queue_pages;
extern PGDLLIMPORT volatile sig_atomic_t notifyInterruptPending;
-extern Size AsyncShmemSize(void);
-extern void AsyncShmemInit(void);
-
extern void NotifyMyFrontEnd(const char *channel,
const char *payload,
int32 srcPid);
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index a5ac55b8f7e..443bffb58fd 100644
--- a/src/include/storage/predicate.h
+++ b/src/include/storage/predicate.h
@@ -41,11 +41,6 @@ typedef void *SerializableXactHandle;
/*
* function prototypes
*/
-
-/* housekeeping for shared memory predicate lock structures */
-extern void PredicateLockShmemInit(void);
-extern Size PredicateLockShmemSize(void);
-
extern void CheckPointPredicate(void);
/* predicate lock reporting */
diff --git a/src/include/storage/shmem_internal.h b/src/include/storage/shmem_internal.h
index fe12bf33439..7b259d33ccf 100644
--- a/src/include/storage/shmem_internal.h
+++ b/src/include/storage/shmem_internal.h
@@ -21,6 +21,7 @@ typedef enum
{
SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
SHMEM_KIND_HASH, /* a hash table */
+ SHMEM_KIND_SLRU, /* SLRU buffers and control structures */
} ShmemRequestKind;
/* shmem.c */
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index d62c29f1361..c199f18a27a 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -32,6 +32,13 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+
+/* predicate lock manager */
+PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
@@ -43,3 +50,6 @@ PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
/* interprocess signaling mechanisms */
PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+
+/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
diff --git a/src/test/modules/test_slru/test_slru.c b/src/test/modules/test_slru/test_slru.c
index e4bd2af0bf5..40efffdbf62 100644
--- a/src/test/modules/test_slru/test_slru.c
+++ b/src/test/modules/test_slru/test_slru.c
@@ -40,14 +40,22 @@ PG_FUNCTION_INFO_V1(test_slru_delete_all);
/* Number of SLRU page slots */
#define NUM_TEST_BUFFERS 16
-static SlruCtlData TestSlruCtlData;
-#define TestSlruCtl (&TestSlruCtlData)
+static void test_slru_shmem_request(void *arg);
+static bool test_slru_page_precedes_logically(int64 page1, int64 page2);
+static int test_slru_errdetail_for_io_error(const void *opaque_data);
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static const char *TestSlruDir = "pg_test_slru";
+
+static SlruDesc TestSlruDesc;
+
+static const ShmemCallbacks test_slru_shmem_callbacks = {
+ .request_fn = test_slru_shmem_request
+};
+
+#define TestSlruCtl (&TestSlruDesc)
static bool
-test_slru_scan_cb(SlruCtl ctl, char *filename, int64 segpage, void *data)
+test_slru_scan_cb(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
elog(NOTICE, "Calling test_slru_scan_cb()");
return SlruScanDirCbDeleteAll(ctl, filename, segpage, data);
@@ -190,20 +198,6 @@ test_slru_delete_all(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
-/*
- * Module load callbacks and initialization.
- */
-
-static void
-test_slru_shmem_request(void)
-{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- /* reserve shared memory for the test SLRU */
- RequestAddinShmemSpace(SimpleLruShmemSize(NUM_TEST_BUFFERS, 0));
-}
-
static bool
test_slru_page_precedes_logically(int64 page1, int64 page2)
{
@@ -218,60 +212,46 @@ test_slru_errdetail_for_io_error(const void *opaque_data)
return errdetail("Could not access test_slru entry %u.", xid);
}
-static void
-test_slru_shmem_startup(void)
+void
+_PG_init(void)
{
- /*
- * Short segments names are well tested elsewhere so in this test we are
- * focusing on long names.
- */
- const bool long_segment_names = true;
- const char slru_dir_name[] = "pg_test_slru";
- int test_tranche_id = -1;
- int test_buffer_tranche_id = -1;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
+ if (!process_shared_preload_libraries_in_progress)
+ ereport(ERROR,
+ (errmsg("cannot load \"%s\" after startup", "test_slru"),
+ errdetail("\"%s\" must be loaded with \"shared_preload_libraries\".",
+ "test_slru")));
/*
* Create the SLRU directory if it does not exist yet, from the root of
* the data directory.
*/
- (void) MakePGDirectory(slru_dir_name);
+ (void) MakePGDirectory(TestSlruDir);
- /*
- * Initialize the SLRU facility. In EXEC_BACKEND builds, the
- * shmem_startup_hook is called in the postmaster and in each backend, but
- * we only need to generate the LWLock tranches once. Note that these
- * tranche ID variables are not used by SimpleLruInit() when
- * IsUnderPostmaster is true.
- */
- if (!IsUnderPostmaster)
- {
- test_tranche_id = LWLockNewTrancheId("test_slru_tranche");
- test_buffer_tranche_id = LWLockNewTrancheId("test_buffer_tranche");
- }
-
- TestSlruCtl->PagePrecedes = test_slru_page_precedes_logically;
- TestSlruCtl->errdetail_for_io_error = test_slru_errdetail_for_io_error;
- SimpleLruInit(TestSlruCtl, "TestSLRU",
- NUM_TEST_BUFFERS, 0, slru_dir_name,
- test_buffer_tranche_id, test_tranche_id, SYNC_HANDLER_NONE,
- long_segment_names);
+ RegisterShmemCallbacks(&test_slru_shmem_callbacks);
}
-void
-_PG_init(void)
+static void
+test_slru_shmem_request(void *arg)
{
- if (!process_shared_preload_libraries_in_progress)
- ereport(ERROR,
- (errmsg("cannot load \"%s\" after startup", "test_slru"),
- errdetail("\"%s\" must be loaded with \"shared_preload_libraries\".",
- "test_slru")));
+ SimpleLruRequest(.desc = &TestSlruDesc,
+ .name = "TestSLRU",
+ .Dir = TestSlruDir,
+
+ /*
+ * Short segment names are well tested elsewhere so in this test we are
+ * focusing on long names.
+ */
+ .long_segment_names = true,
+
+ .nslots = NUM_TEST_BUFFERS,
+ .nlsns = 0,
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_slru_shmem_request;
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = test_slru_page_precedes_logically,
+ .errdetail_for_io_error = test_slru_errdetail_for_io_error,
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_slru_shmem_startup;
+ /* let slru.c assign these */
+ .buffer_tranche_id = 0,
+ .bank_tranche_id = 0,
+ );
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 63c0b3a9465..3c35097361d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2901,9 +2901,9 @@ SlotInvalidationCauseMap
SlotNumber
SlotSyncCtxStruct
SlotSyncSkipReason
-SlruCtl
-SlruCtlData
+SlruDesc
SlruErrorCause
+SlruOpts
SlruPageStatus
SlruScanCallback
SlruSegState
--
2.34.1
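
A note on the calling convention used in test_slru_shmem_request() above: SimpleLruRequest(.desc = ..., .name = ..., ) reads like a call with named arguments. One way to get that in C99 is a variadic macro that wraps its arguments in a compound literal. The sketch below only illustrates that technique. SketchSlruOpts, SketchSlruRequest and sketch_request_with_opts are made-up names, and the real macro definition is not part of the hunks shown here, so treat the shape as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical options struct; the field names loosely mirror the patch. */
typedef struct SketchSlruOpts
{
	const char *name;
	const char *dir;
	int			nslots;
	int			nlsns;
	bool		long_segment_names;
	int			buffer_tranche_id;	/* 0 means "let the callee assign one" */
} SketchSlruOpts;

#define SKETCH_MAX_REQUESTS 8

static SketchSlruOpts sketch_requests[SKETCH_MAX_REQUESTS];
static int	sketch_nrequests = 0;

/* Record a request; a real implementation would size and allocate later. */
static void
sketch_request_with_opts(const SketchSlruOpts *opts)
{
	assert(sketch_nrequests < SKETCH_MAX_REQUESTS);
	sketch_requests[sketch_nrequests++] = *opts;
}

/*
 * The variadic arguments become the initializer list of a compound literal.
 * A trailing comma in the call is harmless, because an initializer list may
 * end with a comma.
 */
#define SketchSlruRequest(...) \
	sketch_request_with_opts(&(SketchSlruOpts){__VA_ARGS__})

static void
sketch_register_test_slru(void)
{
	SketchSlruRequest(.name = "TestSLRU",
					  .dir = "pg_test_slru",
					  .long_segment_names = true,
					  .nslots = 16,
		);
}
```

Fields that are not named in the call are zero-initialized, which is why a convention like "0 means let the callee assign it" works without the caller having to spell out every field.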
[text/x-patch] v20260405-0011-Convert-AIO-to-the-new-interface.patch (14.6K, 12-v20260405-0011-Convert-AIO-to-the-new-interface.patch)
From a621b2a0acb2397294223d2a96ea52337d884832 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 12:43:16 +0200
Subject: [PATCH v20260405 11/15] Convert AIO to the new interface
This replaces the "shmem_size" and "shmem_init" callbacks in the IO
methods table with the same ShmemCallbacks struct that we now use in
other subsystems.
---
src/backend/storage/aio/aio_init.c | 112 +++++++++++++---------
src/backend/storage/aio/method_io_uring.c | 39 ++++----
src/backend/storage/aio/method_worker.c | 84 +++++++++-------
src/backend/storage/ipc/ipci.c | 2 -
src/include/storage/aio_internal.h | 16 +---
src/include/storage/aio_subsys.h | 4 -
src/include/storage/subsystemlist.h | 3 +
7 files changed, 143 insertions(+), 117 deletions(-)
diff --git a/src/backend/storage/aio/aio_init.c b/src/backend/storage/aio/aio_init.c
index d3c68d8b04c..18bb4235044 100644
--- a/src/backend/storage/aio/aio_init.c
+++ b/src/backend/storage/aio/aio_init.c
@@ -23,16 +23,24 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
+static void AioShmemRequest(void *arg);
+static void AioShmemInit(void *arg);
+static void AioShmemAttach(void *arg);
-static Size
-AioCtlShmemSize(void)
-{
- /* pgaio_ctl itself */
- return sizeof(PgAioCtl);
-}
+const ShmemCallbacks AioShmemCallbacks = {
+ .request_fn = AioShmemRequest,
+ .init_fn = AioShmemInit,
+ .attach_fn = AioShmemAttach,
+};
+
+static PgAioBackend *AioBackendShmemPtr;
+static PgAioHandle *AioHandleShmemPtr;
+static struct iovec *AioHandleIOVShmemPtr;
+static uint64 *AioHandleDataShmemPtr;
static uint32
AioProcs(void)
@@ -109,12 +117,15 @@ AioChooseMaxConcurrency(void)
return Min(max_proportional_pins, 64);
}
-Size
-AioShmemSize(void)
+/*
+ * Register shared memory area for AIO subsystem.
+ */
+static void
+AioShmemRequest(void *arg)
{
- Size sz = 0;
-
/*
+ * Resolve io_max_concurrency if not already done
+ *
* We prefer to report this value's source as PGC_S_DYNAMIC_DEFAULT.
* However, if the DBA explicitly set io_max_concurrency = -1 in the
* config file, then PGC_S_DYNAMIC_DEFAULT will fail to override that and
@@ -132,48 +143,52 @@ AioShmemSize(void)
PGC_S_OVERRIDE);
}
- sz = add_size(sz, AioCtlShmemSize());
- sz = add_size(sz, AioBackendShmemSize());
- sz = add_size(sz, AioHandleShmemSize());
- sz = add_size(sz, AioHandleIOVShmemSize());
- sz = add_size(sz, AioHandleDataShmemSize());
-
- /* Reserve space for method specific resources. */
- if (pgaio_method_ops->shmem_size)
- sz = add_size(sz, pgaio_method_ops->shmem_size());
-
- return sz;
+ ShmemRequestStruct(.name = "AioCtl",
+ .size = sizeof(PgAioCtl),
+ .ptr = (void **) &pgaio_ctl,
+ );
+
+ ShmemRequestStruct(.name = "AioBackend",
+ .size = AioBackendShmemSize(),
+ .ptr = (void **) &AioBackendShmemPtr,
+ );
+
+ ShmemRequestStruct(.name = "AioHandle",
+ .size = AioHandleShmemSize(),
+ .ptr = (void **) &AioHandleShmemPtr,
+ );
+
+ ShmemRequestStruct(.name = "AioHandleIOV",
+ .size = AioHandleIOVShmemSize(),
+ .ptr = (void **) &AioHandleIOVShmemPtr,
+ );
+
+ ShmemRequestStruct(.name = "AioHandleData",
+ .size = AioHandleDataShmemSize(),
+ .ptr = (void **) &AioHandleDataShmemPtr,
+ );
+
+ if (pgaio_method_ops->shmem_callbacks.request_fn)
+ pgaio_method_ops->shmem_callbacks.request_fn(pgaio_method_ops->shmem_callbacks.request_fn_arg);
}
-void
-AioShmemInit(void)
+/*
+ * Initialize AIO shared memory during postmaster startup.
+ */
+static void
+AioShmemInit(void *arg)
{
- bool found;
uint32 io_handle_off = 0;
uint32 iovec_off = 0;
uint32 per_backend_iovecs = io_max_concurrency * io_max_combine_limit;
- pgaio_ctl = (PgAioCtl *)
- ShmemInitStruct("AioCtl", AioCtlShmemSize(), &found);
-
- if (found)
- goto out;
-
- memset(pgaio_ctl, 0, AioCtlShmemSize());
-
pgaio_ctl->io_handle_count = AioProcs() * io_max_concurrency;
pgaio_ctl->iovec_count = AioProcs() * per_backend_iovecs;
- pgaio_ctl->backend_state = (PgAioBackend *)
- ShmemInitStruct("AioBackend", AioBackendShmemSize(), &found);
-
- pgaio_ctl->io_handles = (PgAioHandle *)
- ShmemInitStruct("AioHandle", AioHandleShmemSize(), &found);
-
- pgaio_ctl->iovecs = (struct iovec *)
- ShmemInitStruct("AioHandleIOV", AioHandleIOVShmemSize(), &found);
- pgaio_ctl->handle_data = (uint64 *)
- ShmemInitStruct("AioHandleData", AioHandleDataShmemSize(), &found);
+ pgaio_ctl->backend_state = AioBackendShmemPtr;
+ pgaio_ctl->io_handles = AioHandleShmemPtr;
+ pgaio_ctl->iovecs = AioHandleIOVShmemPtr;
+ pgaio_ctl->handle_data = AioHandleDataShmemPtr;
for (int procno = 0; procno < AioProcs(); procno++)
{
@@ -208,10 +223,15 @@ AioShmemInit(void)
}
}
-out:
- /* Initialize IO method specific resources. */
- if (pgaio_method_ops->shmem_init)
- pgaio_method_ops->shmem_init(!found);
+ if (pgaio_method_ops->shmem_callbacks.init_fn)
+ pgaio_method_ops->shmem_callbacks.init_fn(pgaio_method_ops->shmem_callbacks.init_fn_arg);
+}
+
+static void
+AioShmemAttach(void *arg)
+{
+ if (pgaio_method_ops->shmem_callbacks.attach_fn)
+ pgaio_method_ops->shmem_callbacks.attach_fn(pgaio_method_ops->shmem_callbacks.attach_fn_arg);
}
void
diff --git a/src/backend/storage/aio/method_io_uring.c b/src/backend/storage/aio/method_io_uring.c
index 9f76d2683c0..3295c59ed75 100644
--- a/src/backend/storage/aio/method_io_uring.c
+++ b/src/backend/storage/aio/method_io_uring.c
@@ -49,8 +49,8 @@
/* Entry points for IoMethodOps. */
-static size_t pgaio_uring_shmem_size(void);
-static void pgaio_uring_shmem_init(bool first_time);
+static void pgaio_uring_shmem_request(void *arg);
+static void pgaio_uring_shmem_init(void *arg);
static void pgaio_uring_init_backend(void);
static int pgaio_uring_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
static void pgaio_uring_wait_one(PgAioHandle *ioh, uint64 ref_generation);
@@ -59,7 +59,6 @@ static void pgaio_uring_check_one(PgAioHandle *ioh, uint64 ref_generation);
/* helper functions */
static void pgaio_uring_sq_from_io(PgAioHandle *ioh, struct io_uring_sqe *sqe);
-
const IoMethodOps pgaio_uring_ops = {
/*
* While io_uring mostly is OK with FDs getting closed while the IO is in
@@ -70,8 +69,8 @@ const IoMethodOps pgaio_uring_ops = {
*/
.wait_on_fd_before_close = true,
- .shmem_size = pgaio_uring_shmem_size,
- .shmem_init = pgaio_uring_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_uring_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_uring_shmem_init,
.init_backend = pgaio_uring_init_backend,
.submit = pgaio_uring_submit,
@@ -267,23 +266,31 @@ pgaio_uring_shmem_size(void)
{
size_t sz;
+ sz = pgaio_uring_context_shmem_size();
+ sz = add_size(sz, pgaio_uring_ring_shmem_size());
+
+ return sz;
+}
+
+static void
+pgaio_uring_shmem_request(void *arg)
+{
/*
* Kernel and liburing support for various features influences how much
* shmem we need, perform the necessary checks.
*/
pgaio_uring_check_capabilities();
- sz = pgaio_uring_context_shmem_size();
- sz = add_size(sz, pgaio_uring_ring_shmem_size());
-
- return sz;
+ ShmemRequestStruct(.name = "AioUringContext",
+ .size = pgaio_uring_shmem_size(),
+ .ptr = (void **) &pgaio_uring_contexts,
+ );
}
static void
-pgaio_uring_shmem_init(bool first_time)
+pgaio_uring_shmem_init(void *arg)
{
int TotalProcs = pgaio_uring_procs();
- bool found;
char *shmem;
size_t ring_mem_remain = 0;
char *ring_mem_next = 0;
@@ -291,13 +298,11 @@ pgaio_uring_shmem_init(bool first_time)
/*
* We allocate memory for all PgAioUringContext instances and, if
* supported, the memory required for each of the io_uring instances, in
- * one ShmemInitStruct().
+ * one combined allocation.
+ *
+ * pgaio_uring_contexts is already set to the base of the allocation.
*/
- shmem = ShmemInitStruct("AioUringContext", pgaio_uring_shmem_size(), &found);
- if (found)
- return;
-
- pgaio_uring_contexts = (PgAioUringContext *) shmem;
+ shmem = (char *) pgaio_uring_contexts;
shmem += pgaio_uring_context_shmem_size();
/* if supported, handle memory alignment / sizing for io_uring memory */
diff --git a/src/backend/storage/aio/method_worker.c b/src/backend/storage/aio/method_worker.c
index efe38e9f113..df94a434856 100644
--- a/src/backend/storage/aio/method_worker.c
+++ b/src/backend/storage/aio/method_worker.c
@@ -41,6 +41,7 @@
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
#include "tcop/tcopprot.h"
#include "utils/injection_point.h"
#include "utils/memdebug.h"
@@ -73,16 +74,20 @@ typedef struct PgAioWorkerControl
} PgAioWorkerControl;
-static size_t pgaio_worker_shmem_size(void);
-static void pgaio_worker_shmem_init(bool first_time);
+static void pgaio_worker_shmem_request(void *arg);
+static void pgaio_worker_shmem_init(void *arg);
+static void pgaio_worker_shmem_attach(void *arg);
+
+static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static bool pgaio_worker_needs_synchronous_execution(PgAioHandle *ioh);
static int pgaio_worker_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
const IoMethodOps pgaio_worker_ops = {
- .shmem_size = pgaio_worker_shmem_size,
- .shmem_init = pgaio_worker_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_worker_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_worker_shmem_init,
+ .shmem_callbacks.attach_fn = pgaio_worker_shmem_attach,
.needs_synchronous_execution = pgaio_worker_needs_synchronous_execution,
.submit = pgaio_worker_submit,
@@ -95,7 +100,6 @@ int io_workers = 3;
static int io_worker_queue_size = 64;
static int MyIoWorkerId;
-static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static PgAioWorkerControl *io_worker_control;
@@ -116,50 +120,60 @@ pgaio_worker_control_shmem_size(void)
sizeof(PgAioWorkerSlot) * MAX_IO_WORKERS;
}
-static size_t
-pgaio_worker_shmem_size(void)
+/*
+ * Set secondary AIO worker pointer from the combined allocation.
+ */
+static void
+pgaio_worker_set_secondary_ptr(void)
{
- size_t sz;
int queue_size;
+ Size queue_sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = add_size(sz, pgaio_worker_control_shmem_size());
-
- return sz;
+ io_worker_control = (PgAioWorkerControl *)
+ ((char *) io_worker_submission_queue + MAXALIGN(queue_sz));
}
static void
-pgaio_worker_shmem_init(bool first_time)
+pgaio_worker_shmem_init(void *arg)
{
- bool found;
int queue_size;
- io_worker_submission_queue =
- ShmemInitStruct("AioWorkerSubmissionQueue",
- pgaio_worker_queue_shmem_size(&queue_size),
- &found);
- if (!found)
- {
- io_worker_submission_queue->size = queue_size;
- io_worker_submission_queue->head = 0;
- io_worker_submission_queue->tail = 0;
- }
+ pgaio_worker_queue_shmem_size(&queue_size);
+ io_worker_submission_queue->size = queue_size;
+ io_worker_submission_queue->head = 0;
+ io_worker_submission_queue->tail = 0;
- io_worker_control =
- ShmemInitStruct("AioWorkerControl",
- pgaio_worker_control_shmem_size(),
- &found);
- if (!found)
+ pgaio_worker_set_secondary_ptr();
+
+ io_worker_control->idle_worker_mask = 0;
+ for (int i = 0; i < MAX_IO_WORKERS; ++i)
{
- io_worker_control->idle_worker_mask = 0;
- for (int i = 0; i < MAX_IO_WORKERS; ++i)
- {
- io_worker_control->workers[i].latch = NULL;
- io_worker_control->workers[i].in_use = false;
- }
+ io_worker_control->workers[i].latch = NULL;
+ io_worker_control->workers[i].in_use = false;
}
}
+static void
+pgaio_worker_shmem_attach(void *arg)
+{
+ pgaio_worker_set_secondary_ptr();
+}
+
+static void
+pgaio_worker_shmem_request(void *arg)
+{
+ size_t size;
+ int queue_size;
+
+ size = MAXALIGN(pgaio_worker_queue_shmem_size(&queue_size)) +
+ pgaio_worker_control_shmem_size();
+
+ ShmemRequestStruct(.name = "AioWorkerSubmissionQueue",
+ .size = size,
+ .ptr = (void **) &io_worker_submission_queue,
+ );
+}
+
static int
pgaio_worker_choose_idle(void)
{
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 7a8c69de802..a510c928daa 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -122,7 +122,6 @@ CalculateShmemSize(void)
size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, InjectionPointShmemSize());
size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, AioShmemSize());
size = add_size(size, WaitLSNShmemSize());
size = add_size(size, LogicalDecodingCtlShmemSize());
size = add_size(size, DataChecksumsShmemSize());
@@ -301,7 +300,6 @@ CreateOrAttachShmemStructs(void)
StatsShmemInit();
WaitEventCustomShmemInit();
InjectionPointShmemInit();
- AioShmemInit();
WaitLSNShmemInit();
LogicalDecodingCtlShmemInit();
}
diff --git a/src/include/storage/aio_internal.h b/src/include/storage/aio_internal.h
index 33e1e2dc048..9ca4087aa7f 100644
--- a/src/include/storage/aio_internal.h
+++ b/src/include/storage/aio_internal.h
@@ -20,6 +20,8 @@
#include "port/pg_iovec.h"
#include "storage/aio.h"
#include "storage/condition_variable.h"
+#include "storage/ipc.h"
+#include "storage/shmem.h"
/*
@@ -267,20 +269,8 @@ typedef struct IoMethodOps
*/
bool wait_on_fd_before_close;
-
/* global initialization */
-
- /*
- * Amount of additional shared memory to reserve for the io_method. Called
- * just like a normal ipci.c style *Size() function. Optional.
- */
- size_t (*shmem_size) (void);
-
- /*
- * Initialize shared memory. First time is true if AIO's shared memory was
- * just initialized, false otherwise. Optional.
- */
- void (*shmem_init) (bool first_time);
+ ShmemCallbacks shmem_callbacks;
/*
* Per-backend initialization. Optional.
diff --git a/src/include/storage/aio_subsys.h b/src/include/storage/aio_subsys.h
index 276cb3e31c4..dd54869351f 100644
--- a/src/include/storage/aio_subsys.h
+++ b/src/include/storage/aio_subsys.h
@@ -20,12 +20,8 @@
/* aio_init.c */
-extern Size AioShmemSize(void);
-extern void AioShmemInit(void);
-
extern void pgaio_init_backend(void);
-
/* aio.c */
extern void pgaio_error_cleanup(void);
extern void AtEOXact_Aio(bool is_commit);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index c199f18a27a..b438794d46d 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -53,3 +53,6 @@ PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
/* other modules that need some shared memory space */
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
+
+/* AIO subsystem. This delegates to the method-specific callbacks */
+PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
--
2.34.1
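
The method_worker.c part of the patch above requests one combined area and derives the second structure's address from the first (pgaio_worker_set_secondary_ptr), both at init time and at attach time. The layout arithmetic can be sketched on its own as below. All Sketch* names are hypothetical, and the alignment value of 8 is an assumption; PostgreSQL's MAXALIGN uses a platform-dependent value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed maximum alignment, for illustration only. */
#define SKETCH_MAXALIGN_OF 8
#define SKETCH_MAXALIGN(LEN) \
	(((size_t) (LEN) + (SKETCH_MAXALIGN_OF - 1)) & \
	 ~((size_t) (SKETCH_MAXALIGN_OF - 1)))

/* First structure: a queue header followed by a variable-length array. */
typedef struct SketchQueue
{
	uint32_t	size;
	uint32_t	head;
	uint32_t	tail;
	int32_t		ios[];
} SketchQueue;

/* Second structure, placed right after the (padded) queue. */
typedef struct SketchControl
{
	uint64_t	idle_mask;
} SketchControl;

static size_t
sketch_queue_size(uint32_t nentries)
{
	return offsetof(SketchQueue, ios) + sizeof(int32_t) * nentries;
}

/* Size to request for the single combined allocation. */
static size_t
sketch_combined_size(uint32_t nentries)
{
	return SKETCH_MAXALIGN(sketch_queue_size(nentries)) + sizeof(SketchControl);
}

/*
 * Derive the secondary pointer from the base of the allocation.  The same
 * computation serves the process that initializes the area and any process
 * that merely attaches to it, since it depends only on the base address and
 * the entry count.
 */
static SketchControl *
sketch_secondary_ptr(SketchQueue *queue, uint32_t nentries)
{
	assert(queue != NULL);
	return (SketchControl *)
		((char *) queue + SKETCH_MAXALIGN(sketch_queue_size(nentries)));
}
```

The important detail is that the size computation and the pointer computation apply the same padding. If one side rounded up and the other did not, the control structure could overlap the tail of the queue or run past the end of the area.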
[text/x-patch] v20260405-0012-Add-option-for-aligning-shmem-allocations.patch (4.0K, 13-v20260405-0012-Add-option-for-aligning-shmem-allocations.patch)
From 2b257c8c05d2f12d00c36e0d45a180c19990bae2 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 23:44:15 +0200
Subject: [PATCH v20260405 12/15] Add option for aligning shmem allocations
The buffer blocks (in the next commit) are IO-aligned. This might come
in handy in other places too, so make it an explicit feature of
ShmemRequestStruct.
---
src/backend/storage/ipc/shmem.c | 26 ++++++++++++++++----------
src/include/storage/shmem.h | 6 ++++++
2 files changed, 22 insertions(+), 10 deletions(-)
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index bc186d6ea17..973811e545e 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -239,7 +239,7 @@ typedef struct ShmemAllocatorData
#define ShmemIndexLock (&ShmemAllocator->index_lock)
-static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void *ShmemAllocRaw(Size size, Size alignment, Size *allocated_size);
/* shared memory global variables */
@@ -400,7 +400,8 @@ ShmemGetRequestedSize(void)
{
size = add_size(size, request->options->size);
/* calculate alignment padding like ShmemAllocRaw() does */
- size = CACHELINEALIGN(size);
+ size = TYPEALIGN(Max(request->options->alignment, PG_CACHE_LINE_SIZE),
+ size);
}
return size;
@@ -525,7 +526,9 @@ InitShmemIndexEntry(ShmemRequest *request)
* We inserted the entry to the shared memory index. Allocate requested
* amount of shared memory for it, and initialize the index entry.
*/
- structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ structPtr = ShmemAllocRaw(request->options->size,
+ request->options->alignment,
+ &allocated_size);
if (structPtr == NULL)
{
/* out of memory; remove the failed ShmemIndex entry */
@@ -754,7 +757,7 @@ ShmemAlloc(Size size)
void *newSpace;
Size allocated_size;
- newSpace = ShmemAllocRaw(size, &allocated_size);
+ newSpace = ShmemAllocRaw(size, 0, &allocated_size);
if (!newSpace)
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
@@ -773,7 +776,7 @@ ShmemAllocNoError(Size size)
{
Size allocated_size;
- return ShmemAllocRaw(size, &allocated_size);
+ return ShmemAllocRaw(size, 0, &allocated_size);
}
/*
@@ -783,8 +786,9 @@ ShmemAllocNoError(Size size)
* be equal to the number requested plus any padding we choose to add.
*/
static void *
-ShmemAllocRaw(Size size, Size *allocated_size)
+ShmemAllocRaw(Size size, Size alignment, Size *allocated_size)
{
+ Size rawStart;
Size newStart;
Size newFree;
void *newSpace;
@@ -800,14 +804,15 @@ ShmemAllocRaw(Size size, Size *allocated_size)
* structures out to a power-of-two size - but without this, even that
* won't be sufficient.
*/
- size = CACHELINEALIGN(size);
- *allocated_size = size;
+ if (alignment < PG_CACHE_LINE_SIZE)
+ alignment = PG_CACHE_LINE_SIZE;
Assert(ShmemSegHdr != NULL);
SpinLockAcquire(&ShmemAllocator->shmem_lock);
- newStart = ShmemAllocator->free_offset;
+ rawStart = ShmemAllocator->free_offset;
+ newStart = TYPEALIGN(alignment, rawStart);
newFree = newStart + size;
if (newFree <= ShmemSegHdr->totalsize)
@@ -821,8 +826,9 @@ ShmemAllocRaw(Size size, Size *allocated_size)
SpinLockRelease(&ShmemAllocator->shmem_lock);
/* note this assert is okay with newSpace == NULL */
- Assert(newSpace == (void *) CACHELINEALIGN(newSpace));
+ Assert(newSpace == (void *) TYPEALIGN(alignment, newSpace));
+ *allocated_size = newFree - rawStart;
return newSpace;
}
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 147a6915f7e..91218db6d6e 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -51,6 +51,12 @@ typedef struct ShmemStructOpts
*/
ssize_t size;
+ /*
+ * Alignment of the starting address. If not set, defaults to cacheline
+ * boundary. Must be a power of two.
+ */
+ size_t alignment;
+
/*
* When the shmem area is initialized or attached to, pointer to it is
* stored in *ptr. It usually points to a global variable, used to access
--
2.34.1
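
The change to ShmemAllocRaw() in the patch above moves the padding from the end of an allocation to its start: the free offset is rounded up to the requested alignment, and the reported allocated size covers everything from the old free offset to the new one. A standalone sketch of that arithmetic follows. SketchArena, sketch_alloc and the 128-byte cache line are illustrative assumptions, offsets stand in for pointers, and there is no locking here, unlike the spinlock-protected original.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; the real PG_CACHE_LINE_SIZE is platform-defined. */
#define SKETCH_CACHE_LINE_SIZE 128

/* Round LEN up to the next multiple of ALIGNVAL, which must be a power of two. */
#define SKETCH_ALIGN(ALIGNVAL, LEN) \
	(((uintptr_t) (LEN) + ((ALIGNVAL) - 1)) & ~((uintptr_t) ((ALIGNVAL) - 1)))

typedef struct SketchArena
{
	size_t		free_offset;	/* first unallocated byte */
	size_t		total_size;		/* size of the whole segment */
} SketchArena;

/*
 * Reserve "size" bytes at an offset aligned to "alignment" (0 means "use the
 * default").  Returns the aligned start offset, or (size_t) -1 when the
 * arena is exhausted.  *allocated_size receives the bytes consumed including
 * the leading padding; this sketch sets it to 0 on failure.
 */
static size_t
sketch_alloc(SketchArena *arena, size_t size, size_t alignment,
			 size_t *allocated_size)
{
	size_t		raw_start;
	size_t		new_start;
	size_t		new_free;

	/* Never align to less than a cache line. */
	if (alignment < SKETCH_CACHE_LINE_SIZE)
		alignment = SKETCH_CACHE_LINE_SIZE;
	assert((alignment & (alignment - 1)) == 0);

	raw_start = arena->free_offset;
	new_start = SKETCH_ALIGN(alignment, raw_start);
	new_free = new_start + size;

	if (new_free > arena->total_size)
	{
		*allocated_size = 0;
		return (size_t) -1;
	}

	arena->free_offset = new_free;
	*allocated_size = new_free - raw_start;
	return new_start;
}
```

With this scheme a caller that wants, say, a 4096-byte-aligned area no longer needs to over-request by the alignment and round the returned pointer itself. The size estimate, however, has to account for worst-case leading padding per request, which is what the TYPEALIGN(Max(...)) adjustment in ShmemGetRequestedSize() is for.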
[text/x-patch] v20260405-0013-Convert-buffer-manager-to-the-new-API.patch (15.6K, 14-v20260405-0013-Convert-buffer-manager-to-the-new-API.patch)
From 5709c7a77a1ab3692fefa6d4c0c0aaf2b4b7609d Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:44:02 +0300
Subject: [PATCH v20260405 13/15] Convert buffer manager to the new API
---
src/backend/storage/buffer/buf_init.c | 149 ++++++++++---------------
src/backend/storage/buffer/buf_table.c | 54 +++++----
src/backend/storage/buffer/freelist.c | 93 +++++----------
src/backend/storage/ipc/ipci.c | 3 -
src/include/storage/buf_internals.h | 5 -
src/include/storage/bufmgr.h | 4 -
src/include/storage/subsystemlist.h | 3 +
7 files changed, 124 insertions(+), 187 deletions(-)
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index c0c223b2e32..1407c930c56 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -18,6 +18,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proclist.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
BufferDescPadded *BufferDescriptors;
char *BufferBlocks;
@@ -25,6 +27,15 @@ ConditionVariableMinimallyPadded *BufferIOCVArray;
WritebackContext BackendWritebackContext;
CkptSortItem *CkptBufferIds;
+static void BufferManagerShmemRequest(void *arg);
+static void BufferManagerShmemInit(void *arg);
+static void BufferManagerShmemAttach(void *arg);
+
+const ShmemCallbacks BufferManagerShmemCallbacks = {
+ .request_fn = BufferManagerShmemRequest,
+ .init_fn = BufferManagerShmemInit,
+ .attach_fn = BufferManagerShmemAttach,
+};
/*
* Data Structures:
@@ -60,37 +71,31 @@ CkptSortItem *CkptBufferIds;
/*
- * Initialize shared buffer pool
- *
- * This is called once during shared-memory initialization (either in the
- * postmaster, or in a standalone backend).
+ * Register shared memory area for the buffer pool.
*/
-void
-BufferManagerShmemInit(void)
+static void
+BufferManagerShmemRequest(void *arg)
{
- bool foundBufs,
- foundDescs,
- foundIOCV,
- foundBufCkpt;
-
+ ShmemRequestStruct(.name = "Buffer Descriptors",
+ .size = NBuffers * sizeof(BufferDescPadded),
/* Align descriptors to a cacheline boundary. */
- BufferDescriptors = (BufferDescPadded *)
- ShmemInitStruct("Buffer Descriptors",
- NBuffers * sizeof(BufferDescPadded),
- &foundDescs);
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferDescriptors,
+ );
+ ShmemRequestStruct(.name = "Buffer Blocks",
+ .size = NBuffers * (Size) BLCKSZ,
/* Align buffer pool on IO page size boundary. */
- BufferBlocks = (char *)
- TYPEALIGN(PG_IO_ALIGN_SIZE,
- ShmemInitStruct("Buffer Blocks",
- NBuffers * (Size) BLCKSZ + PG_IO_ALIGN_SIZE,
- &foundBufs));
-
- /* Align condition variables to cacheline boundary. */
- BufferIOCVArray = (ConditionVariableMinimallyPadded *)
- ShmemInitStruct("Buffer IO Condition Variables",
- NBuffers * sizeof(ConditionVariableMinimallyPadded),
- &foundIOCV);
+ .alignment = PG_IO_ALIGN_SIZE,
+ .ptr = (void **) &BufferBlocks,
+ );
+
+ ShmemRequestStruct(.name = "Buffer IO Condition Variables",
+ .size = NBuffers * sizeof(ConditionVariableMinimallyPadded),
+ /* Align condition variables to a cacheline boundary. */
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferIOCVArray,
+ );
/*
* The array used to sort to-be-checkpointed buffer ids is located in
@@ -99,80 +104,50 @@ BufferManagerShmemInit(void)
* the checkpointer is restarted, memory allocation failures would be
* painful.
*/
- CkptBufferIds = (CkptSortItem *)
- ShmemInitStruct("Checkpoint BufferIds",
- NBuffers * sizeof(CkptSortItem), &foundBufCkpt);
+ ShmemRequestStruct(.name = "Checkpoint BufferIds",
+ .size = NBuffers * sizeof(CkptSortItem),
+ .ptr = (void **) &CkptBufferIds,
+ );
+}
- if (foundDescs || foundBufs || foundIOCV || foundBufCkpt)
- {
- /* should find all of these, or none of them */
- Assert(foundDescs && foundBufs && foundIOCV && foundBufCkpt);
- /* note: this path is only taken in EXEC_BACKEND case */
- }
- else
+/*
+ * Initialize shared buffer pool
+ *
+ * This is called once during shared-memory initialization (either in the
+ * postmaster, or in a standalone backend).
+ */
+static void
+BufferManagerShmemInit(void *arg)
+{
+ /*
+ * Initialize all the buffer headers.
+ */
+ for (int i = 0; i < NBuffers; i++)
{
- int i;
+ BufferDesc *buf = GetBufferDescriptor(i);
- /*
- * Initialize all the buffer headers.
- */
- for (i = 0; i < NBuffers; i++)
- {
- BufferDesc *buf = GetBufferDescriptor(i);
+ ClearBufferTag(&buf->tag);
- ClearBufferTag(&buf->tag);
+ pg_atomic_init_u64(&buf->state, 0);
+ buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
- pg_atomic_init_u64(&buf->state, 0);
- buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
+ buf->buf_id = i;
- buf->buf_id = i;
+ pgaio_wref_clear(&buf->io_wref);
- pgaio_wref_clear(&buf->io_wref);
-
- proclist_init(&buf->lock_waiters);
- ConditionVariableInit(BufferDescriptorGetIOCV(buf));
- }
+ proclist_init(&buf->lock_waiters);
+ ConditionVariableInit(BufferDescriptorGetIOCV(buf));
}
- /* Init other shared buffer-management stuff */
- StrategyInitialize(!foundDescs);
-
/* Initialize per-backend file flush context */
WritebackContextInit(&BackendWritebackContext,
&backend_flush_after);
}
-/*
- * BufferManagerShmemSize
- *
- * compute the size of shared memory for the buffer pool including
- * data pages, buffer descriptors, hash tables, etc.
- */
-Size
-BufferManagerShmemSize(void)
+static void
+BufferManagerShmemAttach(void *arg)
{
- Size size = 0;
-
- /* size of buffer descriptors */
- size = add_size(size, mul_size(NBuffers, sizeof(BufferDescPadded)));
- /* to allow aligning buffer descriptors */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of data pages, plus alignment padding */
- size = add_size(size, PG_IO_ALIGN_SIZE);
- size = add_size(size, mul_size(NBuffers, BLCKSZ));
-
- /* size of stuff controlled by freelist.c */
- size = add_size(size, StrategyShmemSize());
-
- /* size of I/O condition variables */
- size = add_size(size, mul_size(NBuffers,
- sizeof(ConditionVariableMinimallyPadded)));
- /* to allow aligning the above */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of checkpoint sort array in bufmgr.c */
- size = add_size(size, mul_size(NBuffers, sizeof(CkptSortItem)));
-
- return size;
+ /* Initialize per-backend file flush context */
+ WritebackContextInit(&BackendWritebackContext,
+ &backend_flush_after);
}
diff --git a/src/backend/storage/buffer/buf_table.c b/src/backend/storage/buffer/buf_table.c
index d04ef74b850..347bf267d73 100644
--- a/src/backend/storage/buffer/buf_table.c
+++ b/src/backend/storage/buffer/buf_table.c
@@ -22,6 +22,7 @@
#include "postgres.h"
#include "storage/buf_internals.h"
+#include "storage/subsystems.h"
/* entry for buffer lookup hashtable */
typedef struct
@@ -32,37 +33,42 @@ typedef struct
static HTAB *SharedBufHash;
+static void BufTableShmemRequest(void *arg);
-/*
- * Estimate space needed for mapping hashtable
- * size is the desired hash table size (possibly more than NBuffers)
- */
-Size
-BufTableShmemSize(int size)
-{
- return hash_estimate_size(size, sizeof(BufferLookupEnt));
-}
+const ShmemCallbacks BufTableShmemCallbacks = {
+ .request_fn = BufTableShmemRequest,
+ /* no special initialization needed, the hash table will start empty */
+};
/*
- * Initialize shmem hash table for mapping buffers
+ * Register shmem hash table for mapping buffers.
* size is the desired hash table size (possibly more than NBuffers)
*/
void
-InitBufTable(int size)
+BufTableShmemRequest(void *arg)
{
- HASHCTL info;
-
- /* assume no locking is needed yet */
-
- /* BufferTag maps to Buffer */
- info.keysize = sizeof(BufferTag);
- info.entrysize = sizeof(BufferLookupEnt);
- info.num_partitions = NUM_BUFFER_PARTITIONS;
-
- SharedBufHash = ShmemInitHash("Shared Buffer Lookup Table",
- size,
- &info,
- HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE);
+ int size;
+
+ /*
+ * Request the shared buffer lookup hashtable.
+ *
+ * Since we can't tolerate running out of lookup table entries, we must be
+ * sure to specify an adequate table size here. The maximum steady-state
+ * usage is of course NBuffers entries, but BufferAlloc() tries to insert
+ * a new entry before deleting the old. In principle this could be
+ * happening in each partition concurrently, so we could need as many as
+ * NBuffers + NUM_BUFFER_PARTITIONS entries.
+ */
+ size = NBuffers + NUM_BUFFER_PARTITIONS;
+
+ ShmemRequestHash(.name = "Shared Buffer Lookup Table",
+ .nelems = size,
+ .ptr = &SharedBufHash,
+ .hash_info.keysize = sizeof(BufferTag),
+ .hash_info.entrysize = sizeof(BufferLookupEnt),
+ .hash_info.num_partitions = NUM_BUFFER_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
}
/*
diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index b7687836188..fdb5bad7910 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -20,6 +20,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#define INT_ACCESS_ONCE(var) ((int)(*((volatile int *)&(var))))
@@ -56,6 +58,14 @@ typedef struct
/* Pointers to shared state */
static BufferStrategyControl *StrategyControl = NULL;
+static void StrategyCtlShmemRequest(void *arg);
+static void StrategyCtlShmemInit(void *arg);
+
+const ShmemCallbacks StrategyCtlShmemCallbacks = {
+ .request_fn = StrategyCtlShmemRequest,
+ .init_fn = StrategyCtlShmemInit,
+};
+
/*
* Private (non-shared) state for managing a ring of shared buffers to re-use.
* This is currently the only kind of BufferAccessStrategy object, but someday
@@ -369,80 +379,35 @@ StrategyNotifyBgWriter(int bgwprocno)
/*
- * StrategyShmemSize
- *
- * estimate the size of shared memory used by the freelist-related structures.
- *
- * Note: for somewhat historical reasons, the buffer lookup hashtable size
- * is also determined here.
+ * StrategyCtlShmemRequest -- request shared memory for the buffer
+ * cache replacement strategy.
*/
-Size
-StrategyShmemSize(void)
+static void
+StrategyCtlShmemRequest(void *arg)
{
- Size size = 0;
-
- /* size of lookup hash table ... see comment in StrategyInitialize */
- size = add_size(size, BufTableShmemSize(NBuffers + NUM_BUFFER_PARTITIONS));
-
- /* size of the shared replacement strategy control block */
- size = add_size(size, MAXALIGN(sizeof(BufferStrategyControl)));
-
- return size;
+ ShmemRequestStruct(.name = "Buffer Strategy Status",
+ .size = sizeof(BufferStrategyControl),
+ .ptr = (void **) &StrategyControl
+ );
}
/*
- * StrategyInitialize -- initialize the buffer cache replacement
- * strategy.
- *
- * Assumes: All of the buffers are already built into a linked list.
- * Only called by postmaster and only during initialization.
+ * StrategyCtlShmemInit -- initialize the buffer cache replacement strategy.
*/
-void
-StrategyInitialize(bool init)
+static void
+StrategyCtlShmemInit(void *arg)
{
- bool found;
+ SpinLockInit(&StrategyControl->buffer_strategy_lock);
- /*
- * Initialize the shared buffer lookup hashtable.
- *
- * Since we can't tolerate running out of lookup table entries, we must be
- * sure to specify an adequate table size here. The maximum steady-state
- * usage is of course NBuffers entries, but BufferAlloc() tries to insert
- * a new entry before deleting the old. In principle this could be
- * happening in each partition concurrently, so we could need as many as
- * NBuffers + NUM_BUFFER_PARTITIONS entries.
- */
- InitBufTable(NBuffers + NUM_BUFFER_PARTITIONS);
-
- /*
- * Get or create the shared strategy control block
- */
- StrategyControl = (BufferStrategyControl *)
- ShmemInitStruct("Buffer Strategy Status",
- sizeof(BufferStrategyControl),
- &found);
-
- if (!found)
- {
- /*
- * Only done once, usually in postmaster
- */
- Assert(init);
-
- SpinLockInit(&StrategyControl->buffer_strategy_lock);
+ /* Initialize the clock-sweep pointer */
+ pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
- /* Initialize the clock-sweep pointer */
- pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
+ /* Clear statistics */
+ StrategyControl->completePasses = 0;
+ pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
- /* Clear statistics */
- StrategyControl->completePasses = 0;
- pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
-
- /* No pending notification */
- StrategyControl->bgwprocno = -1;
- }
- else
- Assert(!init);
+ /* No pending notification */
+ StrategyControl->bgwprocno = -1;
}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index a510c928daa..f64c1d59fa3 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -39,7 +39,6 @@
#include "replication/walreceiver.h"
#include "replication/walsender.h"
#include "storage/aio_subsys.h"
-#include "storage/bufmgr.h"
#include "storage/dsm.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
@@ -99,7 +98,6 @@ CalculateShmemSize(void)
size = add_size(size, ShmemGetRequestedSize());
/* legacy subsystems */
- size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
size = add_size(size, XLOGShmemSize());
@@ -263,7 +261,6 @@ CreateOrAttachShmemStructs(void)
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
- BufferManagerShmemInit();
/*
* Set up lock manager
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index ad1b7b2216a..89615a254a3 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -587,12 +587,7 @@ extern bool StrategyRejectBuffer(BufferAccessStrategy strategy,
extern int StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc);
extern void StrategyNotifyBgWriter(int bgwprocno);
-extern Size StrategyShmemSize(void);
-extern void StrategyInitialize(bool init);
-
/* buf_table.c */
-extern Size BufTableShmemSize(int size);
-extern void InitBufTable(int size);
extern uint32 BufTableHashCode(BufferTag *tagPtr);
extern int BufTableLookup(BufferTag *tagPtr, uint32 hashcode);
extern int BufTableInsert(BufferTag *tagPtr, uint32 hashcode, int buf_id);
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index aa61a39d9e6..6837b35fc6d 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -371,10 +371,6 @@ extern void MarkDirtyAllUnpinnedBuffers(int32 *buffers_dirtied,
int32 *buffers_already_dirty,
int32 *buffers_skipped);
-/* in buf_init.c */
-extern void BufferManagerShmemInit(void);
-extern Size BufferManagerShmemSize(void);
-
/* in localbuf.c */
extern void AtProcExit_LocalBuffers(void);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index b438794d46d..d8e11756a61 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -36,6 +36,9 @@ PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BufferManagerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StrategyCtlShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BufTableShmemCallbacks)
/* predicate lock manager */
PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
--
2.34.1
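
The patch above replaces the paired "size function plus found-flag init function" idiom with three callbacks per subsystem: request, init, and attach. As a rough illustration of that split, here is a minimal self-contained sketch. Every name in it (DemoCallbacks, demo_request_struct, demo_startup, demo_attach, DemoCounter) is a hypothetical stand-in invented for this example; it is not the implementation of ShmemRequestStruct or ShmemCallbacks, and it uses malloc() in place of a real shared memory segment.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-subsystem callback table. */
typedef struct DemoCallbacks
{
	void		(*request_fn) (void *arg);	/* declare what is needed */
	void		(*init_fn) (void *arg);		/* run once, by the creator */
	void		(*attach_fn) (void *arg);	/* run by each attaching process */
} DemoCallbacks;

/* One recorded request: how much space, and which pointer to fill in. */
typedef struct DemoRequest
{
	const char *name;
	size_t		size;
	void	  **ptr;
	size_t		offset;
} DemoRequest;

#define DEMO_MAX_REQUESTS 8
static DemoRequest demo_requests[DEMO_MAX_REQUESTS];
static int	demo_nrequests = 0;
static size_t demo_total = 0;

/* Record a request; no memory is allocated at this point. */
static void
demo_request_struct(const char *name, size_t size, void **ptr)
{
	DemoRequest *req;

	if (demo_nrequests >= DEMO_MAX_REQUESTS)
		abort();
	req = &demo_requests[demo_nrequests++];
	req->name = name;
	req->size = size;
	req->ptr = ptr;
	req->offset = demo_total;
	demo_total += (size + 7) & ~(size_t) 7;	/* round up to 8 bytes */
}

/* An example subsystem, loosely shaped like a strategy control block. */
typedef struct DemoCounter
{
	int			nextVictim;
	int			bgwprocno;
} DemoCounter;

static DemoCounter *demo_counter = NULL;
static int	demo_attach_calls = 0;

static void
demo_counter_request(void *arg)
{
	(void) arg;
	demo_request_struct("Demo Counter", sizeof(DemoCounter),
						(void **) &demo_counter);
}

static void
demo_counter_init(void *arg)
{
	(void) arg;
	/* The pointer is already set; no "found" flag is needed here. */
	demo_counter->nextVictim = 0;
	demo_counter->bgwprocno = -1;
}

static void
demo_counter_attach(void *arg)
{
	(void) arg;
	demo_attach_calls++;		/* per-process setup would go here */
}

static const DemoCallbacks demo_subsystems[] = {
	{demo_counter_request, demo_counter_init, demo_counter_attach},
};

#define DEMO_NSUBSYSTEMS (sizeof(demo_subsystems) / sizeof(demo_subsystems[0]))

/* Creator side: collect requests, allocate once, set pointers, then init. */
static void *
demo_startup(void)
{
	char	   *segment;
	size_t		i;

	demo_nrequests = 0;
	demo_total = 0;
	for (i = 0; i < DEMO_NSUBSYSTEMS; i++)
		if (demo_subsystems[i].request_fn)
			demo_subsystems[i].request_fn(NULL);

	segment = malloc(demo_total ? demo_total : 1);
	if (segment == NULL)
		abort();
	memset(segment, 0x7f, demo_total);	/* garbage, so init_fn must run */

	for (i = 0; i < (size_t) demo_nrequests; i++)
		*demo_requests[i].ptr = segment + demo_requests[i].offset;

	for (i = 0; i < DEMO_NSUBSYSTEMS; i++)
		if (demo_subsystems[i].init_fn)
			demo_subsystems[i].init_fn(NULL);
	return segment;
}

/* Attaching side: only the attach callbacks run. */
static void
demo_attach(void)
{
	size_t		i;

	for (i = 0; i < DEMO_NSUBSYSTEMS; i++)
		if (demo_subsystems[i].attach_fn)
			demo_subsystems[i].attach_fn(NULL);
}
```

A caller would run demo_startup() once, after which demo_counter is valid and initialized, and then demo_attach() in each process that attaches later. The sketch is only meant to show why the init callbacks in the patch can drop their "found" and IsUnderPostmaster branches: by the time an init callback runs, sizing and pointer assignment have already happened in a separate step.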
[text/x-patch] v20260405-0014-Convert-all-remaining-subsystems-to-use-th.patch (110.5K, 15-v20260405-0014-Convert-all-remaining-subsystems-to-use-th.patch)
From bd06717034dc254e81048d679b444fb596fceb00 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 19:05:26 +0200
Subject: [PATCH v20260405 14/15] Convert all remaining subsystems to use the
new API
---
src/backend/access/common/syncscan.c | 76 ++++----
src/backend/access/nbtree/nbtutils.c | 54 +++---
src/backend/access/transam/twophase.c | 75 ++++----
src/backend/access/transam/xlog.c | 82 +++++----
src/backend/access/transam/xlogprefetcher.c | 51 +++---
src/backend/access/transam/xlogrecovery.c | 35 ++--
src/backend/access/transam/xlogwait.c | 50 ++---
src/backend/postmaster/autovacuum.c | 79 ++++----
src/backend/postmaster/bgworker.c | 105 +++++------
src/backend/postmaster/checkpointer.c | 56 +++---
src/backend/postmaster/datachecksum_state.c | 41 ++---
src/backend/postmaster/pgarch.c | 43 +++--
src/backend/postmaster/walsummarizer.c | 60 +++---
src/backend/replication/logical/launcher.c | 56 +++---
src/backend/replication/logical/logicalctl.c | 29 ++-
src/backend/replication/logical/origin.c | 59 +++---
src/backend/replication/logical/slotsync.c | 41 +++--
src/backend/replication/slot.c | 64 +++----
src/backend/replication/walreceiverfuncs.c | 51 +++---
src/backend/replication/walsender.c | 59 +++---
src/backend/storage/ipc/ipci.c | 124 +------------
src/backend/storage/lmgr/lock.c | 113 +++++-------
src/backend/utils/activity/backend_status.c | 173 +++++++-----------
src/backend/utils/activity/pgstat_shmem.c | 158 ++++++++--------
src/backend/utils/activity/wait_event.c | 83 ++++-----
src/backend/utils/misc/injection_point.c | 57 +++---
src/include/access/nbtree.h | 2 -
src/include/access/syncscan.h | 2 -
src/include/access/twophase.h | 3 -
src/include/access/xlog.h | 2 -
src/include/access/xlogprefetcher.h | 3 -
src/include/access/xlogrecovery.h | 3 -
src/include/access/xlogwait.h | 2 -
src/include/pgstat.h | 4 -
src/include/postmaster/autovacuum.h | 4 -
src/include/postmaster/bgworker_internals.h | 2 -
src/include/postmaster/bgwriter.h | 3 -
src/include/postmaster/datachecksum_state.h | 4 -
src/include/postmaster/pgarch.h | 2 -
src/include/postmaster/walsummarizer.h | 2 -
src/include/replication/logicalctl.h | 2 -
src/include/replication/logicallauncher.h | 3 -
src/include/replication/origin.h | 4 -
src/include/replication/slot.h | 4 -
src/include/replication/slotsync.h | 2 -
src/include/replication/walreceiver.h | 2 -
src/include/replication/walsender.h | 2 -
src/include/storage/lock.h | 2 -
src/include/storage/subsystemlist.h | 27 +++
src/include/utils/backend_status.h | 8 -
src/include/utils/injection_point.h | 3 -
src/include/utils/wait_event.h | 2 -
.../injection_points/injection_points.c | 59 ++----
src/test/modules/test_aio/test_aio.c | 107 +++++------
54 files changed, 933 insertions(+), 1206 deletions(-)
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index 6fcfcb0e560..0f9eb167bed 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -50,6 +50,7 @@
#include "miscadmin.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/rel.h"
@@ -111,6 +112,14 @@ typedef struct ss_scan_locations_t
#define SizeOfScanLocations(N) \
(offsetof(ss_scan_locations_t, items) + (N) * sizeof(ss_lru_item_t))
+static void SyncScanShmemRequest(void *arg);
+static void SyncScanShmemInit(void *arg);
+
+const ShmemCallbacks SyncScanShmemCallbacks = {
+ .request_fn = SyncScanShmemRequest,
+ .init_fn = SyncScanShmemInit,
+};
+
/* Pointer to struct in shared memory */
static ss_scan_locations_t *scan_locations;
@@ -120,58 +129,47 @@ static BlockNumber ss_search(RelFileLocator relfilelocator,
/*
- * SyncScanShmemSize --- report amount of shared memory space needed
+ * SyncScanShmemRequest --- register this module's shared memory
*/
-Size
-SyncScanShmemSize(void)
+static void
+SyncScanShmemRequest(void *arg)
{
- return SizeOfScanLocations(SYNC_SCAN_NELEM);
+ ShmemRequestStruct(.name = "Sync Scan Locations List",
+ .size = SizeOfScanLocations(SYNC_SCAN_NELEM),
+ .ptr = (void **) &scan_locations,
+ );
}
/*
* SyncScanShmemInit --- initialize this module's shared memory
*/
-void
-SyncScanShmemInit(void)
+static void
+SyncScanShmemInit(void *arg)
{
int i;
- bool found;
- scan_locations = (ss_scan_locations_t *)
- ShmemInitStruct("Sync Scan Locations List",
- SizeOfScanLocations(SYNC_SCAN_NELEM),
- &found);
+ scan_locations->head = &scan_locations->items[0];
+ scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
- if (!IsUnderPostmaster)
+ for (i = 0; i < SYNC_SCAN_NELEM; i++)
{
- /* Initialize shared memory area */
- Assert(!found);
-
- scan_locations->head = &scan_locations->items[0];
- scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
-
- for (i = 0; i < SYNC_SCAN_NELEM; i++)
- {
- ss_lru_item_t *item = &scan_locations->items[i];
-
- /*
- * Initialize all slots with invalid values. As scans are started,
- * these invalid entries will fall off the LRU list and get
- * replaced with real entries.
- */
- item->location.relfilelocator.spcOid = InvalidOid;
- item->location.relfilelocator.dbOid = InvalidOid;
- item->location.relfilelocator.relNumber = InvalidRelFileNumber;
- item->location.location = InvalidBlockNumber;
-
- item->prev = (i > 0) ?
- (&scan_locations->items[i - 1]) : NULL;
- item->next = (i < SYNC_SCAN_NELEM - 1) ?
- (&scan_locations->items[i + 1]) : NULL;
- }
+ ss_lru_item_t *item = &scan_locations->items[i];
+
+ /*
+ * Initialize all slots with invalid values. As scans are started,
+ * these invalid entries will fall off the LRU list and get replaced
+ * with real entries.
+ */
+ item->location.relfilelocator.spcOid = InvalidOid;
+ item->location.relfilelocator.dbOid = InvalidOid;
+ item->location.relfilelocator.relNumber = InvalidRelFileNumber;
+ item->location.location = InvalidBlockNumber;
+
+ item->prev = (i > 0) ?
+ (&scan_locations->items[i - 1]) : NULL;
+ item->next = (i < SYNC_SCAN_NELEM - 1) ?
+ (&scan_locations->items[i + 1]) : NULL;
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c
index 732bc750c9e..014faa1622f 100644
--- a/src/backend/access/nbtree/nbtutils.c
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -25,6 +25,7 @@
#include "lib/qunique.h"
#include "miscadmin.h"
#include "storage/lwlock.h"
+#include "storage/subsystems.h"
#include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -417,6 +418,13 @@ typedef struct BTVacInfo
static BTVacInfo *btvacinfo;
+static void BTreeShmemRequest(void *arg);
+static void BTreeShmemInit(void *arg);
+
+const ShmemCallbacks BTreeShmemCallbacks = {
+ .request_fn = BTreeShmemRequest,
+ .init_fn = BTreeShmemInit,
+};
/*
* _bt_vacuum_cycleid --- get the active vacuum cycle ID for an index,
@@ -553,47 +561,37 @@ _bt_end_vacuum_callback(int code, Datum arg)
}
/*
- * BTreeShmemSize --- report amount of shared memory space needed
+ * BTreeShmemRequest --- register this module's shared memory
*/
-Size
-BTreeShmemSize(void)
+static void
+BTreeShmemRequest(void *arg)
{
Size size;
size = offsetof(BTVacInfo, vacuums);
size = add_size(size, mul_size(MaxBackends, sizeof(BTOneVacInfo)));
- return size;
+
+ ShmemRequestStruct(.name = "BTree Vacuum State",
+ .size = size,
+ .ptr = (void **) &btvacinfo,
+ );
}
/*
* BTreeShmemInit --- initialize this module's shared memory
*/
-void
-BTreeShmemInit(void)
+static void
+BTreeShmemInit(void *arg)
{
- bool found;
-
- btvacinfo = (BTVacInfo *) ShmemInitStruct("BTree Vacuum State",
- BTreeShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- /* Initialize shared memory area */
- Assert(!found);
-
- /*
- * It doesn't really matter what the cycle counter starts at, but
- * having it always start the same doesn't seem good. Seed with
- * low-order bits of time() instead.
- */
- btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
+ /*
+ * It doesn't really matter what the cycle counter starts at, but having
+ * it always start the same doesn't seem good. Seed with low-order bits
+ * of time() instead.
+ */
+ btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
- btvacinfo->num_vacuums = 0;
- btvacinfo->max_vacuums = MaxBackends;
- }
- else
- Assert(found);
+ btvacinfo->num_vacuums = 0;
+ btvacinfo->max_vacuums = MaxBackends;
}
bytea *
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index ab1cbd67bac..836928180a9 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -102,6 +102,7 @@
#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
#include "utils/memutils.h"
@@ -187,8 +188,16 @@ typedef struct TwoPhaseStateData
GlobalTransaction prepXacts[FLEXIBLE_ARRAY_MEMBER];
} TwoPhaseStateData;
+static void TwoPhaseShmemRequest(void *arg);
+static void TwoPhaseShmemInit(void *arg);
+
static TwoPhaseStateData *TwoPhaseState;
+const ShmemCallbacks TwoPhaseShmemCallbacks = {
+ .request_fn = TwoPhaseShmemRequest,
+ .init_fn = TwoPhaseShmemInit,
+};
+
/*
* Global transaction entry currently locked by us, if any. Note that any
* access to the entry pointed to by this variable must be protected by
@@ -234,10 +243,10 @@ static void RemoveTwoPhaseFile(FullTransactionId fxid, bool giveWarning);
static void RecreateTwoPhaseFile(FullTransactionId fxid, void *content, int len);
/*
- * Initialization of shared memory
+ * Register shared memory for two-phase state.
*/
-Size
-TwoPhaseShmemSize(void)
+static void
+TwoPhaseShmemRequest(void *arg)
{
Size size;
@@ -248,46 +257,40 @@ TwoPhaseShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_prepared_xacts,
sizeof(GlobalTransactionData)));
-
- return size;
+ ShmemRequestStruct(.name = "Prepared Transaction Table",
+ .size = size,
+ .ptr = (void **) &TwoPhaseState,
+ );
}
-void
-TwoPhaseShmemInit(void)
+/*
+ * Initialize shared memory for two-phase state.
+ */
+static void
+TwoPhaseShmemInit(void *arg)
{
- bool found;
-
- TwoPhaseState = ShmemInitStruct("Prepared Transaction Table",
- TwoPhaseShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- GlobalTransaction gxacts;
- int i;
+ GlobalTransaction gxacts;
+ int i;
- Assert(!found);
- TwoPhaseState->freeGXacts = NULL;
- TwoPhaseState->numPrepXacts = 0;
+ TwoPhaseState->freeGXacts = NULL;
+ TwoPhaseState->numPrepXacts = 0;
- /*
- * Initialize the linked list of free GlobalTransactionData structs
- */
- gxacts = (GlobalTransaction)
- ((char *) TwoPhaseState +
- MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
- sizeof(GlobalTransaction) * max_prepared_xacts));
- for (i = 0; i < max_prepared_xacts; i++)
- {
- /* insert into linked list */
- gxacts[i].next = TwoPhaseState->freeGXacts;
- TwoPhaseState->freeGXacts = &gxacts[i];
+ /*
+ * Initialize the linked list of free GlobalTransactionData structs
+ */
+ gxacts = (GlobalTransaction)
+ ((char *) TwoPhaseState +
+ MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
+ sizeof(GlobalTransaction) * max_prepared_xacts));
+ for (i = 0; i < max_prepared_xacts; i++)
+ {
+ /* insert into linked list */
+ gxacts[i].next = TwoPhaseState->freeGXacts;
+ TwoPhaseState->freeGXacts = &gxacts[i];
- /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
- gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
- }
+ /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
+ gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9e8999bbb61..bbc565509b0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -96,6 +96,7 @@
#include "storage/procsignal.h"
#include "storage/reinit.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/guc_tables.h"
@@ -571,6 +572,16 @@ typedef enum
WALINSERT_SPECIAL_CHECKPOINT
} WalInsertClass;
+static void XLOGShmemRequest(void *arg);
+static void XLOGShmemInit(void *arg);
+static void XLOGShmemAttach(void *arg);
+
+const ShmemCallbacks XLOGShmemCallbacks = {
+ .request_fn = XLOGShmemRequest,
+ .init_fn = XLOGShmemInit,
+ .attach_fn = XLOGShmemAttach,
+};
+
static XLogCtlData *XLogCtl = NULL;
/* a private copy of XLogCtl->Insert.WALInsertLocks, for convenience */
@@ -579,6 +590,7 @@ static WALInsertLockPadded *WALInsertLocks = NULL;
/*
* We maintain an image of pg_control in shared memory.
*/
+static ControlFileData *LocalControlFile = NULL;
static ControlFileData *ControlFile = NULL;
/*
@@ -5257,7 +5269,8 @@ void
LocalProcessControlFile(bool reset)
{
Assert(reset || ControlFile == NULL);
- ControlFile = palloc_object(ControlFileData);
+ LocalControlFile = palloc_object(ControlFileData);
+ ControlFile = LocalControlFile;
ReadControlFile();
SetLocalDataChecksumState(ControlFile->data_checksum_version);
}
@@ -5274,10 +5287,10 @@ GetActiveWalLevelOnStandby(void)
}
/*
- * Initialization of shared memory for XLOG
+ * Register shared memory for XLOG.
*/
-Size
-XLOGShmemSize(void)
+static void
+XLOGShmemRequest(void *arg)
{
Size size;
@@ -5317,23 +5330,24 @@ XLOGShmemSize(void)
/* and the buffers themselves */
size = add_size(size, mul_size(XLOG_BLCKSZ, XLOGbuffers));
- /*
- * Note: we don't count ControlFileData, it comes out of the "slop factor"
- * added by CreateSharedMemoryAndSemaphores. This lets us use this
- * routine again below to compute the actual allocation size.
- */
-
- return size;
+ ShmemRequestStruct(.name = "XLOG Ctl",
+ .size = size,
+ .ptr = (void **) &XLogCtl,
+ );
+ ShmemRequestStruct(.name = "Control File",
+ .size = sizeof(ControlFileData),
+ .ptr = (void **) &ControlFile,
+ );
}
-void
-XLOGShmemInit(void)
+/*
+ * XLOGShmemInit - initialize the XLogCtl shared memory area.
+ */
+static void
+XLOGShmemInit(void *arg)
{
- bool foundCFile,
- foundXLog;
char *allocptr;
int i;
- ControlFileData *localControlFile;
#ifdef WAL_DEBUG
@@ -5351,36 +5365,17 @@ XLOGShmemInit(void)
}
#endif
-
- XLogCtl = (XLogCtlData *)
- ShmemInitStruct("XLOG Ctl", XLOGShmemSize(), &foundXLog);
-
- localControlFile = ControlFile;
- ControlFile = (ControlFileData *)
- ShmemInitStruct("Control File", sizeof(ControlFileData), &foundCFile);
-
- if (foundCFile || foundXLog)
- {
- /* both should be present or neither */
- Assert(foundCFile && foundXLog);
-
- /* Initialize local copy of WALInsertLocks */
- WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
-
- if (localControlFile)
- pfree(localControlFile);
- return;
- }
memset(XLogCtl, 0, sizeof(XLogCtlData));
/*
* Already have read control file locally, unless in bootstrap mode. Move
* contents into shared memory.
*/
- if (localControlFile)
+ if (LocalControlFile)
{
- memcpy(ControlFile, localControlFile, sizeof(ControlFileData));
- pfree(localControlFile);
+ memcpy(ControlFile, LocalControlFile, sizeof(ControlFileData));
+ pfree(LocalControlFile);
+ LocalControlFile = NULL;
}
/*
@@ -5442,6 +5437,15 @@ XLOGShmemInit(void)
pg_atomic_init_u64(&XLogCtl->unloggedLSN, InvalidXLogRecPtr);
}
+/*
+ * XLOGShmemAttach - set up WALInsertLocks pointer after attaching.
+ */
+static void
+XLOGShmemAttach(void *arg)
+{
+ WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
+}
+
/*
* This func must be called ONCE on system install. It creates pg_control
* and the initial XLOG segment.
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index c235eca7c51..83a3f97a57c 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -39,6 +39,7 @@
#include "storage/fd.h"
#include "storage/shmem.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/hsearch.h"
@@ -200,6 +201,14 @@ static LsnReadQueueNextStatus XLogPrefetcherNextBlock(uintptr_t pgsr_private,
static XLogPrefetchStats *SharedStats;
+static void XLogPrefetchShmemRequest(void *arg);
+static void XLogPrefetchShmemInit(void *arg);
+
+const ShmemCallbacks XLogPrefetchShmemCallbacks = {
+ .request_fn = XLogPrefetchShmemRequest,
+ .init_fn = XLogPrefetchShmemInit,
+};
+
static inline LsnReadQueue *
lrq_alloc(uint32 max_distance,
uint32 max_inflight,
@@ -292,10 +301,25 @@ lrq_complete_lsn(LsnReadQueue *lrq, XLogRecPtr lsn)
lrq_prefetch(lrq);
}
-size_t
-XLogPrefetchShmemSize(void)
+static void
+XLogPrefetchShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "XLogPrefetchStats",
+ .size = sizeof(XLogPrefetchStats),
+ .ptr = (void **) &SharedStats,
+ );
+}
+
+static void
+XLogPrefetchShmemInit(void *arg)
{
- return sizeof(XLogPrefetchStats);
+ pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
+ pg_atomic_init_u64(&SharedStats->prefetch, 0);
+ pg_atomic_init_u64(&SharedStats->hit, 0);
+ pg_atomic_init_u64(&SharedStats->skip_init, 0);
+ pg_atomic_init_u64(&SharedStats->skip_new, 0);
+ pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
+ pg_atomic_init_u64(&SharedStats->skip_rep, 0);
}
/*
@@ -313,27 +337,6 @@ XLogPrefetchResetStats(void)
pg_atomic_write_u64(&SharedStats->skip_rep, 0);
}
-void
-XLogPrefetchShmemInit(void)
-{
- bool found;
-
- SharedStats = (XLogPrefetchStats *)
- ShmemInitStruct("XLogPrefetchStats",
- sizeof(XLogPrefetchStats),
- &found);
-
- if (!found)
- {
- pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
- pg_atomic_init_u64(&SharedStats->prefetch, 0);
- pg_atomic_init_u64(&SharedStats->hit, 0);
- pg_atomic_init_u64(&SharedStats->skip_init, 0);
- pg_atomic_init_u64(&SharedStats->skip_new, 0);
- pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
- pg_atomic_init_u64(&SharedStats->skip_rep, 0);
- }
-}
/*
* Called when any GUC is changed that affects prefetching.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index fd1c36d061d..c236e2b7969 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -58,6 +58,7 @@
#include "storage/pmsignal.h"
#include "storage/procarray.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/datetime.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
@@ -307,6 +308,14 @@ static char *primary_image_masked = NULL;
XLogRecoveryCtlData *XLogRecoveryCtl = NULL;
+static void XLogRecoveryShmemRequest(void *arg);
+static void XLogRecoveryShmemInit(void *arg);
+
+const ShmemCallbacks XLogRecoveryShmemCallbacks = {
+ .request_fn = XLogRecoveryShmemRequest,
+ .init_fn = XLogRecoveryShmemInit,
+};
+
/*
* abortedRecPtr is the start pointer of a broken record at end of WAL when
* recovery completes; missingContrecPtr is the location of the first
@@ -385,28 +394,20 @@ static void SetCurrentChunkStartTime(TimestampTz xtime);
static void SetLatestXTime(TimestampTz xtime);
/*
- * Initialization of shared memory for WAL recovery
+ * Register shared memory for WAL recovery
*/
-Size
-XLogRecoveryShmemSize(void)
+static void
+XLogRecoveryShmemRequest(void *arg)
{
- Size size;
-
- /* XLogRecoveryCtl */
- size = sizeof(XLogRecoveryCtlData);
-
- return size;
+ ShmemRequestStruct(.name = "XLOG Recovery Ctl",
+ .size = sizeof(XLogRecoveryCtlData),
+ .ptr = (void **) &XLogRecoveryCtl,
+ );
}
-void
-XLogRecoveryShmemInit(void)
+static void
+XLogRecoveryShmemInit(void *arg)
{
- bool found;
-
- XLogRecoveryCtl = (XLogRecoveryCtlData *)
- ShmemInitStruct("XLOG Recovery Ctl", XLogRecoveryShmemSize(), &found);
- if (found)
- return;
memset(XLogRecoveryCtl, 0, sizeof(XLogRecoveryCtlData));
SpinLockInit(&XLogRecoveryCtl->info_lck);
diff --git a/src/backend/access/transam/xlogwait.c b/src/backend/access/transam/xlogwait.c
index bf4630677b4..2e31c0d67d7 100644
--- a/src/backend/access/transam/xlogwait.c
+++ b/src/backend/access/transam/xlogwait.c
@@ -57,6 +57,7 @@
#include "storage/latch.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/pg_lsn.h"
#include "utils/snapmgr.h"
@@ -68,6 +69,14 @@ static int waitlsn_cmp(const pairingheap_node *a, const pairingheap_node *b,
struct WaitLSNState *waitLSNState = NULL;
+static void WaitLSNShmemRequest(void *arg);
+static void WaitLSNShmemInit(void *arg);
+
+const ShmemCallbacks WaitLSNShmemCallbacks = {
+ .request_fn = WaitLSNShmemRequest,
+ .init_fn = WaitLSNShmemInit,
+};
+
/*
* Wait event for each WaitLSNType, used with WaitLatch() to report
* the wait in pg_stat_activity.
@@ -109,41 +118,34 @@ GetCurrentLSNForWaitType(WaitLSNType lsnType)
pg_unreachable();
}
-/* Report the amount of shared memory space needed for WaitLSNState. */
-Size
-WaitLSNShmemSize(void)
+/* Register the shared memory space needed for WaitLSNState. */
+static void
+WaitLSNShmemRequest(void *arg)
{
Size size;
size = offsetof(WaitLSNState, procInfos);
size = add_size(size, mul_size(MaxBackends + NUM_AUXILIARY_PROCS, sizeof(WaitLSNProcInfo)));
- return size;
+ ShmemRequestStruct(.name = "WaitLSNState",
+ .size = size,
+ .ptr = (void **) &waitLSNState,
+ );
}
/* Initialize the WaitLSNState in the shared memory. */
-void
-WaitLSNShmemInit(void)
+static void
+WaitLSNShmemInit(void *arg)
{
- bool found;
-
- waitLSNState = (WaitLSNState *) ShmemInitStruct("WaitLSNState",
- WaitLSNShmemSize(),
- &found);
- if (!found)
+ /* Initialize heaps and tracking */
+ for (int i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
{
- int i;
-
- /* Initialize heaps and tracking */
- for (i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
- {
- pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
- pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
- }
-
- /* Initialize process info array */
- memset(&waitLSNState->procInfos, 0,
- (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
+ pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
+ pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
}
+
+ /* Initialize process info array */
+ memset(&waitLSNState->procInfos, 0,
+ (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 8400e6722cc..250c43b85e5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -98,6 +98,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/fmgroids.h"
#include "utils/fmgrprotos.h"
@@ -309,6 +310,14 @@ typedef struct
static AutoVacuumShmemStruct *AutoVacuumShmem;
+static void AutoVacuumShmemRequest(void *arg);
+static void AutoVacuumShmemInit(void *arg);
+
+const ShmemCallbacks AutoVacuumShmemCallbacks = {
+ .request_fn = AutoVacuumShmemRequest,
+ .init_fn = AutoVacuumShmemInit,
+};
+
/*
* the database list (of avl_dbase elements) in the launcher, and the context
* that contains it
@@ -3545,11 +3554,11 @@ autovac_init(void)
}
/*
- * AutoVacuumShmemSize
- * Compute space needed for autovacuum-related shared memory
+ * AutoVacuumShmemRequest
+ * Register shared memory space needed for autovacuum
*/
-Size
-AutoVacuumShmemSize(void)
+static void
+AutoVacuumShmemRequest(void *arg)
{
Size size;
@@ -3560,53 +3569,41 @@ AutoVacuumShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
- return size;
+
+ ShmemRequestStruct(.name = "AutoVacuum Data",
+ .size = size,
+ .ptr = (void **) &AutoVacuumShmem,
+ );
}
/*
* AutoVacuumShmemInit
- * Allocate and initialize autovacuum-related shared memory
+ * Initialize autovacuum-related shared memory
*/
-void
-AutoVacuumShmemInit(void)
+static void
+AutoVacuumShmemInit(void *arg)
{
- bool found;
-
- AutoVacuumShmem = (AutoVacuumShmemStruct *)
- ShmemInitStruct("AutoVacuum Data",
- AutoVacuumShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- WorkerInfo worker;
- int i;
+ WorkerInfo worker;
- Assert(!found);
-
- AutoVacuumShmem->av_launcherpid = 0;
- dclist_init(&AutoVacuumShmem->av_freeWorkers);
- dlist_init(&AutoVacuumShmem->av_runningWorkers);
- AutoVacuumShmem->av_startingWorker = NULL;
- memset(AutoVacuumShmem->av_workItems, 0,
- sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
-
- worker = (WorkerInfo) ((char *) AutoVacuumShmem +
- MAXALIGN(sizeof(AutoVacuumShmemStruct)));
-
- /* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_worker_slots; i++)
- {
- dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
- pg_atomic_init_flag(&worker[i].wi_dobalance);
- }
+ AutoVacuumShmem->av_launcherpid = 0;
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
+ dlist_init(&AutoVacuumShmem->av_runningWorkers);
+ AutoVacuumShmem->av_startingWorker = NULL;
+ memset(AutoVacuumShmem->av_workItems, 0,
+ sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
- pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+ worker = (WorkerInfo) ((char *) AutoVacuumShmem +
+ MAXALIGN(sizeof(AutoVacuumShmemStruct)));
+ /* initialize the WorkerInfo free list */
+ for (int i = 0; i < autovacuum_worker_slots; i++)
+ {
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
+ pg_atomic_init_flag(&worker[i].wi_dobalance);
}
- else
- Assert(found);
+
+ pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
}
/*
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 536aff7ca05..0992b9b6353 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -30,6 +30,7 @@
#include "storage/procarray.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/ascii.h"
#include "utils/memutils.h"
@@ -110,6 +111,14 @@ struct BackgroundWorkerHandle
static BackgroundWorkerArray *BackgroundWorkerData;
+static void BackgroundWorkerShmemRequest(void *arg);
+static void BackgroundWorkerShmemInit(void *arg);
+
+const ShmemCallbacks BackgroundWorkerShmemCallbacks = {
+ .request_fn = BackgroundWorkerShmemRequest,
+ .init_fn = BackgroundWorkerShmemInit,
+};
+
/*
* List of internal background worker entry points. We need this for
* reasons explained in LookupBackgroundWorkerFunction(), below.
@@ -160,10 +169,10 @@ static bgworker_main_type LookupBackgroundWorkerFunction(const char *libraryname
/*
- * Calculate shared memory needed.
+ * Register shared memory needed for background workers.
*/
-Size
-BackgroundWorkerShmemSize(void)
+static void
+BackgroundWorkerShmemRequest(void *arg)
{
Size size;
@@ -171,66 +180,58 @@ BackgroundWorkerShmemSize(void)
size = offsetof(BackgroundWorkerArray, slot);
size = add_size(size, mul_size(max_worker_processes,
sizeof(BackgroundWorkerSlot)));
-
- return size;
+ ShmemRequestStruct(.name = "Background Worker Data",
+ .size = size,
+ .ptr = (void **) &BackgroundWorkerData,
+ );
}
/*
- * Initialize shared memory.
+ * Initialize shared memory for background workers.
*/
-void
-BackgroundWorkerShmemInit(void)
+static void
+BackgroundWorkerShmemInit(void *arg)
{
- bool found;
-
- BackgroundWorkerData = ShmemInitStruct("Background Worker Data",
- BackgroundWorkerShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- dlist_iter iter;
- int slotno = 0;
+ dlist_iter iter;
+ int slotno = 0;
- BackgroundWorkerData->total_slots = max_worker_processes;
- BackgroundWorkerData->parallel_register_count = 0;
- BackgroundWorkerData->parallel_terminate_count = 0;
+ BackgroundWorkerData->total_slots = max_worker_processes;
+ BackgroundWorkerData->parallel_register_count = 0;
+ BackgroundWorkerData->parallel_terminate_count = 0;
- /*
- * Copy contents of worker list into shared memory. Record the shared
- * memory slot assigned to each worker. This ensures a 1-to-1
- * correspondence between the postmaster's private list and the array
- * in shared memory.
- */
- dlist_foreach(iter, &BackgroundWorkerList)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- RegisteredBgWorker *rw;
+ /*
+ * Copy contents of worker list into shared memory. Record the shared
+ * memory slot assigned to each worker. This ensures a 1-to-1
+ * correspondence between the postmaster's private list and the array in
+ * shared memory.
+ */
+ dlist_foreach(iter, &BackgroundWorkerList)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ RegisteredBgWorker *rw;
- rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
- Assert(slotno < max_worker_processes);
- slot->in_use = true;
- slot->terminate = false;
- slot->pid = InvalidPid;
- slot->generation = 0;
- rw->rw_shmem_slot = slotno;
- rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
- memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
- ++slotno;
- }
+ rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
+ Assert(slotno < max_worker_processes);
+ slot->in_use = true;
+ slot->terminate = false;
+ slot->pid = InvalidPid;
+ slot->generation = 0;
+ rw->rw_shmem_slot = slotno;
+ rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
+ memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
+ ++slotno;
+ }
- /*
- * Mark any remaining slots as not in use.
- */
- while (slotno < max_worker_processes)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ /*
+ * Mark any remaining slots as not in use.
+ */
+ while (slotno < max_worker_processes)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- slot->in_use = false;
- ++slotno;
- }
+ slot->in_use = false;
+ ++slotno;
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 3c982c6ffac..6b424ee610f 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -63,6 +63,7 @@
#include "storage/shmem.h"
#include "storage/smgr.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/guc.h"
#include "utils/memutils.h"
@@ -143,6 +144,14 @@ typedef struct
static CheckpointerShmemStruct *CheckpointerShmem;
+static void CheckpointerShmemRequest(void *arg);
+static void CheckpointerShmemInit(void *arg);
+
+const ShmemCallbacks CheckpointerShmemCallbacks = {
+ .request_fn = CheckpointerShmemRequest,
+ .init_fn = CheckpointerShmemInit,
+};
+
/* interval for calling AbsorbSyncRequests in CheckpointWriteDelay */
#define WRITES_PER_ABSORB 1000
@@ -950,11 +959,11 @@ ReqShutdownXLOG(SIGNAL_ARGS)
*/
/*
- * CheckpointerShmemSize
- * Compute space needed for checkpointer-related shared memory
+ * CheckpointerShmemRequest
+ * Register shared memory space needed for checkpointer
*/
-Size
-CheckpointerShmemSize(void)
+static void
+CheckpointerShmemRequest(void *arg)
{
Size size;
@@ -967,39 +976,24 @@ CheckpointerShmemSize(void)
size = add_size(size, mul_size(Min(NBuffers,
MAX_CHECKPOINT_REQUESTS),
sizeof(CheckpointerRequest)));
-
- return size;
+ ShmemRequestStruct(.name = "Checkpointer Data",
+ .size = size,
+ .ptr = (void **) &CheckpointerShmem,
+ );
}
/*
* CheckpointerShmemInit
- * Allocate and initialize checkpointer-related shared memory
+ * Initialize checkpointer-related shared memory
*/
-void
-CheckpointerShmemInit(void)
+static void
+CheckpointerShmemInit(void *arg)
{
- Size size = CheckpointerShmemSize();
- bool found;
-
- CheckpointerShmem = (CheckpointerShmemStruct *)
- ShmemInitStruct("Checkpointer Data",
- size,
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. Note that we zero the whole
- * requests array; this is so that CompactCheckpointerRequestQueue can
- * assume that any pad bytes in the request structs are zeroes.
- */
- MemSet(CheckpointerShmem, 0, size);
- SpinLockInit(&CheckpointerShmem->ckpt_lck);
- CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
- CheckpointerShmem->head = CheckpointerShmem->tail = 0;
- ConditionVariableInit(&CheckpointerShmem->start_cv);
- ConditionVariableInit(&CheckpointerShmem->done_cv);
- }
+ SpinLockInit(&CheckpointerShmem->ckpt_lck);
+ CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
+ CheckpointerShmem->head = CheckpointerShmem->tail = 0;
+ ConditionVariableInit(&CheckpointerShmem->start_cv);
+ ConditionVariableInit(&CheckpointerShmem->done_cv);
}
/*
diff --git a/src/backend/postmaster/datachecksum_state.c b/src/backend/postmaster/datachecksum_state.c
index 76004bcedc6..eb7b01d0993 100644
--- a/src/backend/postmaster/datachecksum_state.c
+++ b/src/backend/postmaster/datachecksum_state.c
@@ -211,6 +211,7 @@
#include "storage/lwlock.h"
#include "storage/procarray.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -346,6 +347,7 @@ static volatile sig_atomic_t launcher_running = false;
static DataChecksumsWorkerOperation operation;
/* Prototypes */
+static void DataChecksumsShmemRequest(void *arg);
static bool DatabaseExists(Oid dboid);
static List *BuildDatabaseList(void);
static List *BuildRelationList(bool temp_relations, bool include_shared);
@@ -356,6 +358,10 @@ static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferA
static void launcher_cancel_handler(SIGNAL_ARGS);
static void WaitForAllTransactionsToFinish(void);
+const ShmemCallbacks DataChecksumsShmemCallbacks = {
+ .request_fn = DataChecksumsShmemRequest,
+};
+
/*****************************************************************************
* Functionality for manipulating the data checksum state in the cluster
*/
@@ -1236,35 +1242,16 @@ ProcessAllDatabases(void)
}
/*
- * DataChecksumStateSize
- * Compute required space for datachecksumsworker-related shared memory
- */
-Size
-DataChecksumsShmemSize(void)
-{
- Size size;
-
- size = sizeof(DataChecksumsStateStruct);
- size = MAXALIGN(size);
-
- return size;
-}
-
-/*
- * DataChecksumStateInit
- * Allocate and initialize datachecksumsworker-related shared memory
+ * DataChecksumsShmemRequest
+ * Request datachecksumsworker-related shared memory
*/
-void
-DataChecksumsShmemInit(void)
+static void
+DataChecksumsShmemRequest(void *arg)
{
- bool found;
-
- DataChecksumState = (DataChecksumsStateStruct *)
- ShmemInitStruct("DataChecksumsWorker Data",
- DataChecksumsShmemSize(),
- &found);
- if (!found)
- MemSet(DataChecksumState, 0, DataChecksumsShmemSize());
+ ShmemRequestStruct(.name = "DataChecksumsWorker Data",
+ .size = sizeof(DataChecksumsStateStruct),
+ .ptr = (void **) &DataChecksumState,
+ );
}
/*
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index fa4bdfe9ab9..0a1a1149d78 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -48,6 +48,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
@@ -154,33 +155,31 @@ static int ready_file_comparator(Datum a, Datum b, void *arg);
static void LoadArchiveLibrary(void);
static void pgarch_call_module_shutdown_cb(int code, Datum arg);
-/* Report shared memory space needed by PgArchShmemInit */
-Size
-PgArchShmemSize(void)
-{
- Size size = 0;
+static void PgArchShmemRequest(void *arg);
+static void PgArchShmemInit(void *arg);
- size = add_size(size, sizeof(PgArchData));
+const ShmemCallbacks PgArchShmemCallbacks = {
+ .request_fn = PgArchShmemRequest,
+ .init_fn = PgArchShmemInit,
+};
- return size;
+/* Register shared memory space needed by the archiver */
+static void
+PgArchShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "Archiver Data",
+ .size = sizeof(PgArchData),
+ .ptr = (void **) &PgArch,
+ );
}
-/* Allocate and initialize archiver-related shared memory */
-void
-PgArchShmemInit(void)
+/* Initialize archiver-related shared memory */
+static void
+PgArchShmemInit(void *arg)
{
- bool found;
-
- PgArch = (PgArchData *)
- ShmemInitStruct("Archiver Data", PgArchShmemSize(), &found);
-
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(PgArch, 0, PgArchShmemSize());
- PgArch->pgprocno = INVALID_PROC_NUMBER;
- pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
- }
+ MemSet(PgArch, 0, sizeof(PgArchData));
+ PgArch->pgprocno = INVALID_PROC_NUMBER;
+ pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
}
/*
diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c
index a37b3018abf..20960f5b633 100644
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -47,6 +47,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -109,6 +110,14 @@ typedef struct
/* Pointer to shared memory state. */
static WalSummarizerData *WalSummarizerCtl;
+static void WalSummarizerShmemRequest(void *arg);
+static void WalSummarizerShmemInit(void *arg);
+
+const ShmemCallbacks WalSummarizerShmemCallbacks = {
+ .request_fn = WalSummarizerShmemRequest,
+ .init_fn = WalSummarizerShmemInit,
+};
+
/*
* When we reach end of WAL and need to read more, we sleep for a number of
* milliseconds that is an integer multiple of MS_PER_SLEEP_QUANTUM. This is
@@ -168,43 +177,34 @@ static void summarizer_wait_for_wal(void);
static void MaybeRemoveOldWalSummaries(void);
/*
- * Amount of shared memory required for this module.
+ * Register shared memory space needed by this module.
*/
-Size
-WalSummarizerShmemSize(void)
+static void
+WalSummarizerShmemRequest(void *arg)
{
- return sizeof(WalSummarizerData);
+ ShmemRequestStruct(.name = "Wal Summarizer Ctl",
+ .size = sizeof(WalSummarizerData),
+ .ptr = (void **) &WalSummarizerCtl,
+ );
}
/*
- * Create or attach to shared memory segment for this module.
+ * Initialize shared memory for this module.
*/
-void
-WalSummarizerShmemInit(void)
+static void
+WalSummarizerShmemInit(void *arg)
{
- bool found;
-
- WalSummarizerCtl = (WalSummarizerData *)
- ShmemInitStruct("Wal Summarizer Ctl", WalSummarizerShmemSize(),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize.
- *
- * We're just filling in dummy values here -- the real initialization
- * will happen when GetOldestUnsummarizedLSN() is called for the first
- * time.
- */
- WalSummarizerCtl->initialized = false;
- WalSummarizerCtl->summarized_tli = 0;
- WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
- WalSummarizerCtl->lsn_is_exact = false;
- WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
- WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
- ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
- }
+ /*
+ * We're just filling in dummy values here -- the real initialization will
+ * happen when GetOldestUnsummarizedLSN() is called for the first time.
+ */
+ WalSummarizerCtl->initialized = false;
+ WalSummarizerCtl->summarized_tli = 0;
+ WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
+ WalSummarizerCtl->lsn_is_exact = false;
+ WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
+ WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
+ ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
}
/*
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 09964198550..9e75a3e04ee 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -38,6 +38,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -71,6 +72,14 @@ typedef struct LogicalRepCtxStruct
static LogicalRepCtxStruct *LogicalRepCtx;
+static void ApplyLauncherShmemRequest(void *arg);
+static void ApplyLauncherShmemInit(void *arg);
+
+const ShmemCallbacks ApplyLauncherShmemCallbacks = {
+ .request_fn = ApplyLauncherShmemRequest,
+ .init_fn = ApplyLauncherShmemInit,
+};
+
/* an entry in the last-start-times shared hash table */
typedef struct LauncherLastStartTimesEntry
{
@@ -972,11 +981,11 @@ logicalrep_pa_worker_count(Oid subid)
}
/*
- * ApplyLauncherShmemSize
- * Compute space needed for replication launcher shared memory
+ * ApplyLauncherShmemRequest
+ * Register shared memory space needed for replication launcher
*/
-Size
-ApplyLauncherShmemSize(void)
+static void
+ApplyLauncherShmemRequest(void *arg)
{
Size size;
@@ -987,7 +996,10 @@ ApplyLauncherShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_logical_replication_workers,
sizeof(LogicalRepWorker)));
- return size;
+ ShmemRequestStruct(.name = "Logical Replication Launcher Data",
+ .size = size,
+ .ptr = (void **) &LogicalRepCtx,
+ );
}
/*
@@ -1028,35 +1040,23 @@ ApplyLauncherRegister(void)
/*
* ApplyLauncherShmemInit
- * Allocate and initialize replication launcher shared memory
+ * Initialize replication launcher shared memory
*/
-void
-ApplyLauncherShmemInit(void)
+static void
+ApplyLauncherShmemInit(void *arg)
{
- bool found;
+ int slot;
- LogicalRepCtx = (LogicalRepCtxStruct *)
- ShmemInitStruct("Logical Replication Launcher Data",
- ApplyLauncherShmemSize(),
- &found);
+ LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
+ LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
- if (!found)
+ /* Initialize memory and spin locks for each worker slot. */
+ for (slot = 0; slot < max_logical_replication_workers; slot++)
{
- int slot;
-
- memset(LogicalRepCtx, 0, ApplyLauncherShmemSize());
-
- LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
- LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
+ LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
- /* Initialize memory and spin locks for each worker slot. */
- for (slot = 0; slot < max_logical_replication_workers; slot++)
- {
- LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
-
- memset(worker, 0, sizeof(LogicalRepWorker));
- SpinLockInit(&worker->relmutex);
- }
+ memset(worker, 0, sizeof(LogicalRepWorker));
+ SpinLockInit(&worker->relmutex);
}
}
diff --git a/src/backend/replication/logical/logicalctl.c b/src/backend/replication/logical/logicalctl.c
index 4e292951201..72f68ec58ef 100644
--- a/src/backend/replication/logical/logicalctl.c
+++ b/src/backend/replication/logical/logicalctl.c
@@ -72,6 +72,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
/*
@@ -98,6 +99,12 @@ typedef struct LogicalDecodingCtlData
static LogicalDecodingCtlData *LogicalDecodingCtl = NULL;
+static void LogicalDecodingCtlShmemRequest(void *arg);
+
+const ShmemCallbacks LogicalDecodingCtlShmemCallbacks = {
+ .request_fn = LogicalDecodingCtlShmemRequest,
+};
+
/*
* A process-local cache of LogicalDecodingCtl->xlog_logical_info. This is
* initialized at process startup, and updated when processing the process
@@ -120,23 +127,13 @@ static void update_xlog_logical_info(void);
static void abort_logical_decoding_activation(int code, Datum arg);
static void write_logical_decoding_status_update_record(bool status);
-Size
-LogicalDecodingCtlShmemSize(void)
-{
- return sizeof(LogicalDecodingCtlData);
-}
-
-void
-LogicalDecodingCtlShmemInit(void)
+static void
+LogicalDecodingCtlShmemRequest(void *arg)
{
- bool found;
-
- LogicalDecodingCtl = ShmemInitStruct("Logical decoding control",
- LogicalDecodingCtlShmemSize(),
- &found);
-
- if (!found)
- MemSet(LogicalDecodingCtl, 0, LogicalDecodingCtlShmemSize());
+ ShmemRequestStruct(.name = "Logical decoding control",
+ .size = sizeof(LogicalDecodingCtlData),
+ .ptr = (void **) &LogicalDecodingCtl,
+ );
}
/*
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 661d68ad653..372d77c475e 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -88,6 +88,7 @@
#include "storage/fd.h"
#include "storage/ipc.h"
#include "storage/lmgr.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/guc.h"
@@ -176,6 +177,16 @@ ReplOriginXactState replorigin_xact_state = {
*/
static ReplicationState *replication_states;
+static void ReplicationOriginShmemRequest(void *arg);
+static void ReplicationOriginShmemInit(void *arg);
+static void ReplicationOriginShmemAttach(void *arg);
+
+const ShmemCallbacks ReplicationOriginShmemCallbacks = {
+ .request_fn = ReplicationOriginShmemRequest,
+ .init_fn = ReplicationOriginShmemInit,
+ .attach_fn = ReplicationOriginShmemAttach,
+};
+
/*
* Actual shared memory block (replication_states[] is now part of this).
*/
@@ -539,50 +550,48 @@ replorigin_by_oid(ReplOriginId roident, bool missing_ok, char **roname)
* ---------------------------------------------------------------------------
*/
-Size
-ReplicationOriginShmemSize(void)
+static void
+ReplicationOriginShmemRequest(void *arg)
{
Size size = 0;
if (max_active_replication_origins == 0)
- return size;
+ return;
size = add_size(size, offsetof(ReplicationStateCtl, states));
-
size = add_size(size,
mul_size(max_active_replication_origins, sizeof(ReplicationState)));
- return size;
+ ShmemRequestStruct(.name = "ReplicationOriginState",
+ .size = size,
+ .ptr = (void **) &replication_states_ctl,
+ );
}
-void
-ReplicationOriginShmemInit(void)
+static void
+ReplicationOriginShmemInit(void *arg)
{
- bool found;
-
if (max_active_replication_origins == 0)
return;
- replication_states_ctl = (ReplicationStateCtl *)
- ShmemInitStruct("ReplicationOriginState",
- ReplicationOriginShmemSize(),
- &found);
replication_states = replication_states_ctl->states;
- if (!found)
- {
- int i;
+ replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
- MemSet(replication_states_ctl, 0, ReplicationOriginShmemSize());
+ for (int i = 0; i < max_active_replication_origins; i++)
+ {
+ LWLockInitialize(&replication_states[i].lock,
+ replication_states_ctl->tranche_id);
+ ConditionVariableInit(&replication_states[i].origin_cv);
+ }
+}
- replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
+static void
+ReplicationOriginShmemAttach(void *arg)
+{
+ if (max_active_replication_origins == 0)
+ return;
- for (i = 0; i < max_active_replication_origins; i++)
- {
- LWLockInitialize(&replication_states[i].lock,
- replication_states_ctl->tranche_id);
- ConditionVariableInit(&replication_states[i].origin_cv);
- }
- }
+ replication_states = replication_states_ctl->states;
}
/* ---------------------------------------------------------------------------
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index e75db69e3f6..d615ff8a81c 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -73,6 +73,7 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -118,6 +119,14 @@ typedef struct SlotSyncCtxStruct
static SlotSyncCtxStruct *SlotSyncCtx = NULL;
+static void SlotSyncShmemRequest(void *arg);
+static void SlotSyncShmemInit(void *arg);
+
+const ShmemCallbacks SlotSyncShmemCallbacks = {
+ .request_fn = SlotSyncShmemRequest,
+ .init_fn = SlotSyncShmemInit,
+};
+
/* GUC variable */
bool sync_replication_slots = false;
@@ -1828,32 +1837,26 @@ IsSyncingReplicationSlots(void)
}
/*
- * Amount of shared memory required for slot synchronization.
+ * Register shared memory space needed for slot synchronization.
*/
-Size
-SlotSyncShmemSize(void)
+static void
+SlotSyncShmemRequest(void *arg)
{
- return sizeof(SlotSyncCtxStruct);
+ ShmemRequestStruct(.name = "Slot Sync Data",
+ .size = sizeof(SlotSyncCtxStruct),
+ .ptr = (void **) &SlotSyncCtx,
+ );
}
/*
- * Allocate and initialize the shared memory of slot synchronization.
+ * Initialize shared memory for slot synchronization.
*/
-void
-SlotSyncShmemInit(void)
+static void
+SlotSyncShmemInit(void *arg)
{
- Size size = SlotSyncShmemSize();
- bool found;
-
- SlotSyncCtx = (SlotSyncCtxStruct *)
- ShmemInitStruct("Slot Sync Data", size, &found);
-
- if (!found)
- {
- memset(SlotSyncCtx, 0, size);
- SlotSyncCtx->pid = InvalidPid;
- SpinLockInit(&SlotSyncCtx->mutex);
- }
+ memset(SlotSyncCtx, 0, sizeof(SlotSyncCtxStruct));
+ SlotSyncCtx->pid = InvalidPid;
+ SpinLockInit(&SlotSyncCtx->mutex);
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index a9092fc2382..21a213a0ebf 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
@@ -145,6 +146,14 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
+static void ReplicationSlotsShmemRequest(void *arg);
+static void ReplicationSlotsShmemInit(void *arg);
+
+const ShmemCallbacks ReplicationSlotsShmemCallbacks = {
+ .request_fn = ReplicationSlotsShmemRequest,
+ .init_fn = ReplicationSlotsShmemInit,
+};
+
/* My backend's replication slot in the shared memory array */
ReplicationSlot *MyReplicationSlot = NULL;
@@ -183,56 +192,41 @@ static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
/*
- * Report shared-memory space needed by ReplicationSlotsShmemInit.
+ * Register shared memory space needed for replication slots.
*/
-Size
-ReplicationSlotsShmemSize(void)
+static void
+ReplicationSlotsShmemRequest(void *arg)
{
- Size size = 0;
+ Size size;
if (max_replication_slots == 0)
- return size;
+ return;
size = offsetof(ReplicationSlotCtlData, replication_slots);
size = add_size(size,
mul_size(max_replication_slots, sizeof(ReplicationSlot)));
-
- return size;
+ ShmemRequestStruct(.name = "ReplicationSlot Ctl",
+ .size = size,
+ .ptr = (void **) &ReplicationSlotCtl,
+ );
}
/*
- * Allocate and initialize shared memory for replication slots.
+ * Initialize shared memory for replication slots.
*/
-void
-ReplicationSlotsShmemInit(void)
+static void
+ReplicationSlotsShmemInit(void *arg)
{
- bool found;
-
- if (max_replication_slots == 0)
- return;
-
- ReplicationSlotCtl = (ReplicationSlotCtlData *)
- ShmemInitStruct("ReplicationSlot Ctl", ReplicationSlotsShmemSize(),
- &found);
-
- if (!found)
+ for (int i = 0; i < max_replication_slots; i++)
{
- int i;
+ ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
- /* First time through, so initialize */
- MemSet(ReplicationSlotCtl, 0, ReplicationSlotsShmemSize());
-
- for (i = 0; i < max_replication_slots; i++)
- {
- ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
-
- /* everything else is zeroed by the memset above */
- slot->active_proc = INVALID_PROC_NUMBER;
- SpinLockInit(&slot->mutex);
- LWLockInitialize(&slot->io_in_progress_lock,
- LWTRANCHE_REPLICATION_SLOT_IO);
- ConditionVariableInit(&slot->active_cv);
- }
+ /* everything else is expected to be zero-initialized already */
+ slot->active_proc = INVALID_PROC_NUMBER;
+ SpinLockInit(&slot->mutex);
+ LWLockInitialize(&slot->io_in_progress_lock,
+ LWTRANCHE_REPLICATION_SLOT_IO);
+ ConditionVariableInit(&slot->active_cv);
}
}
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 45b9d4f09f2..4e03e721872 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -29,47 +29,46 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
WalRcvData *WalRcv = NULL;
+static void WalRcvShmemRequest(void *arg);
+static void WalRcvShmemInit(void *arg);
+
+const ShmemCallbacks WalRcvShmemCallbacks = {
+ .request_fn = WalRcvShmemRequest,
+ .init_fn = WalRcvShmemInit,
+};
+
/*
* How long to wait for walreceiver to start up after requesting
* postmaster to launch it. In seconds.
*/
#define WALRCV_STARTUP_TIMEOUT 10
-/* Report shared memory space needed by WalRcvShmemInit */
-Size
-WalRcvShmemSize(void)
+/* Register shared memory space needed by walreceiver */
+static void
+WalRcvShmemRequest(void *arg)
{
- Size size = 0;
-
- size = add_size(size, sizeof(WalRcvData));
-
- return size;
+ ShmemRequestStruct(.name = "Wal Receiver Ctl",
+ .size = sizeof(WalRcvData),
+ .ptr = (void **) &WalRcv,
+ );
}
-/* Allocate and initialize walreceiver-related shared memory */
-void
-WalRcvShmemInit(void)
+/* Initialize walreceiver-related shared memory */
+static void
+WalRcvShmemInit(void *arg)
{
- bool found;
-
- WalRcv = (WalRcvData *)
- ShmemInitStruct("Wal Receiver Ctl", WalRcvShmemSize(), &found);
-
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(WalRcv, 0, WalRcvShmemSize());
- WalRcv->walRcvState = WALRCV_STOPPED;
- ConditionVariableInit(&WalRcv->walRcvStoppedCV);
- SpinLockInit(&WalRcv->mutex);
- pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
- WalRcv->procno = INVALID_PROC_NUMBER;
- }
+ MemSet(WalRcv, 0, sizeof(WalRcvData));
+ WalRcv->walRcvState = WALRCV_STOPPED;
+ ConditionVariableInit(&WalRcv->walRcvStoppedCV);
+ SpinLockInit(&WalRcv->mutex);
+ pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
+ WalRcv->procno = INVALID_PROC_NUMBER;
}
/* Is walreceiver running (or starting up)? */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2bb3f34dc6d..ec39942bfc1 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -86,6 +86,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/dest.h"
#include "tcop/tcopprot.h"
#include "utils/acl.h"
@@ -117,6 +118,14 @@
/* Array of WalSnds in shared memory */
WalSndCtlData *WalSndCtl = NULL;
+static void WalSndShmemRequest(void *arg);
+static void WalSndShmemInit(void *arg);
+
+const ShmemCallbacks WalSndShmemCallbacks = {
+ .request_fn = WalSndShmemRequest,
+ .init_fn = WalSndShmemInit,
+};
+
/* My slot in the shared memory array */
WalSnd *MyWalSnd = NULL;
@@ -3765,47 +3774,37 @@ WalSndSignals(void)
pqsignal(SIGCHLD, SIG_DFL);
}
-/* Report shared-memory space needed by WalSndShmemInit */
-Size
-WalSndShmemSize(void)
+/* Register shared-memory space needed by walsender */
+static void
+WalSndShmemRequest(void *arg)
{
- Size size = 0;
+ Size size;
size = offsetof(WalSndCtlData, walsnds);
size = add_size(size, mul_size(max_wal_senders, sizeof(WalSnd)));
-
- return size;
+ ShmemRequestStruct(.name = "Wal Sender Ctl",
+ .size = size,
+ .ptr = (void **) &WalSndCtl,
+ );
}
-/* Allocate and initialize walsender-related shared memory */
-void
-WalSndShmemInit(void)
+/* Initialize walsender-related shared memory */
+static void
+WalSndShmemInit(void *arg)
{
- bool found;
- int i;
+ for (int i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
+ dlist_init(&(WalSndCtl->SyncRepQueue[i]));
- WalSndCtl = (WalSndCtlData *)
- ShmemInitStruct("Wal Sender Ctl", WalSndShmemSize(), &found);
-
- if (!found)
+ for (int i = 0; i < max_wal_senders; i++)
{
- /* First time through, so initialize */
- MemSet(WalSndCtl, 0, WalSndShmemSize());
-
- for (i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
- dlist_init(&(WalSndCtl->SyncRepQueue[i]));
-
- for (i = 0; i < max_wal_senders; i++)
- {
- WalSnd *walsnd = &WalSndCtl->walsnds[i];
-
- SpinLockInit(&walsnd->mutex);
- }
+ WalSnd *walsnd = &WalSndCtl->walsnds[i];
- ConditionVariableInit(&WalSndCtl->wal_flush_cv);
- ConditionVariableInit(&WalSndCtl->wal_replay_cv);
- ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
+ SpinLockInit(&walsnd->mutex);
}
+
+ ConditionVariableInit(&WalSndCtl->wal_flush_cv);
+ ConditionVariableInit(&WalSndCtl->wal_replay_cv);
+ ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index f64c1d59fa3..bf6b81e621b 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -14,41 +14,16 @@
*/
#include "postgres.h"
-#include "access/clog.h"
-#include "access/commit_ts.h"
-#include "access/multixact.h"
-#include "access/nbtree.h"
-#include "access/subtrans.h"
-#include "access/syncscan.h"
-#include "access/twophase.h"
-#include "access/xlogprefetcher.h"
-#include "access/xlogrecovery.h"
-#include "access/xlogwait.h"
-#include "commands/async.h"
#include "miscadmin.h"
#include "pgstat.h"
-#include "postmaster/autovacuum.h"
-#include "postmaster/bgworker_internals.h"
-#include "postmaster/bgwriter.h"
-#include "postmaster/datachecksum_state.h"
-#include "postmaster/walsummarizer.h"
-#include "replication/logicallauncher.h"
-#include "replication/origin.h"
-#include "replication/slot.h"
-#include "replication/slotsync.h"
-#include "replication/walreceiver.h"
-#include "replication/walsender.h"
-#include "storage/aio_subsys.h"
#include "storage/dsm.h"
#include "storage/ipc.h"
+#include "storage/lock.h"
#include "storage/pg_shmem.h"
-#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/shmem_internal.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
-#include "utils/injection_point.h"
-#include "utils/wait_event.h"
/* GUCs */
int shared_memory_type = DEFAULT_SHARED_MEMORY_TYPE;
@@ -57,8 +32,6 @@ shmem_startup_hook_type shmem_startup_hook = NULL;
static Size total_addin_request = 0;
-static void CreateOrAttachShmemStructs(void);
-
/*
* RequestAddinShmemSpace
* Request that extra shmem space be allocated for use by
@@ -97,33 +70,6 @@ CalculateShmemSize(void)
size = 100000;
size = add_size(size, ShmemGetRequestedSize());
- /* legacy subsystems */
- size = add_size(size, LockManagerShmemSize());
- size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, XLOGShmemSize());
- size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, TwoPhaseShmemSize());
- size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, CheckpointerShmemSize());
- size = add_size(size, AutoVacuumShmemSize());
- size = add_size(size, ReplicationSlotsShmemSize());
- size = add_size(size, ReplicationOriginShmemSize());
- size = add_size(size, WalSndShmemSize());
- size = add_size(size, WalRcvShmemSize());
- size = add_size(size, WalSummarizerShmemSize());
- size = add_size(size, PgArchShmemSize());
- size = add_size(size, ApplyLauncherShmemSize());
- size = add_size(size, BTreeShmemSize());
- size = add_size(size, SyncScanShmemSize());
- size = add_size(size, StatsShmemSize());
- size = add_size(size, WaitEventCustomShmemSize());
- size = add_size(size, InjectionPointShmemSize());
- size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, WaitLSNShmemSize());
- size = add_size(size, LogicalDecodingCtlShmemSize());
- size = add_size(size, DataChecksumsShmemSize());
-
/* include additional requested shmem from preload libraries */
size = add_size(size, total_addin_request);
@@ -157,7 +103,6 @@ AttachSharedMemoryStructs(void)
/* Establish pointers to all shared memory areas in this backend */
ShmemAttachRequested();
- CreateOrAttachShmemStructs();
/*
* Now give loadable modules a chance to set up their shmem allocations
@@ -204,9 +149,6 @@ CreateSharedMemoryAndSemaphores(void)
/* Initialize all shmem areas */
ShmemInitRequested();
- /* Initialize legacy subsystems */
- CreateOrAttachShmemStructs();
-
/* Initialize dynamic shared memory facilities. */
dsm_postmaster_startup(shim);
@@ -237,70 +179,6 @@ RegisterBuiltinShmemCallbacks(void)
#undef PG_SHMEM_SUBSYSTEM
}
-/*
- * Initialize various subsystems, setting up their data structures in
- * shared memory.
- *
- * This is called by the postmaster or by a standalone backend.
- * It is also called by a backend forked from the postmaster in the
- * EXEC_BACKEND case. In the latter case, the shared memory segment
- * already exists and has been physically attached to, but we have to
- * initialize pointers in local memory that reference the shared structures,
- * because we didn't inherit the correct pointer values from the postmaster
- * as we do in the fork() scenario. The easiest way to do that is to run
- * through the same code as before. (Note that the called routines mostly
- * check IsUnderPostmaster, rather than EXEC_BACKEND, to detect this case.
- * This is a bit code-wasteful and could be cleaned up.)
- */
-static void
-CreateOrAttachShmemStructs(void)
-{
- /*
- * Set up xlog, clog, and buffers
- */
- XLOGShmemInit();
- XLogPrefetchShmemInit();
- XLogRecoveryShmemInit();
-
- /*
- * Set up lock manager
- */
- LockManagerShmemInit();
-
- /*
- * Set up process table
- */
- BackendStatusShmemInit();
- TwoPhaseShmemInit();
- BackgroundWorkerShmemInit();
-
- /*
- * Set up interprocess signaling mechanisms
- */
- CheckpointerShmemInit();
- AutoVacuumShmemInit();
- ReplicationSlotsShmemInit();
- ReplicationOriginShmemInit();
- WalSndShmemInit();
- WalRcvShmemInit();
- WalSummarizerShmemInit();
- PgArchShmemInit();
- ApplyLauncherShmemInit();
- SlotSyncShmemInit();
- DataChecksumsShmemInit();
-
- /*
- * Set up other modules that need some shared memory space
- */
- BTreeShmemInit();
- SyncScanShmemInit();
- StatsShmemInit();
- WaitEventCustomShmemInit();
- InjectionPointShmemInit();
- WaitLSNShmemInit();
- LogicalDecodingCtlShmemInit();
-}
-
/*
* InitializeShmemGUCs
*
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 798c453ab38..68d5a0389df 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -43,8 +43,10 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/resowner.h"
@@ -312,6 +314,14 @@ typedef struct
static volatile FastPathStrongRelationLockData *FastPathStrongRelationLocks;
+static void LockManagerShmemRequest(void *arg);
+static void LockManagerShmemInit(void *arg);
+
+const ShmemCallbacks LockManagerShmemCallbacks = {
+ .request_fn = LockManagerShmemRequest,
+ .init_fn = LockManagerShmemInit,
+};
+
/*
* Pointers to hash tables containing lock state
@@ -409,6 +419,7 @@ PROCLOCK_PRINT(const char *where, const PROCLOCK *proclockP)
static uint32 proclock_hash(const void *key, Size keysize);
+
static void RemoveLocalLock(LOCALLOCK *locallock);
static PROCLOCK *SetupLockInTable(LockMethod lockMethodTable, PGPROC *proc,
const LOCKTAG *locktag, uint32 hashcode, LOCKMODE lockmode);
@@ -432,21 +443,15 @@ static void GetSingleProcBlockerStatusData(PGPROC *blocked_proc,
/*
- * Initialize the lock manager's shmem data structures.
+ * Register the lock manager's shmem data structures.
*
- * This is called from CreateSharedMemoryAndSemaphores(), which see for more
- * comments. In the normal postmaster case, the shared hash tables are
- * created here, and backends inherit pointers to them via fork(). In the
- * EXEC_BACKEND case, each backend re-executes this code to obtain pointers to
- * the already existing shared hash tables. In either case, each backend must
- * also call InitLockManagerAccess() to create the locallock hash table.
+ * In addition to this, each backend must also call InitLockManagerAccess() to
+ * create the locallock hash table.
*/
-void
-LockManagerShmemInit(void)
+static void
+LockManagerShmemRequest(void *arg)
{
- HASHCTL info;
int64 max_table_size;
- bool found;
/*
* Compute sizes for lock hashtables. Note that these calculations must
@@ -455,45 +460,48 @@ LockManagerShmemInit(void)
max_table_size = NLOCKENTS();
/*
- * Allocate hash table for LOCK structs. This stores per-locked-object
+ * Hash table for LOCK structs. This stores per-locked-object
* information.
*/
- info.keysize = sizeof(LOCKTAG);
- info.entrysize = sizeof(LOCK);
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodLockHash = ShmemInitHash("LOCK hash",
- max_table_size,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ ShmemRequestHash(.name = "LOCK hash",
+ .nelems = max_table_size,
+ .ptr = &LockMethodLockHash,
+ .hash_info.keysize = sizeof(LOCKTAG),
+ .hash_info.entrysize = sizeof(LOCK),
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION,
+ );
/* Assume an average of 2 holders per lock */
max_table_size *= 2;
- /*
- * Allocate hash table for PROCLOCK structs. This stores
- * per-lock-per-holder information.
- */
- info.keysize = sizeof(PROCLOCKTAG);
- info.entrysize = sizeof(PROCLOCK);
- info.hash = proclock_hash;
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodProcLockHash = ShmemInitHash("PROCLOCK hash",
- max_table_size,
- &info,
- HASH_ELEM | HASH_FUNCTION |
- HASH_FIXED_SIZE | HASH_PARTITION);
+ ShmemRequestHash(.name = "PROCLOCK hash",
+ .nelems = max_table_size,
+ .ptr = &LockMethodProcLockHash,
+ .hash_info.keysize = sizeof(PROCLOCKTAG),
+ .hash_info.entrysize = sizeof(PROCLOCK),
+ .hash_info.hash = proclock_hash,
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION,
+ );
+
+ ShmemRequestStruct(.name = "Fast Path Strong Relation Lock Data",
+ .size = sizeof(FastPathStrongRelationLockData),
+ .ptr = (void **) (void *) &FastPathStrongRelationLocks,
+ );
/*
- * Allocate fast-path structures.
+ * FIXME: we used to do this in the size calculation:
+ *
+ * // Since NLOCKENTS is only an estimate, add 10% safety margin.
+ * size = add_size(size, size / 10);
*/
- FastPathStrongRelationLocks =
- ShmemInitStruct("Fast Path Strong Relation Lock Data",
- sizeof(FastPathStrongRelationLockData), &found);
- if (!found)
- SpinLockInit(&FastPathStrongRelationLocks->mutex);
+}
+
+static void
+LockManagerShmemInit(void *arg)
+{
+ SpinLockInit(&FastPathStrongRelationLocks->mutex);
}
/*
@@ -3758,29 +3766,6 @@ PostPrepare_Locks(FullTransactionId fxid)
}
-/*
- * Estimate shared-memory space used for lock tables
- */
-Size
-LockManagerShmemSize(void)
-{
- Size size = 0;
- long max_table_size;
-
- /* lock hash table */
- max_table_size = NLOCKENTS();
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(LOCK)));
-
- /* proclock hash table */
- max_table_size *= 2;
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(PROCLOCK)));
-
- /* fast-path structures */
- size = add_size(size, sizeof(FastPathStrongRelationLockData));
-
- return size;
-}
-
/*
* GetLockStatusData - Return a summary of the lock manager's internal
* status, for use in a user-level reporting function.
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index cd087129469..4cb9c80a2c5 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -18,7 +18,9 @@
#include "pgstat.h"
#include "storage/ipc.h"
#include "storage/proc.h" /* for MyProc */
+#include "storage/shmem.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/ascii.h"
#include "utils/guc.h" /* for application_name */
#include "utils/memutils.h"
@@ -73,133 +75,97 @@ static void pgstat_beshutdown_hook(int code, Datum arg);
static void pgstat_read_current_status(void);
static void pgstat_setup_backend_status_context(void);
+static void BackendStatusShmemRequest(void *arg);
+static void BackendStatusShmemInit(void *arg);
+static void BackendStatusShmemAttach(void *arg);
+
+const ShmemCallbacks BackendStatusShmemCallbacks = {
+ .request_fn = BackendStatusShmemRequest,
+ .init_fn = BackendStatusShmemInit,
+ .attach_fn = BackendStatusShmemAttach,
+};
/*
- * Report shared-memory space needed by BackendStatusShmemInit.
+ * Register shared memory needs for backend status reporting.
*/
-Size
-BackendStatusShmemSize(void)
+static void
+BackendStatusShmemRequest(void *arg)
{
- Size size;
-
- /* BackendStatusArray: */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- /* BackendAppnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendClientHostnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendActivityBuffer: */
- size = add_size(size,
- mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
+ ShmemRequestStruct(.name = "Backend Status Array",
+ .size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendStatusArray,
+ );
+
+ ShmemRequestStruct(.name = "Backend Application Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendAppnameBuffer,
+ );
+
+ ShmemRequestStruct(.name = "Backend Client Host Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendClientHostnameBuffer,
+ );
+
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+ ShmemRequestStruct(.name = "Backend Activity Buffer",
+ .size = BackendActivityBufferSize,
+ .ptr = (void **) &BackendActivityBuffer
+ );
+
#ifdef USE_SSL
- /* BackendSslStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots));
+ ShmemRequestStruct(.name = "Backend SSL Status Buffer",
+ .size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendSslStatusBuffer,
+ );
#endif
+
#ifdef ENABLE_GSS
- /* BackendGssStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots));
+ ShmemRequestStruct(.name = "Backend GSS Status Buffer",
+ .size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendGssStatusBuffer,
+ );
#endif
- return size;
}
/*
* Initialize the shared status array and several string buffers
* during postmaster startup.
*/
-void
-BackendStatusShmemInit(void)
+static void
+BackendStatusShmemInit(void *arg)
{
- Size size;
- bool found;
int i;
char *buffer;
- /* Create or attach to the shared array */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- BackendStatusArray = (PgBackendStatus *)
- ShmemInitStruct("Backend Status Array", size, &found);
-
- if (!found)
+ /* Initialize st_appname pointers. */
+ buffer = BackendAppnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- /*
- * We're the first - initialize.
- */
- MemSet(BackendStatusArray, 0, size);
- }
-
- /* Create or attach to the shared appname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendAppnameBuffer = (char *)
- ShmemInitStruct("Backend Application Name Buffer", size, &found);
-
- if (!found)
- {
- MemSet(BackendAppnameBuffer, 0, size);
-
- /* Initialize st_appname pointers. */
- buffer = BackendAppnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_appname = buffer;
- buffer += NAMEDATALEN;
- }
+ BackendStatusArray[i].st_appname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared client hostname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendClientHostnameBuffer = (char *)
- ShmemInitStruct("Backend Client Host Name Buffer", size, &found);
-
- if (!found)
+ /* Initialize st_clienthostname pointers. */
+ buffer = BackendClientHostnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendClientHostnameBuffer, 0, size);
-
- /* Initialize st_clienthostname pointers. */
- buffer = BackendClientHostnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_clienthostname = buffer;
- buffer += NAMEDATALEN;
- }
+ BackendStatusArray[i].st_clienthostname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared activity buffer */
- BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
- NumBackendStatSlots);
- BackendActivityBuffer = (char *)
- ShmemInitStruct("Backend Activity Buffer",
- BackendActivityBufferSize,
- &found);
-
- if (!found)
+ /* Initialize st_activity pointers. */
+ buffer = BackendActivityBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendActivityBuffer, 0, BackendActivityBufferSize);
-
- /* Initialize st_activity pointers. */
- buffer = BackendActivityBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_activity_raw = buffer;
- buffer += pgstat_track_activity_query_size;
- }
+ BackendStatusArray[i].st_activity_raw = buffer;
+ buffer += pgstat_track_activity_query_size;
}
#ifdef USE_SSL
- /* Create or attach to the shared SSL status buffer */
- size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots);
- BackendSslStatusBuffer = (PgBackendSSLStatus *)
- ShmemInitStruct("Backend SSL Status Buffer", size, &found);
-
- if (!found)
{
PgBackendSSLStatus *ptr;
- MemSet(BackendSslStatusBuffer, 0, size);
-
/* Initialize st_sslstatus pointers. */
ptr = BackendSslStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -211,17 +177,9 @@ BackendStatusShmemInit(void)
#endif
#ifdef ENABLE_GSS
- /* Create or attach to the shared GSSAPI status buffer */
- size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots);
- BackendGssStatusBuffer = (PgBackendGSSStatus *)
- ShmemInitStruct("Backend GSS Status Buffer", size, &found);
-
- if (!found)
{
PgBackendGSSStatus *ptr;
- MemSet(BackendGssStatusBuffer, 0, size);
-
/* Initialize st_gssstatus pointers. */
ptr = BackendGssStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -233,6 +191,13 @@ BackendStatusShmemInit(void)
#endif
}
+static void
+BackendStatusShmemAttach(void *arg)
+{
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+}
+
/*
* Initialize pgstats backend activity state, and set up our on-proc-exit
* hook. Called from InitPostgres and AuxiliaryProcessMain. MyProcNumber must
diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c
index 33fbdca9609..955faf5ebc7 100644
--- a/src/backend/utils/activity/pgstat_shmem.c
+++ b/src/backend/utils/activity/pgstat_shmem.c
@@ -14,6 +14,7 @@
#include "pgstat.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
@@ -57,6 +58,13 @@ static void pgstat_release_matching_entry_refs(bool discard_pending, ReleaseMatc
static void pgstat_setup_memcxt(void);
+static void StatsShmemRequest(void *arg);
+static void StatsShmemInit(void *arg);
+
+const ShmemCallbacks StatsShmemCallbacks = {
+ .request_fn = StatsShmemRequest,
+ .init_fn = StatsShmemInit,
+};
/* parameter for the shared hash */
static const dshash_parameters dsh_params = {
@@ -123,7 +131,7 @@ pgstat_dsa_init_size(void)
/*
* Compute shared memory space needed for cumulative statistics
*/
-Size
+static Size
StatsShmemSize(void)
{
Size sz;
@@ -149,102 +157,98 @@ StatsShmemSize(void)
return sz;
}
+/*
+ * Register shared memory area for cumulative statistics
+ */
+static void
+StatsShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "Shared Memory Stats",
+ .size = StatsShmemSize(),
+ .ptr = (void **) &pgStatLocal.shmem,
+ );
+}
+
/*
* Initialize cumulative statistics system during startup
*/
-void
-StatsShmemInit(void)
+static void
+StatsShmemInit(void *arg)
{
- bool found;
- Size sz;
+ dsa_area *dsa;
+ dshash_table *dsh;
+ PgStat_ShmemControl *ctl = pgStatLocal.shmem;
+ char *p = (char *) ctl;
- sz = StatsShmemSize();
- pgStatLocal.shmem = (PgStat_ShmemControl *)
- ShmemInitStruct("Shared Memory Stats", sz, &found);
+ /* the allocation of pgStatLocal.shmem itself */
+ p += MAXALIGN(sizeof(PgStat_ShmemControl));
- if (!IsUnderPostmaster)
- {
- dsa_area *dsa;
- dshash_table *dsh;
- PgStat_ShmemControl *ctl = pgStatLocal.shmem;
- char *p = (char *) ctl;
+ /*
+ * Create a small dsa allocation in plain shared memory. This is required
+ * because postmaster cannot use dsm segments. It also provides a small
+ * efficiency win.
+ */
+ ctl->raw_dsa_area = p;
+ dsa = dsa_create_in_place(ctl->raw_dsa_area,
+ pgstat_dsa_init_size(),
+ LWTRANCHE_PGSTATS_DSA, NULL);
+ dsa_pin(dsa);
- Assert(!found);
+ /*
+ * To ensure dshash is created in "plain" shared memory, temporarily limit
+ * size of dsa to the initial size of the dsa.
+ */
+ dsa_set_size_limit(dsa, pgstat_dsa_init_size());
- /* the allocation of pgStatLocal.shmem itself */
- p += MAXALIGN(sizeof(PgStat_ShmemControl));
+ /*
+ * With the limit in place, create the dshash table. XXX: It'd be nice if
+ * there were dshash_create_in_place().
+ */
+ dsh = dshash_create(dsa, &dsh_params, NULL);
+ ctl->hash_handle = dshash_get_hash_table_handle(dsh);
- /*
- * Create a small dsa allocation in plain shared memory. This is
- * required because postmaster cannot use dsm segments. It also
- * provides a small efficiency win.
- */
- ctl->raw_dsa_area = p;
- dsa = dsa_create_in_place(ctl->raw_dsa_area,
- pgstat_dsa_init_size(),
- LWTRANCHE_PGSTATS_DSA, NULL);
- dsa_pin(dsa);
+ /* lift limit set above */
+ dsa_set_size_limit(dsa, -1);
- /*
- * To ensure dshash is created in "plain" shared memory, temporarily
- * limit size of dsa to the initial size of the dsa.
- */
- dsa_set_size_limit(dsa, pgstat_dsa_init_size());
+ /*
+ * Postmaster will never access these again, thus free the local
+ * dsa/dshash references.
+ */
+ dshash_detach(dsh);
+ dsa_detach(dsa);
- /*
- * With the limit in place, create the dshash table. XXX: It'd be nice
- * if there were dshash_create_in_place().
- */
- dsh = dshash_create(dsa, &dsh_params, NULL);
- ctl->hash_handle = dshash_get_hash_table_handle(dsh);
+ pg_atomic_init_u64(&ctl->gc_request_count, 1);
- /* lift limit set above */
- dsa_set_size_limit(dsa, -1);
+ /* Do the per-kind initialization */
+ for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ {
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+ char *ptr;
- /*
- * Postmaster will never access these again, thus free the local
- * dsa/dshash references.
- */
- dshash_detach(dsh);
- dsa_detach(dsa);
+ if (!kind_info)
+ continue;
- pg_atomic_init_u64(&ctl->gc_request_count, 1);
+ /* initialize entry count tracking */
+ if (kind_info->track_entry_count)
+ pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
- /* Do the per-kind initialization */
- for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ /* initialize fixed-numbered stats */
+ if (kind_info->fixed_amount)
{
- const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
- char *ptr;
-
- if (!kind_info)
- continue;
-
- /* initialize entry count tracking */
- if (kind_info->track_entry_count)
- pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
-
- /* initialize fixed-numbered stats */
- if (kind_info->fixed_amount)
+ if (pgstat_is_kind_builtin(kind))
+ ptr = ((char *) ctl) + kind_info->shared_ctl_off;
+ else
{
- if (pgstat_is_kind_builtin(kind))
- ptr = ((char *) ctl) + kind_info->shared_ctl_off;
- else
- {
- int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
-
- Assert(kind_info->shared_size != 0);
- ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
- ptr = ctl->custom_data[idx];
- }
-
- kind_info->init_shmem_cb(ptr);
+ int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
+
+ Assert(kind_info->shared_size != 0);
+ ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
+ ptr = ctl->custom_data[idx];
}
+
+ kind_info->init_shmem_cb(ptr);
}
}
- else
- {
- Assert(found);
- }
}
void
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index 2b76967776c..95635c7f56c 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -25,6 +25,7 @@
#include "storage/lmgr.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "storage/spin.h"
#include "utils/wait_event.h"
@@ -95,59 +96,47 @@ static WaitEventCustomCounterData *WaitEventCustomCounter;
static uint32 WaitEventCustomNew(uint32 classId, const char *wait_event_name);
static const char *GetWaitEventCustomIdentifier(uint32 wait_event_info);
+static void WaitEventCustomShmemRequest(void *arg);
+static void WaitEventCustomShmemInit(void *arg);
+
+const ShmemCallbacks WaitEventCustomShmemCallbacks = {
+ .request_fn = WaitEventCustomShmemRequest,
+ .init_fn = WaitEventCustomShmemInit,
+};
+
/*
- * Return the space for dynamic shared hash tables and dynamic allocation counter.
+ * Register shmem space for dynamic shared hash and dynamic allocation counter.
*/
-Size
-WaitEventCustomShmemSize(void)
+static void
+WaitEventCustomShmemRequest(void *arg)
{
- Size sz;
-
- sz = MAXALIGN(sizeof(WaitEventCustomCounterData));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_SIZE,
- sizeof(WaitEventCustomEntryByInfo)));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_SIZE,
- sizeof(WaitEventCustomEntryByName)));
- return sz;
+ ShmemRequestStruct(.name = "WaitEventCustomCounterData",
+ .size = sizeof(WaitEventCustomCounterData),
+ .ptr = (void **) &WaitEventCustomCounter,
+ );
+ ShmemRequestHash(.name = "WaitEventCustom hash by wait event information",
+ .ptr = &WaitEventCustomHashByInfo,
+ .nelems = WAIT_EVENT_CUSTOM_HASH_SIZE,
+ .hash_info.keysize = sizeof(uint32),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByInfo),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ );
+ ShmemRequestHash(.name = "WaitEventCustom hash by name",
+ .ptr = &WaitEventCustomHashByName,
+ .nelems = WAIT_EVENT_CUSTOM_HASH_SIZE,
+ /* key is a NULL-terminated string */
+ .hash_info.keysize = sizeof(char[NAMEDATALEN]),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByName),
+ .hash_flags = HASH_ELEM | HASH_STRINGS,
+ );
}
-/*
- * Allocate shmem space for dynamic shared hash and dynamic allocation counter.
- */
-void
-WaitEventCustomShmemInit(void)
+static void
+WaitEventCustomShmemInit(void *arg)
{
- bool found;
- HASHCTL info;
-
- WaitEventCustomCounter = (WaitEventCustomCounterData *)
- ShmemInitStruct("WaitEventCustomCounterData",
- sizeof(WaitEventCustomCounterData), &found);
-
- if (!found)
- {
- /* initialize the allocation counter and its spinlock. */
- WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
- SpinLockInit(&WaitEventCustomCounter->mutex);
- }
-
- /* initialize or attach the hash tables to store custom wait events */
- info.keysize = sizeof(uint32);
- info.entrysize = sizeof(WaitEventCustomEntryByInfo);
- WaitEventCustomHashByInfo =
- ShmemInitHash("WaitEventCustom hash by wait event information",
- WAIT_EVENT_CUSTOM_HASH_SIZE,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- /* key is a NULL-terminated string */
- info.keysize = sizeof(char[NAMEDATALEN]);
- info.entrysize = sizeof(WaitEventCustomEntryByName);
- WaitEventCustomHashByName =
- ShmemInitHash("WaitEventCustom hash by name",
- WAIT_EVENT_CUSTOM_HASH_SIZE,
- &info,
- HASH_ELEM | HASH_STRINGS);
+ /* initialize the allocation counter and its spinlock. */
+ WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
+ SpinLockInit(&WaitEventCustomCounter->mutex);
}
/*
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index c06b0e9b800..a7c99e097ea 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -17,6 +17,7 @@
*/
#include "postgres.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
#ifdef USE_INJECTION_POINTS
@@ -109,6 +110,11 @@ typedef struct InjectionPointCacheEntry
static HTAB *InjectionPointCache = NULL;
+#ifdef USE_INJECTION_POINTS
+static void InjectionPointShmemRequest(void *arg);
+static void InjectionPointShmemInit(void *arg);
+#endif
+
/*
* injection_point_cache_add
*
@@ -226,45 +232,34 @@ injection_point_cache_get(const char *name)
}
#endif /* USE_INJECTION_POINTS */
-/*
- * Return the space for dynamic shared hash table.
- */
-Size
-InjectionPointShmemSize(void)
-{
+const ShmemCallbacks InjectionPointShmemCallbacks = {
#ifdef USE_INJECTION_POINTS
- Size sz = 0;
-
- sz = add_size(sz, sizeof(InjectionPointsCtl));
- return sz;
-#else
- return 0;
+ .request_fn = InjectionPointShmemRequest,
+ .init_fn = InjectionPointShmemInit,
#endif
-}
+};
/*
- * Allocate shmem space for dynamic shared hash.
+ * Reserve space for the dynamic shared hash table
*/
-void
-InjectionPointShmemInit(void)
-{
#ifdef USE_INJECTION_POINTS
- bool found;
+static void
+InjectionPointShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "InjectionPoint hash",
+ .size = sizeof(InjectionPointsCtl),
+ .ptr = (void **) &ActiveInjectionPoints,
+ );
+}
- ActiveInjectionPoints = ShmemInitStruct("InjectionPoint hash",
- sizeof(InjectionPointsCtl),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
- for (int i = 0; i < MAX_INJECTION_POINTS; i++)
- pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
- }
- else
- Assert(found);
-#endif
+static void
+InjectionPointShmemInit(void *arg)
+{
+ pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
+ for (int i = 0; i < MAX_INJECTION_POINTS; i++)
+ pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
}
+#endif
/*
* Attach a new injection point.
diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index da7503c57b6..3097e9bb1af 100644
--- a/src/include/access/nbtree.h
+++ b/src/include/access/nbtree.h
@@ -1300,8 +1300,6 @@ extern BTCycleId _bt_vacuum_cycleid(Relation rel);
extern BTCycleId _bt_start_vacuum(Relation rel);
extern void _bt_end_vacuum(Relation rel);
extern void _bt_end_vacuum_callback(int code, Datum arg);
-extern Size BTreeShmemSize(void);
-extern void BTreeShmemInit(void);
extern bytea *btoptions(Datum reloptions, bool validate);
extern bool btproperty(Oid index_oid, int attno,
IndexAMProperty prop, const char *propname,
diff --git a/src/include/access/syncscan.h b/src/include/access/syncscan.h
index 24cf33294e5..32f8332aaee 100644
--- a/src/include/access/syncscan.h
+++ b/src/include/access/syncscan.h
@@ -24,7 +24,5 @@ extern PGDLLIMPORT bool trace_syncscan;
extern void ss_report_location(Relation rel, BlockNumber location);
extern BlockNumber ss_get_location(Relation rel, BlockNumber relnblocks);
-extern void SyncScanShmemInit(void);
-extern Size SyncScanShmemSize(void);
#endif
diff --git a/src/include/access/twophase.h b/src/include/access/twophase.h
index 761d56a5f3d..1d2ff42c9b7 100644
--- a/src/include/access/twophase.h
+++ b/src/include/access/twophase.h
@@ -33,9 +33,6 @@ typedef struct GlobalTransactionData *GlobalTransaction;
/* GUC variable */
extern PGDLLIMPORT int max_prepared_xacts;
-extern Size TwoPhaseShmemSize(void);
-extern void TwoPhaseShmemInit(void);
-
extern void AtAbort_Twophase(void);
extern void PostPrepare_Twophase(void);
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 4af38e74ce4..437b4f32349 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -259,8 +259,6 @@ extern void InitLocalDataChecksumState(void);
extern void SetLocalDataChecksumState(uint32 data_checksum_version);
extern bool GetDefaultCharSignedness(void);
extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
-extern Size XLOGShmemSize(void);
-extern void XLOGShmemInit(void);
extern void BootStrapXLOG(uint32 data_checksum_version);
extern void InitializeWalConsistencyChecking(void);
extern void LocalProcessControlFile(bool reset);
diff --git a/src/include/access/xlogprefetcher.h b/src/include/access/xlogprefetcher.h
index 7ec40c4b78b..56a81676d92 100644
--- a/src/include/access/xlogprefetcher.h
+++ b/src/include/access/xlogprefetcher.h
@@ -34,9 +34,6 @@ typedef struct XLogPrefetcher XLogPrefetcher;
extern void XLogPrefetchReconfigure(void);
-extern size_t XLogPrefetchShmemSize(void);
-extern void XLogPrefetchShmemInit(void);
-
extern void XLogPrefetchResetStats(void);
extern XLogPrefetcher *XLogPrefetcherAllocate(XLogReaderState *reader);
diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h
index 2842106b285..ba7750dca0b 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -153,9 +153,6 @@ extern PGDLLIMPORT bool reachedConsistency;
/* Are we currently in standby mode? */
extern PGDLLIMPORT bool StandbyMode;
-extern Size XLogRecoveryShmemSize(void);
-extern void XLogRecoveryShmemInit(void);
-
extern void InitWalRecovery(ControlFileData *ControlFile,
bool *wasShutdown_ptr, bool *haveBackupLabel_ptr,
bool *haveTblspcMap_ptr);
diff --git a/src/include/access/xlogwait.h b/src/include/access/xlogwait.h
index d12531d32b8..07157f220ea 100644
--- a/src/include/access/xlogwait.h
+++ b/src/include/access/xlogwait.h
@@ -100,8 +100,6 @@ typedef struct WaitLSNState
extern PGDLLIMPORT WaitLSNState *waitLSNState;
-extern Size WaitLSNShmemSize(void);
-extern void WaitLSNShmemInit(void);
extern XLogRecPtr GetCurrentLSNForWaitType(WaitLSNType lsnType);
extern void WaitLSNWakeup(WaitLSNType lsnType, XLogRecPtr currentLSN);
extern void WaitLSNCleanup(void);
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 8e3549c3752..2786a7c5ffb 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -541,10 +541,6 @@ typedef struct PgStat_BackendPending
* Functions in pgstat.c
*/
-/* functions called from postmaster */
-extern Size StatsShmemSize(void);
-extern void StatsShmemInit(void);
-
/* Functions called during server startup / shutdown */
extern void pgstat_restore_stats(void);
extern void pgstat_discard_stats(void);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index b21d111d4d5..8954f6b28ee 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,8 +66,4 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
-/* shared memory stuff */
-extern Size AutoVacuumShmemSize(void);
-extern void AutoVacuumShmemInit(void);
-
#endif /* AUTOVACUUM_H */
diff --git a/src/include/postmaster/bgworker_internals.h b/src/include/postmaster/bgworker_internals.h
index b789caf4034..b6261bc01df 100644
--- a/src/include/postmaster/bgworker_internals.h
+++ b/src/include/postmaster/bgworker_internals.h
@@ -41,8 +41,6 @@ typedef struct RegisteredBgWorker
extern PGDLLIMPORT dlist_head BackgroundWorkerList;
-extern Size BackgroundWorkerShmemSize(void);
-extern void BackgroundWorkerShmemInit(void);
extern void BackgroundWorkerStateChange(bool allow_new_workers);
extern void ForgetBackgroundWorker(RegisteredBgWorker *rw);
extern void ReportBackgroundWorkerPID(RegisteredBgWorker *rw);
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 47470cba893..36eea0b1ab0 100644
--- a/src/include/postmaster/bgwriter.h
+++ b/src/include/postmaster/bgwriter.h
@@ -39,9 +39,6 @@ extern bool ForwardSyncRequest(const FileTag *ftag, SyncRequestType type);
extern void AbsorbSyncRequests(void);
-extern Size CheckpointerShmemSize(void);
-extern void CheckpointerShmemInit(void);
-
extern bool FirstCallSinceLastCheckpoint(void);
#endif /* _BGWRITER_H */
diff --git a/src/include/postmaster/datachecksum_state.h b/src/include/postmaster/datachecksum_state.h
index 343494edcc8..05625539604 100644
--- a/src/include/postmaster/datachecksum_state.h
+++ b/src/include/postmaster/datachecksum_state.h
@@ -17,10 +17,6 @@
#include "storage/procsignal.h"
-/* Shared memory */
-extern Size DataChecksumsShmemSize(void);
-extern void DataChecksumsShmemInit(void);
-
/* Possible operations the Datachecksumsworker can perform */
typedef enum DataChecksumsWorkerOperation
{
diff --git a/src/include/postmaster/pgarch.h b/src/include/postmaster/pgarch.h
index faa7609cd81..9772bb573a1 100644
--- a/src/include/postmaster/pgarch.h
+++ b/src/include/postmaster/pgarch.h
@@ -26,8 +26,6 @@
#define MAX_XFN_CHARS 40
#define VALID_XFN_CHARS "0123456789ABCDEF.history.backup.partial"
-extern Size PgArchShmemSize(void);
-extern void PgArchShmemInit(void);
extern bool PgArchCanRestart(void);
pg_noreturn extern void PgArchiverMain(const void *startup_data, size_t startup_data_len);
extern void PgArchWakeup(void);
diff --git a/src/include/postmaster/walsummarizer.h b/src/include/postmaster/walsummarizer.h
index a4c055066b4..b9a755fadbc 100644
--- a/src/include/postmaster/walsummarizer.h
+++ b/src/include/postmaster/walsummarizer.h
@@ -19,8 +19,6 @@
extern PGDLLIMPORT bool summarize_wal;
extern PGDLLIMPORT int wal_summary_keep_time;
-extern Size WalSummarizerShmemSize(void);
-extern void WalSummarizerShmemInit(void);
pg_noreturn extern void WalSummarizerMain(const void *startup_data, size_t startup_data_len);
extern void GetWalSummarizerState(TimeLineID *summarized_tli,
diff --git a/src/include/replication/logicalctl.h b/src/include/replication/logicalctl.h
index 495554c532c..0bc1302f130 100644
--- a/src/include/replication/logicalctl.h
+++ b/src/include/replication/logicalctl.h
@@ -14,8 +14,6 @@
#ifndef LOGICALCTL_H
#define LOGICALCTL_H
-extern Size LogicalDecodingCtlShmemSize(void);
-extern void LogicalDecodingCtlShmemInit(void);
extern void StartupLogicalDecodingStatus(bool last_status);
extern void InitializeProcessXLogLogicalInfo(void);
extern bool ProcessBarrierUpdateXLogLogicalInfo(void);
diff --git a/src/include/replication/logicallauncher.h b/src/include/replication/logicallauncher.h
index 504b710536a..5f0c1b9c682 100644
--- a/src/include/replication/logicallauncher.h
+++ b/src/include/replication/logicallauncher.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT int max_parallel_apply_workers_per_subscription;
extern void ApplyLauncherRegister(void);
extern void ApplyLauncherMain(Datum main_arg);
-extern Size ApplyLauncherShmemSize(void);
-extern void ApplyLauncherShmemInit(void);
-
extern void ApplyLauncherForgetWorkerStartTime(Oid subid);
extern void ApplyLauncherWakeupAtCommit(void);
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index eb46b41b4b7..a69faf6eaaf 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -84,8 +84,4 @@ extern void replorigin_redo(XLogReaderState *record);
extern void replorigin_desc(StringInfo buf, XLogReaderState *record);
extern const char *replorigin_identify(uint8 info);
-/* shared memory allocation */
-extern Size ReplicationOriginShmemSize(void);
-extern void ReplicationOriginShmemInit(void);
-
#endif /* PG_ORIGIN_H */
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 4b4709f6e2c..1a3557de607 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -327,10 +327,6 @@ extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
extern PGDLLIMPORT int idle_replication_slot_timeout_secs;
-/* shmem initialization functions */
-extern Size ReplicationSlotsShmemSize(void);
-extern void ReplicationSlotsShmemInit(void);
-
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
diff --git a/src/include/replication/slotsync.h b/src/include/replication/slotsync.h
index e546d0d050d..d2121cd3ed7 100644
--- a/src/include/replication/slotsync.h
+++ b/src/include/replication/slotsync.h
@@ -31,8 +31,6 @@ pg_noreturn extern void ReplSlotSyncWorkerMain(const void *startup_data, size_t
extern void ShutDownSlotSync(void);
extern bool SlotSyncWorkerCanRestart(void);
extern bool IsSyncingReplicationSlots(void);
-extern Size SlotSyncShmemSize(void);
-extern void SlotSyncShmemInit(void);
extern void SyncReplicationSlots(WalReceiverConn *wrconn);
#endif /* SLOTSYNC_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 85d24c87298..47c07574d4d 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -491,8 +491,6 @@ pg_noreturn extern void WalReceiverMain(const void *startup_data, size_t startup
extern void WalRcvRequestApplyReply(void);
/* prototypes for functions in walreceiverfuncs.c */
-extern Size WalRcvShmemSize(void);
-extern void WalRcvShmemInit(void);
extern void ShutdownWalRcv(void);
extern bool WalRcvStreaming(void);
extern bool WalRcvRunning(void);
diff --git a/src/include/replication/walsender.h b/src/include/replication/walsender.h
index a4df3b8e0ae..8952c848d19 100644
--- a/src/include/replication/walsender.h
+++ b/src/include/replication/walsender.h
@@ -41,8 +41,6 @@ extern void WalSndErrorCleanup(void);
extern void PhysicalWakeupLogicalWalSnd(void);
extern XLogRecPtr GetStandbyFlushRecPtr(TimeLineID *tli);
extern void WalSndSignals(void);
-extern Size WalSndShmemSize(void);
-extern void WalSndShmemInit(void);
extern void WalSndWakeup(bool physical, bool logical);
extern void WalSndInitStopping(void);
extern void WalSndWaitStopping(void);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index fa68e6ecece..ee3cb1dc203 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -375,8 +375,6 @@ typedef enum
/*
* function prototypes
*/
-extern void LockManagerShmemInit(void);
-extern Size LockManagerShmemSize(void);
extern void InitLockManagerAccess(void);
extern LockMethod GetLocksMethodTable(const LOCK *lock);
extern LockMethod GetLockTagsMethodTable(const LOCKTAG *locktag);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index d8e11756a61..5e092552c72 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -32,6 +32,9 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogPrefetchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogRecoveryShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
@@ -40,12 +43,18 @@ PG_SHMEM_SUBSYSTEM(BufferManagerShmemCallbacks)
PG_SHMEM_SUBSYSTEM(StrategyCtlShmemCallbacks)
PG_SHMEM_SUBSYSTEM(BufTableShmemCallbacks)
+/* lock manager */
+PG_SHMEM_SUBSYSTEM(LockManagerShmemCallbacks)
+
/* predicate lock manager */
PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackendStatusShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(TwoPhaseShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackgroundWorkerShmemCallbacks)
/* shared-inval messaging */
PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
@@ -53,9 +62,27 @@ PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
/* interprocess signaling mechanisms */
PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CheckpointerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(AutoVacuumShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationSlotsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationOriginShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSndShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalRcvShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSummarizerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(PgArchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ApplyLauncherShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SlotSyncShmemCallbacks)
/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(BTreeShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SyncScanShmemCallbacks)
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StatsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WaitLSNShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(LogicalDecodingCtlShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(DataChecksumsShmemCallbacks)
/* AIO subsystem. This delegates to the method-specific callbacks */
PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index ddd06304e97..a334e096e4a 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -298,14 +298,6 @@ extern PGDLLIMPORT int pgstat_track_activity_query_size;
extern PGDLLIMPORT PgBackendStatus *MyBEEntry;
-/* ----------
- * Functions called from postmaster
- * ----------
- */
-extern Size BackendStatusShmemSize(void);
-extern void BackendStatusShmemInit(void);
-
-
/* ----------
* Functions called from backends
* ----------
diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index 27a2526524f..fabd1455c3c 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -46,9 +46,6 @@ typedef void (*InjectionPointCallback) (const char *name,
const void *private_data,
void *arg);
-extern Size InjectionPointShmemSize(void);
-extern void InjectionPointShmemInit(void);
-
extern void InjectionPointAttach(const char *name,
const char *library,
const char *function,
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 34c27cc3dc3..86ee348220d 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -42,8 +42,6 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
extern uint32 WaitEventExtensionNew(const char *wait_event_name);
extern uint32 WaitEventInjectionPointNew(const char *wait_event_name);
-extern void WaitEventCustomShmemInit(void);
-extern Size WaitEventCustomShmemSize(void);
extern char **GetWaitEventCustomNames(uint32 classId, int *nwaitevents);
/* ----------
diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index d59c5ad0582..0f1af513673 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -107,9 +107,13 @@ extern PGDLLEXPORT void injection_wait(const char *name,
/* track if injection points attached in this process are linked to it */
static bool injection_point_local = false;
-/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void injection_shmem_request(void *arg);
+static void injection_shmem_init(void *arg);
+
+static const ShmemCallbacks injection_shmem_callbacks = {
+ .request_fn = injection_shmem_request,
+ .init_fn = injection_shmem_init,
+};
/*
* Routine for shared memory area initialization, used as a callback
@@ -126,44 +130,23 @@ injection_point_init_state(void *ptr, void *arg)
ConditionVariableInit(&state->wait_point);
}
-/* Shared memory initialization when loading module */
static void
-injection_shmem_request(void)
+injection_shmem_request(void *arg)
{
- Size size;
-
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- size = MAXALIGN(sizeof(InjectionPointSharedState));
- RequestAddinShmemSpace(size);
+ ShmemRequestStruct(.name = "injection_points",
+ .size = sizeof(InjectionPointSharedState),
+ .ptr = (void **) &inj_state,
+ );
}
static void
-injection_shmem_startup(void)
+injection_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_state = ShmemInitStruct("injection_points",
- sizeof(InjectionPointSharedState),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. This is shared with the dynamic
- * initialization using a DSM.
- */
- injection_point_init_state(inj_state, NULL);
- }
-
- LWLockRelease(AddinShmemInitLock);
+ /*
+ * First time through, so initialize. This is shared with the dynamic
+ * initialization using a DSM.
+ */
+ injection_point_init_state(inj_state, NULL);
}
/*
@@ -601,9 +584,5 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- /* Shared memory initialization */
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = injection_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = injection_shmem_startup;
+ RegisterShmemCallbacks(&injection_shmem_callbacks);
}
diff --git a/src/test/modules/test_aio/test_aio.c b/src/test/modules/test_aio/test_aio.c
index d7530681192..35efba1a5e3 100644
--- a/src/test/modules/test_aio/test_aio.c
+++ b/src/test/modules/test_aio/test_aio.c
@@ -28,7 +28,6 @@
#include "storage/bufmgr.h"
#include "storage/checksum.h"
#include "storage/condition_variable.h"
-#include "storage/ipc.h"
#include "storage/lwlock.h"
#include "storage/proc.h"
#include "storage/procnumber.h"
@@ -44,6 +43,7 @@
PG_MODULE_MAGIC;
+/* In shared memory */
typedef struct InjIoErrorState
{
ConditionVariable cv;
@@ -74,8 +74,15 @@ typedef struct BlocksReadStreamData
static InjIoErrorState *inj_io_error_state;
/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void test_aio_shmem_request(void *arg);
+static void test_aio_shmem_init(void *arg);
+static void test_aio_shmem_attach(void *arg);
+
+static const ShmemCallbacks inj_io_shmem_callbacks = {
+ .request_fn = test_aio_shmem_request,
+ .init_fn = test_aio_shmem_init,
+ .attach_fn = test_aio_shmem_attach,
+};
static PgAioHandle *last_handle;
@@ -83,70 +90,55 @@ static PgAioHandle *last_handle;
static void
-test_aio_shmem_request(void)
+test_aio_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- RequestAddinShmemSpace(sizeof(InjIoErrorState));
+ ShmemRequestStruct(.name = "test_aio injection points",
+ .size = sizeof(InjIoErrorState),
+ .ptr = (void **) &inj_io_error_state,
+ );
}
static void
-test_aio_shmem_startup(void)
+test_aio_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_io_error_state = ShmemInitStruct("injection_points",
- sizeof(InjIoErrorState),
- &found);
-
- if (!found)
- {
- /* First time through, initialize */
- inj_io_error_state->enabled_short_read = false;
- inj_io_error_state->enabled_reopen = false;
- inj_io_error_state->enabled_completion_wait = false;
+ /* First time through, initialize */
+ inj_io_error_state->enabled_short_read = false;
+ inj_io_error_state->enabled_reopen = false;
+ inj_io_error_state->enabled_completion_wait = false;
- ConditionVariableInit(&inj_io_error_state->cv);
- inj_io_error_state->completion_wait_event = WaitEventInjectionPointNew("completion_wait");
+ ConditionVariableInit(&inj_io_error_state->cv);
+ inj_io_error_state->completion_wait_event = WaitEventInjectionPointNew("completion_wait");
#ifdef USE_INJECTION_POINTS
- InjectionPointAttach("aio-process-completion-before-shared",
- "test_aio",
- "inj_io_completion_hook",
- NULL,
- 0);
- InjectionPointLoad("aio-process-completion-before-shared");
-
- InjectionPointAttach("aio-worker-after-reopen",
- "test_aio",
- "inj_io_reopen",
- NULL,
- 0);
- InjectionPointLoad("aio-worker-after-reopen");
+ InjectionPointAttach("aio-process-completion-before-shared",
+ "test_aio",
+ "inj_io_completion_hook",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-process-completion-before-shared");
+
+ InjectionPointAttach("aio-worker-after-reopen",
+ "test_aio",
+ "inj_io_reopen",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-worker-after-reopen");
#endif
- }
- else
- {
- /*
- * Pre-load the injection points now, so we can call them in a
- * critical section.
- */
+}
+
+static void
+test_aio_shmem_attach(void *arg)
+{
+ /*
+ * Pre-load the injection points now, so we can call them in a critical
+ * section.
+ */
#ifdef USE_INJECTION_POINTS
- InjectionPointLoad("aio-process-completion-before-shared");
- InjectionPointLoad("aio-worker-after-reopen");
- elog(LOG, "injection point loaded");
+ InjectionPointLoad("aio-process-completion-before-shared");
+ InjectionPointLoad("aio-worker-after-reopen");
+ elog(LOG, "injection point loaded");
#endif
- }
-
- LWLockRelease(AddinShmemInitLock);
}
void
@@ -155,10 +147,7 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_aio_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_aio_shmem_startup;
+ RegisterShmemCallbacks(&inj_io_shmem_callbacks);
}
--
2.34.1
[text/x-patch] v20260405-0015-resizable-shared-memory-structures.patch (66.2K, 16-v20260405-0015-resizable-shared-memory-structures.patch)
From 77ca1ea3f27d79e8c59a98b89ee46d3dd17be2dc Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Tue, 17 Feb 2026 16:51:20 +0530
Subject: [PATCH v20260405 15/15] resizable shared memory structures
Resizable shared memory structures can be allocated by specifying a new
member, ShmemStructOpts::maximum_size. At startup, or when the
structure is created, we reserve maximum_size worth of address space in
the shared memory segment. The subsystem which creates the structure is
expected to initialize only the initial size worth of memory when
creating it. With mmap'ed memory, this should allocate only the initial
size worth of memory, not maximum_size worth. As the structure is
resized using ShmemResizeStruct(), memory is freed or allocated in units
of memory pages when shrinking or expanding the structure, respectively.
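To illustrate the page-granular behaviour described above, here is a minimal sketch of the arithmetic involved when shrinking. It is not taken from the patch: PageRange, align_up(), align_down() and pages_to_release() are hypothetical names, and the sketch conservatively assumes that only pages lying entirely between the new end and the old end of the structure may be released.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: a byte range [start, end) inside the segment. */
typedef struct PageRange
{
	size_t		start;
	size_t		end;
} PageRange;

/* Round value up to the next multiple of page_size. */
static size_t
align_up(size_t value, size_t page_size)
{
	return (value + page_size - 1) / page_size * page_size;
}

/* Round value down to the previous multiple of page_size. */
static size_t
align_down(size_t value, size_t page_size)
{
	return value / page_size * page_size;
}

/*
 * Compute the whole pages that could be released when a structure that
 * starts at 'offset' shrinks from old_size to new_size bytes.  Pages that
 * are only partially covered by the freed byte range are kept, so the
 * result may be empty (start == end).
 */
static PageRange
pages_to_release(size_t offset, size_t old_size, size_t new_size,
				 size_t page_size)
{
	PageRange	range;

	assert(page_size > 0);
	assert(new_size <= old_size);

	range.start = align_up(offset + new_size, page_size);
	range.end = align_down(offset + old_size, page_size);

	/* No whole page lies between the new and the old end. */
	if (range.end < range.start)
		range.end = range.start;

	return range;
}
```

A range computed this way is the kind of input one would hand to madvise() with MADV_REMOVE; expanding would use the mirror-image computation with MADV_POPULATE_WRITE. Whether the patch rounds the old end down or relies on page-aligned padding instead is not shown here.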
The resizable shared memory feature depends on the madvise() function
and the constants MADV_REMOVE and MADV_POPULATE_WRITE. On platforms
which do not have these, we disable the feature at compile time. The
commit introduces a compile-time flag HAVE_RESIZABLE_SHMEM, which is
defined if MADV_REMOVE and MADV_POPULATE_WRITE exist. We don't check for
madvise() separately, since the existence of the constants implies the
existence of the function.
HAVE_RESIZABLE_SHMEM is not defined in EXEC_BACKEND builds, since those
are largely used for Windows, where the APIs to free and allocate memory
within a given address space are not known to the author right now.
Given that PostgreSQL is used widely on Linux, providing this feature on
Linux benefits most of its users. Once we figure out the required
Windows APIs, we will support this feature on Windows as well.
The feature is also not available when Sys-V shared memory is used, even
on Linux, since we do not know whether the required Sys-V APIs exist;
mostly they don't. Since that combination is only used for development
and testing, not supporting the feature there isn't going to impact
PostgreSQL users.
Using HAVE_RESIZABLE_SHMEM, we disable compiling the code related to
resizable shared memory structures on platforms which do not support
the feature. But we also have run-time checks to disable this feature
when Sys-V shared memory is used. To report whether a given running
server instance supports resizable structures, we have introduced the
GUC have_resizable_shmem.
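The combination of compile-time and run-time checks described above could look roughly like the sketch below. It is an illustration only: the HAVE_DECL_* macros default to 0 here so that the file compiles standalone, the DemoShmemType enum and demo_have_resizable_shmem() are hypothetical stand-ins, and the exact condition the patch uses to define HAVE_RESIZABLE_SHMEM is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-ins for the results of the configure/meson declaration checks.
 * A real build would get these from pg_config.h; they default to 0 here
 * only so that the sketch is self-contained.
 */
#ifndef HAVE_DECL_MADV_REMOVE
#define HAVE_DECL_MADV_REMOVE 0
#endif
#ifndef HAVE_DECL_MADV_POPULATE_WRITE
#define HAVE_DECL_MADV_POPULATE_WRITE 0
#endif

/* Assumed derivation of the compile-time flag; not copied from the patch. */
#if HAVE_DECL_MADV_REMOVE && HAVE_DECL_MADV_POPULATE_WRITE && !defined(EXEC_BACKEND)
#define HAVE_RESIZABLE_SHMEM 1
#endif

/* Hypothetical mirror of the shared_memory_type settings. */
typedef enum DemoShmemType
{
	DEMO_SHMEM_TYPE_WINDOWS,
	DEMO_SHMEM_TYPE_SYSV,
	DEMO_SHMEM_TYPE_MMAP
} DemoShmemType;

/*
 * Value a have_resizable_shmem-style GUC would report: true only when the
 * code was compiled in and the running server uses mmap'ed shared memory.
 */
static bool
demo_have_resizable_shmem(DemoShmemType type)
{
	assert(type >= DEMO_SHMEM_TYPE_WINDOWS && type <= DEMO_SHMEM_TYPE_MMAP);

#ifdef HAVE_RESIZABLE_SHMEM
	return type == DEMO_SHMEM_TYPE_MMAP;
#else
	return false;
#endif
}
```

An extension would guard its resizing code with #ifdef HAVE_RESIZABLE_SHMEM and additionally consult the run-time value before calling ShmemResizeStruct().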
Author: Ashutosh Bapat <[email protected]>
---
configure.ac | 4 +
doc/src/sgml/config.sgml | 15 +
doc/src/sgml/system-views.sgml | 30 +-
doc/src/sgml/xfunc.sgml | 54 +++
meson.build | 16 +
src/backend/port/sysv_shmem.c | 69 ++++
src/backend/port/win32_shmem.c | 23 ++
src/backend/storage/ipc/shmem.c | 269 ++++++++++++--
src/backend/utils/misc/guc_parameters.dat | 7 +
src/backend/utils/misc/guc_tables.c | 7 +
src/include/catalog/pg_proc.dat | 4 +-
src/include/pg_config.h.in | 8 +
src/include/pg_config_manual.h | 9 +
src/include/storage/pg_shmem.h | 5 +
src/include/storage/shmem.h | 15 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/resizable_shmem/Makefile | 25 ++
src/test/modules/resizable_shmem/meson.build | 36 ++
.../resizable_shmem/resizable_shmem--1.0.sql | 37 ++
.../modules/resizable_shmem/resizable_shmem.c | 329 ++++++++++++++++++
.../resizable_shmem/resizable_shmem.control | 4 +
.../resizable_shmem/t/001_resizable_shmem.pl | 235 +++++++++++++
.../test_shmem/t/001_late_shmem_alloc.pl | 23 ++
.../modules/test_shmem/test_shmem--1.0.sql | 4 +
src/test/modules/test_shmem/test_shmem.c | 20 ++
src/test/regress/expected/rules.out | 6 +-
src/tools/pgindent/typedefs.list | 1 +
28 files changed, 1221 insertions(+), 36 deletions(-)
create mode 100644 src/test/modules/resizable_shmem/Makefile
create mode 100644 src/test/modules/resizable_shmem/meson.build
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.c
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.control
create mode 100644 src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
diff --git a/configure.ac b/configure.ac
index ff5dd64468e..7acd844ccb2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1895,6 +1895,10 @@ AC_CHECK_DECLS([memset_s], [], [], [#define __STDC_WANT_LIB_EXT1__ 1
# This is probably only present on macOS, but may as well check always
AC_CHECK_DECLS(F_FULLFSYNC, [], [], [#include <fcntl.h>])
+# Linux-specific madvise constants needed for resizable shared memory. See similar checks in meson.build for explanation of why these checks are here.
+AC_CHECK_DECLS([MADV_POPULATE_WRITE], [], [], [#include <sys/mman.h>])
+AC_CHECK_DECLS([MADV_REMOVE], [], [], [#include <sys/mman.h>])
+
AC_REPLACE_FUNCS(m4_normalize([
explicit_bzero
getopt
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d3fea738ca3..a42a173445e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -12072,6 +12072,21 @@ dynamic_library_path = '/usr/local/lib/postgresql:$libdir'
</listitem>
</varlistentry>
+ <varlistentry id="guc-have-resizable-shmem" xreflabel="have_resizable_shmem">
+ <term><varname>have_resizable_shmem</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>have_resizable_shmem</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Reports whether <productname>PostgreSQL</productname> has been built
+ with <literal>HAVE_RESIZABLE_SHMEM</literal> enabled and supports
+ <link linkend="xfunc-shared-addin-resizable">Resizable shared memory structures</link>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages-status" xreflabel="huge_pages_status">
<term><varname>huge_pages_status</varname> (<type>enum</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 2ebec6928d5..0ec845fe063 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4243,8 +4243,34 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
Size of the allocation in bytes including padding. For anonymous
allocations, no information about padding is available, so the
<literal>size</literal> and <literal>allocated_size</literal> columns
- will always be equal. Padding is not meaningful for free memory, so
- the columns will be equal in that case also.
+ will always be equal. Padding is not meaningful for free memory, so the
+ columns will be equal in that case also. For resizable allocations which
+ may span multiple memory pages, the padding includes the padding due to
+ page alignment.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>maximum_size</structfield> <type>int8</type>
+ </para>
+ <para>
+ Maximum size in bytes that the resizable allocation can grow to. Zero for
+ fixed-size allocations, for anonymous allocations, and for free memory.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>allocated_space</structfield> <type>int8</type>
+ </para>
+ <para>
+ Address space reserved for the allocation in bytes. For resizable
+ structures, this is the total address space reserved to accommodate
+ growth up to <structfield>maximum_size</structfield>, and is greater
+ than or equal to <structfield>allocated_size</structfield>. For
+ fixed-size allocations, anonymous allocations, and free memory this
+ is the same as <structfield>allocated_size</structfield>.
</para></entry>
</row>
</tbody>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index aed3f2f0071..5e00d9134a9 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3748,6 +3748,60 @@ my_shmem_init(void *arg)
</para>
</sect3>
+ <sect3 id="xfunc-shared-addin-resizable">
+ <title>Resizable shared memory structures</title>
+
+ <para>
+ A resizable memory structure can be requested using
+ <function>ShmemRequestStruct</function> by passing
+ <parameter>.maximum_size</parameter> along with
+ <parameter>.size</parameter>. <parameter>.maximum_size</parameter> is the
+ maximum size up to which the structure can grow, whereas
+ <parameter>.size</parameter> is the initial size of the structure. While
+ contiguous address space worth <parameter>maximum_size</parameter> is
+ allocated to the structure, only memory worth <parameter>size</parameter>
+ bytes is allocated initially. The <function>init_fn</function> should only
+ initialize the <parameter>size</parameter> amount of memory. The actual
+ memory allocated to this structure at any point in time is given by <link
+ linkend="view-pg-shmem-allocations"><structname>pg_shmem_allocations</structname>.<structfield>allocated_size</structfield></link>
+ and the address space reserved for this structure is given by <link
+ linkend="view-pg-shmem-allocations"><structname>pg_shmem_allocations</structname>.<structfield>allocated_space</structfield></link>.
+ </para>
+
+ <para>
+ The structure can be resized using <function>ShmemResizeStruct</function> by
+ passing it the structure's <structname>ShmemStructDesc</structname> and the
+ new size, which can be anywhere between 0 and
+ <parameter>maximum_size</parameter>. If the new size is smaller than the
+ current size of the structure, the memory between the new size and the current
+ size is freed while keeping the contents of the memory up to the new size intact.
+ If the new size is greater than the current size, memory is allocated up to
+ the new size while keeping the current contents of the structure intact. The
+ starting address of the structure does not change because of the resizing
+ operation. The caller may need to take care of the additional
+ synchronization between the resizing process and the processes using the
+ shared structure. Note that accessing memory beyond the current size of the
+ structure will not cause a segmentation fault or a bus error: a write
+ access allocates the memory, and a read access returns zeros if the memory
+ has not been allocated yet. The additional synchronization may
+ use mprotect() with PROT_NONE in every backend that may access this memory
+ to ensure that such an access results in a fault.
+ </para>
+
+ <para>
+ This functionality is available only on platforms that provide the APIs
+ necessary to reserve contiguous address space and to allocate or free memory
+ in that address space on demand. The macro <symbol>HAVE_RESIZABLE_SHMEM</symbol>
+ is defined on such platforms. It can be used to guard code related to
+ resizing a shared memory structure. The functionality is available only with
+ mmap'ed memory, so subsystems which use resizable structures may have to
+ additionally disable resizable memory usage when <symbol>shared_memory_type</symbol> is not
+ <symbol>SHMEM_TYPE_MMAP</symbol>. The GUC <xref linkend="guc-have-resizable-shmem"/> is set to
+ <literal>on</literal> when this functionality is available in a running
+ server, <literal>off</literal> otherwise.
+ </para>
+ </sect3>
+
<sect3 id="xfunc-shared-addin-dynamic">
<title>Allocating Dynamic Shared Memory After Startup</title>
diff --git a/meson.build b/meson.build
index 43d5ffc30b1..790845762e1 100644
--- a/meson.build
+++ b/meson.build
@@ -2904,6 +2904,22 @@ decl_checks = [
['timingsafe_bcmp', 'string.h'],
]
+# Linux-specific madvise constants needed for resizable shared memory.
+# Usually we use AC_CHECK_DECLS to check for function declarations, but in this
+# case we are using it to detect existence of constants. These constants are
+# used to define HAVE_RESIZABLE_SHMEM which is used in storage/pg_shmem.h as
+# well as storage/shmem.h. The first abstracts the APIs to allocate shared
+# memory segments from the operating system whereas the second abstracts APIs to
+# allocate shared memory to various subsystems. Since they are related but
+# orthogonal to each other, including any one of them in the other file doesn't
+# make sense. pg_config_manual.h is the only place where HAVE_RESIZABLE_SHMEM
+# can be defined and made available to both without including sys/mman.h. But
+# for that we need constants that indicate the existence of the following defines.
+decl_checks += [
+ ['MADV_POPULATE_WRITE', 'sys/mman.h'],
+ ['MADV_REMOVE', 'sys/mman.h'],
+]
+
# Need to check for function declarations for these functions, because
# checking for library symbols wouldn't handle deployment target
# restrictions on macOS
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 2e3886cf9fe..8d859dfbbfb 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -589,6 +589,27 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
return true;
}
+/*
+ * Get the page size being used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been set up.
+ */
+Size
+GetOSPageSize(void)
+{
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ os_page_size = sysconf(_SC_PAGESIZE);
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
+
/*
* Creates an anonymous mmap()ed shared memory segment.
*
@@ -991,3 +1012,51 @@ PGSharedMemoryDetach(void)
AnonymousShmem = NULL;
}
}
+
+#ifdef HAVE_RESIZABLE_SHMEM
+/*
+ * Make sure that the memory of given size from the given address is released.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureFreed(void *addr, Size size)
+{
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only anonymous shared memory can be freed")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+
+ if (madvise(addr, size, MADV_REMOVE) == -1)
+ ereport(ERROR,
+ (errmsg("could not free shared memory: %m")));
+}
+
+/*
+ * Make sure that the memory of given size from the given address is allocated.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureAllocated(void *addr, Size size)
+{
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only anonymous shared memory can be allocated at runtime")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+
+ if (madvise(addr, size, MADV_POPULATE_WRITE) == -1)
+ ereport(ERROR,
+ (errmsg("could not allocate shared memory: %m")));
+}
+#endif /* HAVE_RESIZABLE_SHMEM */
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index 794e4fcb2ad..dc2ee018845 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -648,3 +648,26 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
}
return true;
}
+
+/*
+ * Get the page size used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been set up.
+ */
+Size
+GetOSPageSize(void)
+{
+ SYSTEM_INFO sysinfo;
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ GetSystemInfo(&sysinfo);
+ os_page_size = sysinfo.dwPageSize;
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 973811e545e..47dcd566fd8 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,11 +19,11 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * This module provides facilities to allocate fixed-size structures in shared
- * memory, for things like variables shared between all backend processes.
- * Each such structure has a string name to identify it, specified when it is
- * requested. shmem_hash.c provides a shared hash table implementation on top
- * of that.
+ * This module provides facilities to allocate fixed-size as well as resizable
+ * structures in shared memory, for things like variables shared between all
+ * backend processes. Each such structure has a string name to identify it,
+ * specified when it is requested. shmem_hash.c provides a shared hash table
+ * implementation on top of fixed-size structures.
*
* Shared memory areas should usually not be allocated after postmaster
* startup, although we do allow small allocations later for the benefit of
@@ -106,6 +106,21 @@
* (*ShmemStructDesc->ptr), and calls the attach_fn callback, if any, for
* additional per-backend setup.
*
+ * Resizable shared memory structures
+ * ----------------------------------
+ *
+ * In order to allocate resizable shared memory structures, set
+ * ShmemRequestStructOpts::maximum_size to the maximum size that the structure
+ * can grow to. The address space for the maximum size will be reserved at
+ * startup, but memory is allocated or freed as the structure grows or shrinks
+ * respectively. ShmemRequestStructOpts::size should be set to the initial size
+ * of the structure, which is the amount of memory allocated at startup.
+ * After startup, the structure can be resized by calling ShmemResizeStruct(),
+ * passing it the name of the structure and the new size.
+ *
+ * While resizable structures can be created after the startup, the memory
+ * available for them is quite limited.
+ *
* Legacy ShmemInitStruct()/ShmemInitHash() functions
* --------------------------------------------------
*
@@ -170,6 +185,18 @@ typedef struct
ShmemRequestKind kind;
} ShmemRequest;
+/*
+ * A convenient macro to get the space required for a shmem request consistently.
+ * A resizable structure, requested by non-zero maximum_size, requires space for
+ * its maximum size.
+ */
+#ifdef HAVE_RESIZABLE_SHMEM
+#define SHMEM_REQUEST_SPACE_SIZE(request) \
+ ((request)->options->maximum_size > 0 ? (request)->options->maximum_size : (request)->options->size)
+#else
+#define SHMEM_REQUEST_SPACE_SIZE(request) ((request)->options->size)
+#endif
+
static List *pending_shmem_requests;
/*
@@ -272,6 +299,10 @@ typedef struct
void *location; /* location in shared mem */
Size size; /* # bytes requested for the structure */
Size allocated_size; /* # bytes actually allocated */
+#ifdef HAVE_RESIZABLE_SHMEM
+ Size maximum_size; /* the maximum size the structure can grow to */
+ Size allocated_space; /* the total address space allocated */
+#endif
} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
@@ -280,6 +311,9 @@ static bool firstNumaTouch = true;
static void CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks);
static void InitShmemIndexEntry(ShmemRequest *request);
static bool AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok);
+#ifdef HAVE_RESIZABLE_SHMEM
+static Size EstimateAllocatedSize(ShmemIndexEnt *entry);
+#endif
Datum pg_numa_available(PG_FUNCTION_ARGS);
@@ -350,6 +384,11 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
+#ifdef HAVE_RESIZABLE_SHMEM
+ if (options->maximum_size < 0 && options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
+ options->maximum_size, options->name);
+#endif
}
else
{
@@ -358,8 +397,24 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->size <= 0)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
+#ifdef HAVE_RESIZABLE_SHMEM
+ if (options->maximum_size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->maximum_size < 0)
+ elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
+ options->maximum_size, options->name);
+#endif
}
+#ifdef HAVE_RESIZABLE_SHMEM
+ if (options->maximum_size > 0 && options->size >= options->maximum_size)
+ elog(ERROR, "resizable shared memory structure \"%s\" should have maximum size (%zd) greater than size (%zd)",
+ options->name, options->maximum_size, options->size);
+
+ if (options->maximum_size > 0 && shared_memory_type != SHMEM_TYPE_MMAP)
+ elog(ERROR, "resizable shared memory requires shared_memory_type = mmap");
+#endif
+
if (shmem_request_state != SRS_REQUESTING)
elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
@@ -379,8 +434,13 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
}
/*
- * ShmemGetRequestedSize() --- estimate the total size of all registered shared
- * memory structures.
+ * ShmemGetRequestedSize() --- estimate the total size of all registered shared
+ * memory structures.
+ *
+ * When maximum_size is specified while requesting a resizable shared memory
+ * structure, we use that, instead of the (initial) size, for the estimation,
+ * to ensure that enough space is reserved for growing the resizable structure
+ * to its maximum size.
*
* This is called once at postmaster startup, before the shared memory segment
* has been created.
@@ -398,7 +458,7 @@ ShmemGetRequestedSize(void)
/* memory needed for all the requested areas */
foreach_ptr(ShmemRequest, request, pending_shmem_requests)
{
- size = add_size(size, request->options->size);
+ size = add_size(size, SHMEM_REQUEST_SPACE_SIZE(request));
/* calculate alignment padding like ShmemAllocRaw() does */
size = TYPEALIGN(Max(request->options->alignment, PG_CACHE_LINE_SIZE),
size);
@@ -506,6 +566,7 @@ InitShmemIndexEntry(ShmemRequest *request)
ShmemIndexEnt *index_entry;
bool found;
size_t allocated_size;
+ size_t requested_size;
void *structPtr;
/* look it up in the shmem index */
@@ -523,10 +584,18 @@ InitShmemIndexEntry(ShmemRequest *request)
}
/*
- * We inserted the entry to the shared memory index. Allocate requested
- * amount of shared memory for it, and initialize the index entry.
+ * We inserted the entry to the shared memory index. Allocate requested
+ * amount of address space in the shared memory segment for it, and do
+ * basic initialization. The memory gets allocated during initialization as
+ * the corresponding memory pages are written to. Allocate enough space
+ * for a resizable structure to grow to its maximum size. It is expected
+ * that the initialization callback will use only as much memory as the
+ * initial size of the resizable structure. (Well, if it doesn't, more
+ * memory will be allocated initially than expected, no further harm is
+ * done.)
*/
- structPtr = ShmemAllocRaw(request->options->size,
+ requested_size = SHMEM_REQUEST_SPACE_SIZE(request);
+ structPtr = ShmemAllocRaw(requested_size,
request->options->alignment,
&allocated_size);
if (structPtr == NULL)
@@ -535,13 +604,22 @@ InitShmemIndexEntry(ShmemRequest *request)
hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
+ errmsg("not enough shared memory space for data structure"
" \"%s\" (%zu bytes requested)",
- name, request->options->size)));
+ name, requested_size)));
}
index_entry->size = request->options->size;
index_entry->allocated_size = allocated_size;
index_entry->location = structPtr;
+#ifdef HAVE_RESIZABLE_SHMEM
+ index_entry->allocated_space = allocated_size;
+ index_entry->maximum_size = request->options->maximum_size;
+ if (request->options->maximum_size > 0)
+ {
+ /* Adjust allocated size of a resizable structure. */
+ index_entry->allocated_size = EstimateAllocatedSize(index_entry);
+ }
+#endif
/* Initialize depending on the kind of shmem area it is */
switch (request->kind)
@@ -586,7 +664,7 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
return false;
}
- /* Check that the size in the index matches the request. */
+ /* Check that the sizes in the index match the request. */
if (index_entry->size != request->options->size &&
request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
{
@@ -596,6 +674,18 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
name, index_entry->size, request->options->size)));
}
+#ifdef HAVE_RESIZABLE_SHMEM
+ if (index_entry->maximum_size != request->options->maximum_size &&
+ request->options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" was created with" \
+ " different maximum_size: existing %zu, requested %zu",
+ name, index_entry->maximum_size,
+ request->options->maximum_size)));
+ }
+#endif
+
/*
* Re-establish the caller's pointer variable, or do other actions to
* attach depending on the kind of shmem area it is.
@@ -617,6 +707,115 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
return true;
}
+#ifdef HAVE_RESIZABLE_SHMEM
+/*
+ * Estimate the actual memory allocated for a resizable structure.
+ *
+ * ... based on the assumption that the memory is allocated in pages.
+ *
+ * The memory pages covered by the current size of a resizable structure are
+ * fully allocated when the currently allocated part of the structure is written
+ * to. The memory page where the maximal structure ends also hosts the next
+ * structure, unless the maximal structure ends on a page boundary. Hence that
+ * page is allocated when the next structure is written to. The memory pages
+ * between the page where the current structure ends and the page where the next
+ * structure starts remain unallocated. Thus the memory allocated for a
+ * resizable structure can be estimated as the total address space reserved for
+ * the structure minus the unallocated memory pages between the current end and
+ * the next structure.
+ */
+static Size
+EstimateAllocatedSize(ShmemIndexEnt *entry)
+{
+ Size page_size = GetOSPageSize();
+ char *align_end = (char *) TYPEALIGN(page_size, (char *) entry->location + entry->size);
+ char *floor_max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) entry->location + entry->maximum_size);
+
+ Assert(entry->maximum_size >= entry->size);
+ Assert(entry->allocated_space >= entry->maximum_size);
+
+ if (align_end < floor_max_end)
+ return entry->allocated_space - (floor_max_end - align_end);
+
+ return entry->allocated_space;
+}
+
+/*
+ * ShmemResizeStruct() --- resize a resizable shared memory structure.
+ *
+ * If the structure is being shrunk, the memory pages that are no longer needed
+ * are freed. If the structure is being expanded, the memory pages that are
+ * needed for the new size are allocated. See EstimateAllocatedSize() for
+ * explanation of which pages are allocated for a resizable structure.
+ */
+void
+ShmemResizeStruct(const char *name, Size new_size)
+{
+ ShmemIndexEnt *result;
+ bool found;
+ Size page_size = GetOSPageSize();
+ char *new_end;
+
+ Assert(new_size > 0);
+
+ /*
+ * Resizable shared memory structures are only supported with mmap'ed
+ * memory.
+ */
+ Assert(shared_memory_type == SHMEM_TYPE_MMAP);
+
+ /* look it up in the shmem index */
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ result = (ShmemIndexEnt *) hash_search(ShmemIndex, name, HASH_FIND, &found);
+ if (!found)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shmem struct \"%s\" is not initialized", name)));
+
+ Assert(result);
+
+ if (result->maximum_size <= 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shared memory struct \"%s\" is not resizable", name)));
+
+ if (result->maximum_size < new_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+ errmsg("not enough address space is reserved for resizing structure \"%s\"" \
+ " (required %zu bytes, reserved %zu bytes)",
+ name, new_size, result->maximum_size)));
+
+ /*
+ * When shrinking, the memory from the page-aligned new end up to the start
+ * of the page containing the end of the reserved space is no longer
+ * required. When expanding, the memory from the start of the page containing
+ * the start of the structure up to the page-aligned new end is required.
+ */
+ new_end = (char *) TYPEALIGN(page_size, (char *) result->location + new_size);
+ if (new_size < result->size)
+ {
+ char *max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location + result->maximum_size);
+
+ if (max_end > new_end)
+ PGSharedMemoryEnsureFreed(new_end, max_end - new_end);
+ }
+ else if (new_size > result->size)
+ {
+ char *struct_start = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location);
+
+ if (new_end > struct_start)
+ PGSharedMemoryEnsureAllocated(struct_start, new_end - struct_start);
+ }
+
+ /* Update shmem index entry. */
+ result->size = new_size;
+ result->allocated_size = EstimateAllocatedSize(result);
+
+ LWLockRelease(ShmemIndexLock);
+}
+#endif /* HAVE_RESIZABLE_SHMEM */
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
@@ -723,6 +922,10 @@ InitShmemAllocator(PGShmemHeader *seghdr)
Assert(!found);
result->size = ShmemAllocator->index_size;
result->allocated_size = ShmemAllocator->index_size;
+#ifdef HAVE_RESIZABLE_SHMEM
+ result->maximum_size = 0;
+ result->allocated_space = result->allocated_size;
+#endif
result->location = ShmemAllocator->index;
}
}
@@ -1064,7 +1267,7 @@ mul_size(Size s1, Size s2)
Datum
pg_get_shmem_allocations(PG_FUNCTION_ARGS)
{
-#define PG_GET_SHMEM_SIZES_COLS 4
+#define PG_GET_SHMEM_SIZES_COLS 6
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
HASH_SEQ_STATUS hstat;
ShmemIndexEnt *ent;
@@ -1086,7 +1289,23 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
values[1] = Int64GetDatum((char *) ent->location - (char *) ShmemSegHdr);
values[2] = Int64GetDatum(ent->size);
values[3] = Int64GetDatum(ent->allocated_size);
+#ifdef HAVE_RESIZABLE_SHMEM
+ values[4] = Int64GetDatum(ent->maximum_size);
+ values[5] = Int64GetDatum(ent->allocated_space);
+
+ /*
+ * Keep track of the total allocated space for named shmem areas, to
+ * be able to calculate the amount of shared memory allocated for
+ * anonymous areas and the amount of free shared memory at the end of
+ * the segment.
+ */
+ named_allocated += ent->allocated_space;
+#else
+ values[4] = Int64GetDatum(0);
+ values[5] = Int64GetDatum(ent->allocated_size);
+
named_allocated += ent->allocated_size;
+#endif
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
values, nulls);
@@ -1097,6 +1316,8 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = true;
values[2] = Int64GetDatum(ShmemAllocator->free_offset - named_allocated);
values[3] = values[2];
+ values[4] = Int64GetDatum(0);
+ values[5] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
/* output as-of-yet unused shared memory */
@@ -1105,6 +1326,8 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = false;
values[2] = Int64GetDatum(ShmemSegHdr->totalsize - ShmemAllocator->free_offset);
values[3] = values[2];
+ values[4] = Int64GetDatum(0);
+ values[5] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
LWLockRelease(ShmemIndexLock);
@@ -1292,23 +1515,9 @@ pg_get_shmem_allocations_numa(PG_FUNCTION_ARGS)
Size
pg_get_shmem_pagesize(void)
{
- Size os_page_size;
-#ifdef WIN32
- SYSTEM_INFO sysinfo;
-
- GetSystemInfo(&sysinfo);
- os_page_size = sysinfo.dwPageSize;
-#else
- os_page_size = sysconf(_SC_PAGESIZE);
-#endif
-
Assert(IsUnderPostmaster);
- Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
-
- if (huge_pages_status == HUGE_PAGES_ON)
- GetHugePageSize(&os_page_size, NULL);
- return os_page_size;
+ return GetOSPageSize();
}
Datum
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index a315c4ab8ab..b4d98a1f610 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -1211,6 +1211,13 @@
max => '1000.0',
},
+{ name => 'have_resizable_shmem', type => 'bool', context => 'PGC_INTERNAL', group => 'PRESET_OPTIONS',
+ short_desc => 'Shows whether the running server supports resizable shared memory.',
+ flags => 'GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE',
+ variable => 'have_resizable_shmem_enabled',
+ boot_val => 'HAVE_RESIZABLE_SHMEM_ENABLED',
+},
+
{ name => 'hba_file', type => 'string', context => 'PGC_POSTMASTER', group => 'FILE_LOCATIONS',
short_desc => 'Sets the server\'s "hba" configuration file.',
flags => 'GUC_SUPERUSER_ONLY',
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d9ca13baff9..6bb08dd10f1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -653,6 +653,13 @@ static bool assert_enabled = DEFAULT_ASSERT_ENABLED;
#endif
static bool exec_backend_enabled = EXEC_BACKEND_ENABLED;
+#ifdef HAVE_RESIZABLE_SHMEM
+#define HAVE_RESIZABLE_SHMEM_ENABLED true
+#else
+#define HAVE_RESIZABLE_SHMEM_ENABLED false
+#endif
+static bool have_resizable_shmem_enabled = HAVE_RESIZABLE_SHMEM_ENABLED;
+
static char *recovery_target_timeline_string;
static char *recovery_target_string;
static char *recovery_target_xid_string;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index bd177aebfcb..db49042d828 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8664,8 +8664,8 @@
{ oid => '5052', descr => 'allocations from the main shared memory segment',
proname => 'pg_get_shmem_allocations', prorows => '50', proretset => 't',
provolatile => 'v', prorettype => 'record', proargtypes => '',
- proallargtypes => '{text,int8,int8,int8}', proargmodes => '{o,o,o,o}',
- proargnames => '{name,off,size,allocated_size}',
+ proallargtypes => '{text,int8,int8,int8,int8,int8}', proargmodes => '{o,o,o,o,o,o}',
+ proargnames => '{name,off,size,allocated_size,maximum_size,allocated_space}',
prosrc => 'pg_get_shmem_allocations',
proacl => '{POSTGRES=X,pg_read_all_stats=X}' },
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 9f6d512347e..8f2a59ec3a8 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -85,6 +85,14 @@
don't. */
#undef HAVE_DECL_F_FULLFSYNC
+/* Define to 1 if you have the declaration of `MADV_POPULATE_WRITE', and to 0
+ if you don't. */
+#undef HAVE_DECL_MADV_POPULATE_WRITE
+
+/* Define to 1 if you have the declaration of `MADV_REMOVE', and to 0 if you
+ don't. */
+#undef HAVE_DECL_MADV_REMOVE
+
/* Define to 1 if you have the declaration of `memset_s', and to 0 if you
don't. */
#undef HAVE_DECL_MEMSET_S
diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index 521b49b8888..b09d6c91324 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -131,6 +131,15 @@
#define EXEC_BACKEND
#endif
+/*
+ * HAVE_RESIZABLE_SHMEM indicates whether resizable shared memory structures are
+ * supported. The implementation requires Linux-specific madvise constants
+ * (MADV_REMOVE and MADV_POPULATE_WRITE).
+ */
+#if HAVE_DECL_MADV_REMOVE && HAVE_DECL_MADV_POPULATE_WRITE && !defined(EXEC_BACKEND)
+#define HAVE_RESIZABLE_SHMEM
+#endif
+
/*
* USE_POSIX_FADVISE controls whether Postgres will attempt to use the
* posix_fadvise() kernel call. Usually the automatic configure tests are
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index 10c7b065861..3d5aceba59c 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -89,6 +89,11 @@ extern PGShmemHeader *PGSharedMemoryCreate(Size size,
PGShmemHeader **shim);
extern bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2);
extern void PGSharedMemoryDetach(void);
+#ifdef HAVE_RESIZABLE_SHMEM
+extern void PGSharedMemoryEnsureFreed(void *addr, Size size);
+extern void PGSharedMemoryEnsureAllocated(void *addr, Size size);
+#endif
extern void GetHugePageSize(Size *hugepagesize, int *mmap_flags);
+extern Size GetOSPageSize(void);
#endif /* PG_SHMEM_H */
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 91218db6d6e..122bf7943ca 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -57,6 +57,18 @@ typedef struct ShmemStructOpts
*/
size_t alignment;
+#ifdef HAVE_RESIZABLE_SHMEM
+
+ /*
+ * Maximum size up to which this structure can grow. The memory is not
+ * allocated right away but the corresponding address space is reserved so
+ * that memory can be mapped to it when the structure grows. Typically
+ * should be used for large resizable structures which need several pages
+ * worth of contiguous memory.
+ */
+ ssize_t maximum_size;
+#endif
+
/*
* When the shmem area is initialized or attached to, pointer to it is
* stored in *ptr. It usually points to a global variable, used to access
@@ -166,6 +178,9 @@ typedef struct ShmemCallbacks
extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
extern bool ShmemAddrIsValid(const void *addr);
+#ifdef HAVE_RESIZABLE_SHMEM
+extern void ShmemResizeStruct(const char *name, Size new_size);
+#endif
/*
* These macros provide syntactic sugar for calling the underlying functions
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index f1b04c99969..2a1e746bf0c 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -14,6 +14,7 @@ SUBDIRS = \
libpq_pipeline \
oauth_validator \
plsample \
+ resizable_shmem \
spgist_name_ops \
test_aio \
test_binaryheap \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index fc99552d9ab..cd94e1fea15 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -13,6 +13,7 @@ subdir('libpq_pipeline')
subdir('nbtree')
subdir('oauth_validator')
subdir('plsample')
+subdir('resizable_shmem')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
diff --git a/src/test/modules/resizable_shmem/Makefile b/src/test/modules/resizable_shmem/Makefile
new file mode 100644
index 00000000000..82dae722aad
--- /dev/null
+++ b/src/test/modules/resizable_shmem/Makefile
@@ -0,0 +1,25 @@
+# src/test/modules/resizable_shmem/Makefile
+
+PGFILEDESC = "resizable_shmem - test module for resizable shared memory"
+
+MODULES = resizable_shmem
+
+EXTENSION = resizable_shmem
+DATA = resizable_shmem--1.0.sql
+
+TAP_TESTS = 1
+
+# This test requires library to be loaded at the server start, so disable
+# installcheck
+NO_INSTALLCHECK = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/resizable_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/src/makefiles/pgxs.mk
+endif
diff --git a/src/test/modules/resizable_shmem/meson.build b/src/test/modules/resizable_shmem/meson.build
new file mode 100644
index 00000000000..493bbbc95c3
--- /dev/null
+++ b/src/test/modules/resizable_shmem/meson.build
@@ -0,0 +1,36 @@
+# src/test/modules/resizable_shmem/meson.build
+
+resizable_shmem_sources = files(
+ 'resizable_shmem.c',
+)
+
+if host_system == 'windows'
+ resizable_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'resizable_shmem',
+ '--FILEDESC', 'resizable_shmem - test module for resizable shared memory',])
+endif
+
+resizable_shmem = shared_module('resizable_shmem',
+ resizable_shmem_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += resizable_shmem
+
+test_install_data += files(
+ 'resizable_shmem.control',
+ 'resizable_shmem--1.0.sql',
+)
+
+tests += {
+ 'name': 'resizable_shmem',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_resizable_shmem.pl',
+ ],
+ # This test requires library to be loaded at the server start, so disable
+ # installcheck
+ 'runningcheck': false,
+ },
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
new file mode 100644
index 00000000000..c1bcb6117b6
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
@@ -0,0 +1,37 @@
+/* src/test/modules/resizable_shmem/resizable_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION resizable_shmem" to load this file. \quit
+
+-- Function to resize the test structure in the shared memory
+CREATE FUNCTION resizable_shmem_resize(new_entries integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to write data to all entries in the test structure in shared memory
+-- Writing all the entries makes sure that the memory is actually allocated and
+-- mapped to the process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_write(entry_value integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to verify that specified number of initial entries have expected value.
+-- Reading all the entries makes sure that the memory is actually mapped to the
+-- process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_read(entry_count integer, entry_value integer)
+RETURNS boolean
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to report memory usage statistics of the calling backend
+CREATE FUNCTION resizable_shmem_usage(OUT rss_anon bigint, OUT rss_file bigint, OUT rss_shmem bigint, OUT vm_size bigint)
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to get the shared memory page size
+CREATE FUNCTION resizable_shmem_pagesize()
+RETURNS integer
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.c b/src/test/modules/resizable_shmem/resizable_shmem.c
new file mode 100644
index 00000000000..8aff267db46
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.c
@@ -0,0 +1,329 @@
+/* -------------------------------------------------------------------------
+ *
+ * resizable_shmem.c
+ * Test module for PostgreSQL's resizable shared memory functionality
+ *
+ * This module demonstrates and tests the resizable shared memory API
+ * provided by shmem.c/shmem.h.
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "commands/extension.h"
+#include "fmgr.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+#include "storage/spin.h"
+#include "utils/builtins.h"
+#include "utils/guc.h"
+#include "utils/memutils.h"
+#include "utils/timestamp.h"
+#include "access/htup_details.h"
+
+#include <stdio.h>
+
+PG_MODULE_MAGIC;
+
+/* Default values for the GUCs controlling structure size */
+#define TEST_INITIAL_ENTRIES_DEFAULT (25 * 1024 * 1024) /* ~100MB */
+#define TEST_MAX_ENTRIES_DEFAULT (100 * 1024 * 1024) /* ~400MB */
+
+#define TEST_ENTRY_SIZE sizeof(int32) /* Size of each entry */
+
+/*
+ * Resizable test data structure stored in shared memory.
+ *
+ * The test performs resizing, reads or writes, only one at a time and never
+ * concurrently. Hence, there is no need for locks in the test structure.
+ */
+typedef struct TestResizableShmemStruct
+{
+ /* Metadata */
+ int32 num_entries; /* Number of entries that can fit */
+
+ /* Data area - variable size */
+ int32 data[FLEXIBLE_ARRAY_MEMBER];
+} TestResizableShmemStruct;
+
+static TestResizableShmemStruct *resizable_shmem = NULL;
+
+/* GUC variables controlling the size of the test structure */
+static int test_initial_entries;
+static int test_max_entries;
+
+/* Whether to use SHMEM_ATTACH_UNKNOWN_SIZE when attaching to the shared memory */
+static bool use_unknown_size = false;
+
+static void resizable_shmem_request(void *arg);
+static void resizable_shmem_shmem_init(void *arg);
+
+static ShmemCallbacks shmem_callbacks = {
+ .request_fn = resizable_shmem_request,
+ .init_fn = resizable_shmem_shmem_init,
+};
+
+/* SQL-callable functions */
+PG_FUNCTION_INFO_V1(resizable_shmem_resize);
+PG_FUNCTION_INFO_V1(resizable_shmem_write);
+PG_FUNCTION_INFO_V1(resizable_shmem_read);
+PG_FUNCTION_INFO_V1(resizable_shmem_usage);
+PG_FUNCTION_INFO_V1(resizable_shmem_pagesize);
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+ int guc_context;
+
+ /*
+ * Use PGC_POSTMASTER when loaded at startup so the values are fixed once
+ * the shared memory segment is created. When loaded after startup
+ * PGC_POSTMASTER is not allowed, so we use PGC_SIGHUP instead. Although
+ * we do not intend to change these values at config reload, PGC_SIGHUP is
+ * the least permissive context that allows defining the GUC after startup
+ * and still prevents it from being changed via SET.
+ */
+ if (process_shared_preload_libraries_in_progress)
+ guc_context = PGC_POSTMASTER;
+ else
+ {
+ guc_context = PGC_SIGHUP;
+ shmem_callbacks.flags = SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP;
+ }
+
+ DefineCustomIntVariable("resizable_shmem.initial_entries",
+ "Initial number of entries in the test structure.",
+ NULL,
+ &test_initial_entries,
+ TEST_INITIAL_ENTRIES_DEFAULT,
+ 1,
+ INT_MAX,
+ guc_context,
+ 0,
+ NULL, NULL, NULL);
+
+ DefineCustomIntVariable("resizable_shmem.max_entries",
+ "Maximum number of entries in the test structure.",
+ NULL,
+ &test_max_entries,
+ TEST_MAX_ENTRIES_DEFAULT,
+ 1,
+ INT_MAX,
+ guc_context,
+ 0,
+ NULL, NULL, NULL);
+
+ /*
+ * When loaded after startup by a backend that is not creating the
+ * extension, the shared memory might have been resized to a size other
+ * than the initial size. Use SHMEM_ATTACH_UNKNOWN_SIZE to attach without
+ * knowing the exact size.
+ */
+ if (!process_shared_preload_libraries_in_progress && !creating_extension)
+ use_unknown_size = true;
+
+ RegisterShmemCallbacks(&shmem_callbacks);
+}
+
+/*
+ * Request shared memory resources
+ */
+static void
+resizable_shmem_request(void *arg)
+{
+ Size initial_size = add_size(offsetof(TestResizableShmemStruct, data),
+ mul_size(test_initial_entries, TEST_ENTRY_SIZE));
+#ifdef HAVE_RESIZABLE_SHMEM
+ Size max_size = add_size(offsetof(TestResizableShmemStruct, data),
+ mul_size(test_max_entries, TEST_ENTRY_SIZE));
+#endif
+
+ /* Register our resizable shared memory structure */
+ ShmemRequestStruct(.name = "resizable_shmem",
+ .size = use_unknown_size ? SHMEM_ATTACH_UNKNOWN_SIZE : initial_size,
+#ifdef HAVE_RESIZABLE_SHMEM
+ .maximum_size = max_size,
+#endif
+ .ptr = (void **) &resizable_shmem,
+ );
+}
+
+/*
+ * Initialize shared memory structure
+ */
+static void
+resizable_shmem_shmem_init(void *arg)
+{
+ /*
+ * Shared memory structure should have been already allocated. Initialize
+ * it.
+ */
+ Assert(resizable_shmem != NULL);
+
+ resizable_shmem->num_entries = test_initial_entries;
+ memset(resizable_shmem->data, 0, mul_size(test_initial_entries, TEST_ENTRY_SIZE));
+}
+
+/*
+ * Resize the shared memory structure to accommodate the specified number of
+ * entries.
+ */
+Datum
+resizable_shmem_resize(PG_FUNCTION_ARGS)
+{
+#ifdef HAVE_RESIZABLE_SHMEM
+ int32 new_entries = PG_GETARG_INT32(0);
+ Size new_size;
+
+ if (!resizable_shmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ new_size = add_size(offsetof(TestResizableShmemStruct, data),
+ mul_size(new_entries, TEST_ENTRY_SIZE));
+ ShmemResizeStruct("resizable_shmem", new_size);
+ resizable_shmem->num_entries = new_entries;
+
+ PG_RETURN_VOID();
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#endif
+}
+
+/*
+ * Write the given integer value to all entries in the data array.
+ */
+Datum
+resizable_shmem_write(PG_FUNCTION_ARGS)
+{
+ int32 entry_value = PG_GETARG_INT32(0);
+ int32 i;
+
+ if (!resizable_shmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ /* Write the value to all current entries */
+ for (i = 0; i < resizable_shmem->num_entries; i++)
+ resizable_shmem->data[i] = entry_value;
+
+ PG_RETURN_VOID();
+}
+
+/*
+ * Check whether the first 'entry_count' entries all have the expected 'entry_value'.
+ * Returns true if all match, false otherwise.
+ */
+Datum
+resizable_shmem_read(PG_FUNCTION_ARGS)
+{
+ int32 entry_count = PG_GETARG_INT32(0);
+ int32 entry_value = PG_GETARG_INT32(1);
+ int32 i;
+
+ if (resizable_shmem == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ if (entry_count < 0 || entry_count > resizable_shmem->num_entries)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("entry_count %d is out of range (0..%d)", entry_count, resizable_shmem->num_entries)));
+
+ for (i = 0; i < entry_count; i++)
+ {
+ if (resizable_shmem->data[i] != entry_value)
+ PG_RETURN_BOOL(false);
+ }
+
+ PG_RETURN_BOOL(true);
+}
+
+/*
+ * Report multiple memory usage statistics of the calling backend process
+ * as reported by the kernel.
+ * Returns RssAnon, RssFile, RssShmem, VmSize from /proc/self/status as a record.
+ *
+ * The function assumes that these values are available in
+ * /proc/self/status on any system that also supports madvise() with
+ * MADV_REMOVE and MADV_POPULATE_WRITE.
+ */
+Datum
+resizable_shmem_usage(PG_FUNCTION_ARGS)
+{
+#ifdef HAVE_RESIZABLE_SHMEM
+ FILE *f;
+ char line[256];
+ int64 rss_anon_kb = -1;
+ int64 rss_file_kb = -1;
+ int64 rss_shmem_kb = -1;
+ int64 vm_size_kb = -1;
+ int found = 0;
+ TupleDesc tupdesc;
+ Datum values[4];
+ bool nulls[4];
+ HeapTuple tuple;
+
+ /* Open /proc/self/status to read memory information */
+ f = fopen("/proc/self/status", "r");
+ if (f == NULL)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open /proc/self/status: %m")));
+
+ /* Look for the memory usage lines */
+ while (fgets(line, sizeof(line), f) != NULL && found < 4)
+ {
+ if (rss_anon_kb == -1 && sscanf(line, "RssAnon: %" SCNd64 " kB", &rss_anon_kb) == 1)
+ found++;
+ else if (rss_file_kb == -1 && sscanf(line, "RssFile: %" SCNd64 " kB", &rss_file_kb) == 1)
+ found++;
+ else if (rss_shmem_kb == -1 && sscanf(line, "RssShmem: %" SCNd64 " kB", &rss_shmem_kb) == 1)
+ found++;
+ else if (vm_size_kb == -1 && sscanf(line, "VmSize: %" SCNd64 " kB", &vm_size_kb) == 1)
+ found++;
+ }
+
+ fclose(f);
+
+ /* Build tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept a record")));
+
+ /* Build the result tuple */
+ values[0] = Int64GetDatum(rss_anon_kb >= 0 ? rss_anon_kb * 1024 : 0);
+ values[1] = Int64GetDatum(rss_file_kb >= 0 ? rss_file_kb * 1024 : 0);
+ values[2] = Int64GetDatum(rss_shmem_kb >= 0 ? rss_shmem_kb * 1024 : 0);
+ values[3] = Int64GetDatum(vm_size_kb >= 0 ? vm_size_kb * 1024 : 0);
+
+ nulls[0] = nulls[1] = nulls[2] = nulls[3] = false;
+
+ tuple = heap_form_tuple(tupdesc, values, nulls);
+ PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#endif
+}
+
+/*
+ * resizable_shmem_pagesize() - Get the shared memory page size
+ */
+Datum
+resizable_shmem_pagesize(PG_FUNCTION_ARGS)
+{
+ PG_RETURN_INT32(pg_get_shmem_pagesize());
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.control b/src/test/modules/resizable_shmem/resizable_shmem.control
new file mode 100644
index 00000000000..8031303fe0e
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.control
@@ -0,0 +1,4 @@
+# resizable_shmem extension test module
+comment = 'test module for testing resizable shared memory structure functionality'
+default_version = '1.0'
+module_pathname = '$libdir/resizable_shmem'
diff --git a/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
new file mode 100644
index 00000000000..b3132ea1ae3
--- /dev/null
+++ b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
@@ -0,0 +1,235 @@
+# Copyright (c) 2025-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Test resizable shared memory functionality, both when loaded at startup via
+# shared_preload_libraries and when loaded after startup (late allocation).
+
+# Verify that RssShmem does not exceed the total allocated shared memory.
+# Allocated shared memory should be mostly the memory allocated to the
+# resizable_shmem structure. Any large increase in expected RssShmem should
+# reflect the unexpected increase in memory allocated to the resizable_shmem
+# structure.
+sub check_shmem_usage
+{
+ my ($session, $label, $node) = @_;
+
+ my $rss_shmem = $session->query_safe('SELECT rss_shmem FROM resizable_shmem_usage();',
+ verbose => 0);
+ my $total_alloc = $node->safe_psql('postgres',
+ "SELECT sum(allocated_size) FROM pg_shmem_allocations;");
+
+ note "$label: RssShmem=$rss_shmem, sum(allocated_size)=$total_alloc";
+ ok($rss_shmem <= $total_alloc, "$label: RssShmem does not exceed total allocated size");
+}
+
+# Test a resize operation: resize, verify old data, write new data, verify
+# new data, and check shmem usage. Returns updated ($num_entries, $value).
+sub test_resize
+{
+ my ($node, $prefix, $old_num_entries, $old_value, $new_num_entries, $new_value, $label) = @_;
+
+ $label = "$prefix: $label";
+
+ my $session1 = $node->background_psql('postgres');
+ my $session2 = $node->background_psql('postgres');
+
+ $session1->query_safe("SELECT resizable_shmem_resize($new_num_entries);",
+ verbose => 0);
+
+ # Old data should still be intact in the (possibly smaller) area
+ my $readable_entries = ($new_num_entries < $old_num_entries) ? $new_num_entries : $old_num_entries;
+ is($session1->query_safe("SELECT resizable_shmem_read($readable_entries, $old_value);",
+ verbose => 0),
+ 't', "old data readable after $label");
+
+ $session2->query_safe("SELECT resizable_shmem_write($new_value);",
+ verbose => 0);
+ is($session1->query_safe("SELECT resizable_shmem_read($new_num_entries, $new_value);",
+ verbose => 0),
+ 't', "new data readable after $label");
+
+ check_shmem_usage($session1, "$label (session 1)", $node);
+ check_shmem_usage($session2, "$label (session 2)", $node);
+
+ $session1->quit;
+ $session2->quit;
+
+ return ($new_num_entries, $new_value);
+}
+
+# Run the full suite of resizable shared memory tests on the given node.
+sub run_resizable_tests
+{
+ my ($node, $initial_entries, $max_entries, $prefix) = @_;
+
+ my $have_resizable_shmem = $node->safe_psql('postgres', 'SHOW have_resizable_shmem;') eq 'on';
+
+ my $num_entries = $initial_entries;
+
+ # Basic read/write should work on all platforms
+ my $value = 100;
+ $node->safe_psql('postgres', "SELECT resizable_shmem_write($value);");
+ is($node->safe_psql('postgres', "SELECT resizable_shmem_read($num_entries, $value);"),
+ 't', "$prefix: data read after write successful");
+
+ if ($have_resizable_shmem)
+ {
+ # Initial structure state
+ my $session1 = $node->background_psql('postgres');
+ my $session2 = $node->background_psql('postgres');
+
+ $value = 100;
+ # Write and read the initial set of entries.
+ $session1->query_safe("SELECT resizable_shmem_write($value);", verbose => 0);
+ is($session2->query_safe("SELECT resizable_shmem_read($num_entries, $value);",
+ verbose => 0),
+ 't', "$prefix: data read after write successful");
+ check_shmem_usage($session1, "$prefix: initial write (session 1)", $node);
+ check_shmem_usage($session2, "$prefix: initial write (session 2)", $node);
+ $session1->quit;
+ $session2->quit;
+
+ # Verify no other structure is resizable
+ is($node->safe_psql('postgres', "SELECT count(*) FROM pg_shmem_allocations WHERE name <> 'resizable_shmem' AND maximum_size <> 0;"),
+ '0', "$prefix: no other resizable structures");
+
+ # Resize to maximum
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $max_entries, 500, 'resize to maximum');
+
+ # Shrink to 75% of max
+ my $shrink_entries = int($max_entries * 3 / 4);
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $shrink_entries, 999, 'shrinking');
+
+ # Resize to the same size (no-op)
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $num_entries, 1999, 'no-op resize');
+
+ # Test resize failure (attempt to resize beyond max - should fail)
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', "SELECT resizable_shmem_resize(" . ($max_entries * 2) . ");");
+ ok($ret != 0 || $stderr =~ /ERROR/, "$prefix: Resize beyond maximum fails");
+ }
+ else
+ {
+ # On unsupported platforms, resizing should fail with a clear error
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', "SELECT resizable_shmem_resize($num_entries);");
+ ok($ret != 0, "$prefix: resize fails on unsupported platform");
+ like($stderr, qr/not supported/, "$prefix: resize error mentions not supported");
+ }
+}
+
+### Set up a test node.
+#
+# Configure minimal shared memory so that the resizable_shmem structure dominates
+# and any unexpected increase is easy to detect.
+###
+my $node = PostgreSQL::Test::Cluster->new('resizable_shmem');
+$node->init;
+
+$node->append_conf('postgresql.conf', 'shared_buffers = 128kB');
+$node->append_conf('postgresql.conf', 'max_connections = 5');
+$node->append_conf('postgresql.conf', 'max_worker_processes = 0');
+$node->append_conf('postgresql.conf', 'max_wal_senders = 0');
+$node->append_conf('postgresql.conf', 'max_prepared_transactions = 0');
+$node->append_conf('postgresql.conf', 'max_locks_per_transaction = 10');
+$node->append_conf('postgresql.conf', 'max_pred_locks_per_transaction = 10');
+$node->append_conf('postgresql.conf', 'wal_buffers = 32kB');
+
+###
+# Test 1: Startup allocation via shared_preload_libraries
+###
+my $startup_initial = 25 * 1024 * 1024;
+my $startup_max = 100 * 1024 * 1024;
+
+$node->append_conf('postgresql.conf', 'shared_preload_libraries = resizable_shmem');
+$node->append_conf('postgresql.conf', "resizable_shmem.initial_entries = $startup_initial");
+$node->append_conf('postgresql.conf', "resizable_shmem.max_entries = $startup_max");
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+run_resizable_tests($node, $startup_initial, $startup_max, 'startup');
+
+my $have_resizable_shmem = $node->safe_psql('postgres', 'SHOW have_resizable_shmem;') eq 'on';
+
+###
+# Test 2: Late allocation (loaded after startup, not in shared_preload_libraries).
+# Use much smaller sizes since only ~100KB of shared memory is available for
+# structures allocated after startup.
+###
+my $late_initial = 5 * 1024;
+my $late_max = 12 * 1024;
+
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM RESET shared_preload_libraries;
+ ALTER SYSTEM SET resizable_shmem.initial_entries = $late_initial;
+ ALTER SYSTEM SET resizable_shmem.max_entries = $late_max;
+});
+$node->safe_psql('postgres', 'DROP EXTENSION resizable_shmem;');
+$node->restart;
+
+$node->safe_psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+run_resizable_tests($node, $late_initial, $late_max, 'late');
+
+###
+# Test sysv shared memory does not support resizable shmem. Only relevant on
+# platforms that support resizable shmem (HAVE_RESIZABLE_SHMEM), since the
+# module only sets maximum_size in that case.
+###
+if ($have_resizable_shmem)
+{
+ ###
+ # Test 3: Verify that CREATE EXTENSION fails with sysv shared memory
+ # when loaded after startup (not in shared_preload_libraries).
+ ###
+ $node->safe_psql('postgres', 'DROP EXTENSION resizable_shmem;');
+
+ # Remove settings that would cause the library to auto-load at startup:
+ # shared_preload_libraries and module-prefixed GUCs. ALTER SYSTEM RESET
+ # only affects postgresql.auto.conf, so we must use adjust_conf to remove
+ # from postgresql.conf.
+ $node->adjust_conf('postgresql.conf', 'shared_preload_libraries', undef);
+ $node->adjust_conf('postgresql.conf', 'resizable_shmem.initial_entries', undef);
+ $node->adjust_conf('postgresql.conf', 'resizable_shmem.max_entries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'shared_preload_libraries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'resizable_shmem.initial_entries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'resizable_shmem.max_entries', undef);
+ $node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET shared_memory_type = 'sysv';
+ });
+
+ $node->restart;
+
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+ ok($ret != 0, 'CREATE EXTENSION fails with resizable shmem on sysv');
+ like($stderr, qr/resizable shared memory requires shared_memory_type = mmap/,
+ 'CREATE EXTENSION error mentions shared_memory_type = mmap requirement');
+
+ ###
+ # Test 4: Verify that resizable structures are also rejected with sysv
+ # shared memory when loaded at startup via shared_preload_libraries.
+ ###
+ $node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET shared_preload_libraries = 'resizable_shmem';
+ ALTER SYSTEM SET resizable_shmem.initial_entries = $startup_initial;
+ ALTER SYSTEM SET resizable_shmem.max_entries = $startup_max;
+ });
+ $node->stop;
+
+ ok(!$node->start(fail_ok => 1),
+ 'server fails to start with resizable shmem on sysv');
+
+ my $log = slurp_file($node->logfile);
+ like($log, qr/resizable shared memory requires shared_memory_type = mmap/,
+ 'log mentions shared_memory_type = mmap requirement');
+}
+
+done_testing();
diff --git a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
index c154f57682a..c89b140871f 100644
--- a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
+++ b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
@@ -45,5 +45,28 @@ else
ok($attach_count1 == 0 && $attach_count2 == 0, "attach callback is not called when loaded via shared_preload_libraries");
}
+###
+# Test that a fixed-size shared memory structure cannot be resized.
+# Only relevant on platforms that support resizable shmem.
+###
+my $have_resizable_shmem =
+ $node->safe_psql('postgres', 'SHOW have_resizable_shmem;') eq 'on';
+
+if ($have_resizable_shmem)
+{
+ # Try expanding the fixed-size structure
+ my ($ret, $stdout, $stderr) =
+ $node->psql("postgres", "SELECT test_shmem_resize_fixed(1000);");
+ isnt($ret, 0, "expanding a fixed-size structure fails");
+ like($stderr, qr/is not resizable/, "expand error message mentions not resizable");
+
+ # Try shrinking the fixed-size structure
+ ($ret, $stdout, $stderr) =
+ $node->psql("postgres", "SELECT test_shmem_resize_fixed(1);");
+ isnt($ret, 0, "shrinking a fixed-size structure fails");
+ like($stderr, qr/is not resizable/, "shrink error message mentions not resizable");
+}
+
$node->stop;
+
done_testing();
diff --git a/src/test/modules/test_shmem/test_shmem--1.0.sql b/src/test/modules/test_shmem/test_shmem--1.0.sql
index 2d01fd9256c..e169d0d7733 100644
--- a/src/test/modules/test_shmem/test_shmem--1.0.sql
+++ b/src/test/modules/test_shmem/test_shmem--1.0.sql
@@ -7,3 +7,7 @@
CREATE FUNCTION get_test_shmem_attach_count()
RETURNS pg_catalog.int4 STRICT
AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION test_shmem_resize_fixed(pg_catalog.int4)
+RETURNS pg_catalog.void STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
index 9bd4012b435..fc2fd67887f 100644
--- a/src/test/modules/test_shmem/test_shmem.c
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -99,3 +99,23 @@ get_test_shmem_attach_count(PG_FUNCTION_ARGS)
elog(ERROR, "shmem area not yet initialized");
PG_RETURN_INT32(TestShmem->attach_count);
}
+
+/*
+ * Attempt to resize the fixed-size shared memory structure. This should
+ * fail because the structure was not allocated with a maximum_size.
+ */
+PG_FUNCTION_INFO_V1(test_shmem_resize_fixed);
+Datum
+test_shmem_resize_fixed(PG_FUNCTION_ARGS)
+{
+#ifdef HAVE_RESIZABLE_SHMEM
+ int32 new_size = PG_GETARG_INT32(0);
+
+ ShmemResizeStruct("test_shmem area", new_size);
+ PG_RETURN_VOID();
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#endif
+}
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 81a73c426d2..b842c7b19cd 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1770,8 +1770,10 @@ pg_shadow| SELECT pg_authid.rolname AS usename,
pg_shmem_allocations| SELECT name,
off,
size,
- allocated_size
- FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size);
+ allocated_size,
+ maximum_size,
+ allocated_space
+ FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size, maximum_size, allocated_space);
pg_shmem_allocations_numa| SELECT name,
numa_node,
size
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 3c35097361d..81bf12cbc1a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3146,6 +3146,7 @@ TestDSMRegistryHashEntry
TestDSMRegistryStruct
TestDecodingData
TestDecodingTxnData
+TestResizableShmemStruct
TestShmemData
TestSpec
TestValueType
--
2.34.1
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 09:06 Matthias van de Meent <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Matthias van de Meent @ 2026-04-05 09:06 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sun, 5 Apr 2026, 07:59 Ashutosh Bapat, <[email protected]> wrote:
>
> On Sun, Apr 5, 2026 at 11:18 AM Ashutosh Bapat
> <[email protected]> wrote:
> >
> > I will post my resizable shmem structures patch in a separate email in
> > this thread but continue to review your patches.
> >
>
> Attached is your patchset (0001 - 0014) + resizable shared memory
> structures patchset 0015.
>
> Resizable shared memory structures
> ============================
>
> When allocating memory to the requested shared structures, we allocate
> space for each structure. In mmap'ed shared memory, the memory is
> allocated against those structures only when those structures are
> initialized.
> Resizable shared memory structures are simply allocated maximum space
> when that happens. The function which initializes the structure is
> expected to initialize only the memory worth its initial size. When
> resizing the structure, memory is freed or allocated against the
> reserved space depending upon the new size. This allows the structures
> to be resized while keeping their starting address stable, which is a
> hard requirement in PostgreSQL.
>
> Resizable shared memory feature depends upon the existence of function
> madvise() and constants MADV_REMOVE and MADV_POPULATE_WRITE.
>
> On the platforms which do not have these, we disable this feature at
> compile time. The commit introduces a compile time flag
> HAVE_RESIZABLE_SHMEM which is defined if MADV_REMOVE and
> MADV_POPULATE_WRITE exist. We don't check the existence of madvise
> separately, since the existence of the constants implies the existence
> of the function.
>
> HAVE_RESIZABLE_SHMEM is not defined in EXEC_BACKEND builds since
> that's largely used for Windows where the APIs to free and allocate
> memory from and to a given address space are not known to the author
> right now. Given that PostgreSQL is used widely on Linux, providing
> this feature on Linux benefits most of its users. Once we
> figure out the required Windows APIs, we will support this feature on
> Windows as well.
>
> The feature is also not available when Sys-V shared memory is used
> even on Linux since we do not know whether required Sys-V APIs exist;
> mostly they don't. Since that combination is only available for
> development and testing, not supporting the feature isn't going to
> impact PostgreSQL users much.
>
> Using HAVE_RESIZABLE_SHMEM we disable compiling the code related to
> resizable shared memory structures on the platforms which do not
> support the feature. But we also have run time checks to disable this
> feature when Sys-V shared memory is used. In order to know whether a
> given instance of a running server supports resizable structures, we
> have introduced GUC have_resizable_shmem.
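For readers following along, the reserve-then-resize idea quoted above can be sketched roughly as below. This is only an illustration and not the patch's code: demo_reserve() and demo_resize() are made-up names, the mapping is a plain shared anonymous mmap(), and MADV_POPULATE_WRITE is used only when the headers define it (otherwise pages are simply faulted in on first write).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE			/* for MAP_ANONYMOUS and madvise() */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve address space for the maximum size.  Pages
 * are only backed by memory once they are touched (or populated).
 */
void *
demo_reserve(size_t max_size)
{
	void	   *p = mmap(NULL, max_size, PROT_READ | PROT_WRITE,
						 MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	return (p == MAP_FAILED) ? NULL : p;
}

/*
 * Hypothetical helper: resize in place.  The start address never changes;
 * only whole pages past the new size are released or populated.  Returns 0
 * on success and -1 on failure with errno set.
 */
int
demo_resize(char *base, size_t old_size, size_t new_size, size_t max_size)
{
	size_t		page = (size_t) sysconf(_SC_PAGESIZE);
	size_t		old_end = (old_size + page - 1) / page * page;
	size_t		new_end = (new_size + page - 1) / page * page;

	if (old_size > max_size || new_size > max_size)
	{
		errno = EINVAL;
		return -1;
	}

	/* Shrinking: give the pages past the new end back to the kernel. */
	if (new_end < old_end)
		return madvise(base + new_end, old_end - new_end, MADV_REMOVE);

#ifdef MADV_POPULATE_WRITE
	/* Growing: allocate the new pages now instead of on first write. */
	if (new_end > old_end)
		return madvise(base + old_end, new_end - old_end,
					   MADV_POPULATE_WRITE);
#endif

	return 0;
}
```

Data in the part of the structure that stays in use is untouched by a shrink; only the pages past the new end are released and read back as zeroes afterwards.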
I'm not opposed to HAVE_RESIZABLE_SHMEM, but is it universal enough on
its platforms to make it part of the exposed ABI for Shmem? I think
that we should expose the same functions and structs, and just have
the shmem internals throw an error if the configuration used by the
user implies the user wants to update shmem sizing when the system
doesn't support it. That would avoid extensions having to recompile
between have/have not systems that have an otherwise compatible ABI;
especially when those extensions don't actually need the resizeable
part of the shmem system.
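To make that concrete, here is a minimal sketch of what I mean, with invented names (DemoShmemDesc, demo_shmem_resize, the can_resize flag) that are not in the patch. The resize entry point is always compiled in, and whether the running system can actually resize is decided at run time, so an extension built once links against the same symbols everywhere.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes; the real code would ereport() instead. */
typedef enum DemoResizeResult
{
	DEMO_RESIZE_OK,
	DEMO_RESIZE_NOT_RESIZABLE,	/* structure was requested with a fixed size */
	DEMO_RESIZE_NOT_SUPPORTED,	/* platform or shared_memory_type cannot resize */
	DEMO_RESIZE_OUT_OF_RANGE	/* new size exceeds the reserved maximum */
} DemoResizeResult;

/* Hypothetical descriptor; the layout is identical on every platform. */
typedef struct DemoShmemDesc
{
	size_t		size;
	size_t		maximum_size;	/* 0 means fixed-size, as in the patch */
} DemoShmemDesc;

/*
 * Always available.  "can_resize" stands in for whatever run-time check
 * the server performs (compile-time capability plus shared_memory_type).
 */
DemoResizeResult
demo_shmem_resize(DemoShmemDesc *desc, size_t new_size, bool can_resize)
{
	if (desc->maximum_size == 0)
		return DEMO_RESIZE_NOT_RESIZABLE;
	if (!can_resize)
		return DEMO_RESIZE_NOT_SUPPORTED;
	if (new_size > desc->maximum_size)
		return DEMO_RESIZE_OUT_OF_RANGE;

	desc->size = new_size;
	return DEMO_RESIZE_OK;
}
```

An extension that never asks for a resizable structure never reaches the unsupported branch, and one that does gets a clear error instead of a missing symbol.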
> Following points are up for discussion
> =============================
>
> 1. calculation of allocated_size of resizable structures
> --------------------------------
> The memory allocated to that structure is the
> {maximum size of the structure} - {total size of unallocated pages}. I
> think setting allocated_size to the actually allocated memory is more
> accurate than {current size of the structure} + {alignment} which does
> not reflect the actual memory allocated to the structure. I would like
> to know what others think.
I agree: For allocated_size, it should be the max size of the
structure (+alignment, if any), minus the total size of its
deallocated pages.
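As a worked example of that arithmetic, a rough sketch follows. demo_allocated_size() and its parameters are names I made up for illustration; it ignores type alignment and assumes only whole pages lying entirely between the end of the current size and the end of the reservation are deallocated.

```c
#include <stddef.h>

/* Round v up or down to a multiple of the page size. */
static size_t
demo_align_up(size_t v, size_t page)
{
	return (v + page - 1) / page * page;
}

static size_t
demo_align_down(size_t v, size_t page)
{
	return v / page * page;
}

/*
 * Hypothetical helper: memory actually allocated to a structure that starts
 * at byte offset "off" in the segment, currently uses cur_size bytes and has
 * max_size bytes of address space reserved.
 */
size_t
demo_allocated_size(size_t off, size_t cur_size, size_t max_size, size_t page)
{
	/* first page boundary at or after the end of the in-use part */
	size_t		first_free = demo_align_up(off + cur_size, page);

	/* last page boundary at or before the end of the reservation */
	size_t		last_free = demo_align_down(off + max_size, page);
	size_t		freed = (last_free > first_free) ? last_free - first_free : 0;

	return max_size - freed;
}
```

With 4096-byte pages, a structure at offset 1000 using 5000 of its 20000 reserved bytes has three whole pages (12288 bytes) deallocated, so allocated_size comes out as 7712. That is noticeably more than the 5000 bytes in use, because the partial pages at both ends stay allocated.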
Nit: I think "reserved"/"space_reserved" is a better descriptor than
"allocated_space", as "allocated_space" could reasonably imply the
memory isn't available to the OS.
> 2. maximum_size member in various structures and in pg_shmem_allocations view
> -----------------------------------------------------------------------------
> A resizable structure is requested by specifying non-zero maximum_size
> in ShmemStructOpts. It gets copied to the maximum_size member in
> ShmemStructDesc, ShmemIndexEnt. The question is for fixed-size
> structures what should be the value maximum_size in those structures.
> Setting it to the same value as the size member in the respective
> structure is logical since their maximum size is the same as their
> initial size.
Note that currently, your patch rejects the case where resizeable
structs are initialized at their maximum size:
> +++ b/src/backend/storage/ipc/shmem.c
> +#ifdef HAVE_RESIZABLE_SHMEM
> + if (options->maximum_size > 0 && options->size >= options->maximum_size)
> + elog(ERROR, "resizable shared memory structure \"%s\" should have maximum size (%zd) greater than size (%zd)",
> + options->name, options->maximum_size, options->size);
It'd need to check 'options->size > options->maximum_size' to allow
max-sized initialization to succeed here without erroring.
> But if we do so, we need another member in
> ShmemStructDesc and ShmemIndexEnt to indicate whether the structure is
> resizable or not. Instead the patches set maximum_size to 0 for
> fixed-size structures and non-zero for resizable structures. This way
> we can check whether a structure is resizable or not by checking
> whether its maximum_size is zero or not. pg_shmem_allocations view
> also has a maximum_size column which has the similar characteristics.
> I would like to know what others think.
I think that shmem allocations can set
.size for the initial size, and
.minimum_size/.maximum_size for configuring resizeability.
The latter fields can then be initialized with .size if they're 0.
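Something along these lines is what I have in mind. DemoShmemOpts and the two functions are hypothetical stand-ins, not ShmemStructOpts from the patch, and minimum_size is a field the patch does not have yet.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical options struct; 0 in min/max means "not specified". */
typedef struct DemoShmemOpts
{
	size_t		size;			/* initial size */
	size_t		minimum_size;
	size_t		maximum_size;
} DemoShmemOpts;

/*
 * Fill in unspecified bounds from .size and validate the result.  Note the
 * non-strict comparisons: a structure may start at its maximum (or minimum)
 * size.  Returns false if the bounds are inconsistent.
 */
bool
demo_normalize_opts(DemoShmemOpts *opts)
{
	if (opts->minimum_size == 0)
		opts->minimum_size = opts->size;
	if (opts->maximum_size == 0)
		opts->maximum_size = opts->size;

	return opts->minimum_size <= opts->size &&
		opts->size <= opts->maximum_size;
}

/* After normalization, a structure is resizable iff its bounds differ. */
bool
demo_is_resizable(const DemoShmemOpts *opts)
{
	return opts->minimum_size != opts->maximum_size;
}
```

With this, a fixed-size request ends up with minimum_size = size = maximum_size, so no separate "is resizable" flag and no zero sentinel are needed in the descriptors or in pg_shmem_allocations.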
> 3. allocated_space member in various structures and in pg_shmem_allocations view
> -------------------------------------------------------------------------------
> The patch adds a new member allocated_space to ShmemIndexEnt and
> pg_shmem_allocations view. allocated_space to maximum_size is what
> allocated_size is to size - it's the type aligned value of
> maximum_size. But it also highlights the difference between the
> address space allocation and the actual memory allocation. This
> difference is crucial to resizable structures. However, unlike
> maximum_size, we set it to a non-zero value, allocated_size, for
> fixed-size structures as well since they are allocated the same amount
> of space as their allocated_size. While this seems logically correct
> to me, some may find maximum_size to be zero but allocated_space to be
> non-zero for fixed-size structures a bit weird. I would like to know
> what others think.
I'd prefer to have consistent values; constant-sized structs are no
different from resizable structs whose min/max size equal their
current size. The only alternative that I think could be considered
correct is returning NULL for those, but zero is definitely wrong.
Note that returning min/max=size would also allow for better
aggregations on pg_shmem_allocations columns.
Note: if we expose minimum_size, we may also want to expose
min_allocated_size (i.e., the full reservation minus the size of
MADV_REMOVEd pages when the shmem allocation is min-sized).
> As a side question, do we want to allow users to specify minimum_size
> in ShmemStructOpts for resizable structures? Resizing memory lower
> than that would be prohibited. For fixed sized structures,
> minimum_size would be same as size and also maximum_size.
I think it would be useful, if only to inform users and developers
about this in e.g. pg_shmem_allocations.
> For now, it
> seems only for the sanity checks, but it could be seen as a useful
> safety feature. A difference in maximum_size and minimum_size would
> indicate that the structure is resizable.
I think that's the right approach.
> 4. to mprotect or not to mprotect
> ---------------------------------
> If memory beyond the current size of a resizable structure is
> accessed, it won't cause any segfault or bus error. When writing
> memory will be simply allocated and when reading, it will return
> zeroes if memory is not allocated yet. mprotect'ing the memory beyond
> the current size of a resizable structure to PROT_NONE can prevent
> accidental access to unallocated memory (sans page boundaries), but it
> needs to be done in every backend process which requires a
> synchronization mechanism beyond the scope of shmem.c. Hence the patch
> does not use mprotect.
It seems to me that the synchronization is a crucial component of
resizing; isn't it bad if shmem structs can suddenly without
synchronization contain zeroes?
> A subsystem will require some higher level
> synchronization mechanism between users of the structure and the
> process which resizes it. That synchronization mechanism can be used
> to mprotect the memory, if required. I have documented this, but I
> would like to know whether we should provide an API in shmem.c to
> mprotect.
I think we should; I think it would simplify and deduplicate external
code that needs to mark the pages PROT_NONE, and centralize OS page
calculations to within the shmem subsystem.
It'd also allow checks that validate that the pages marked with
PROT_NONE are 1) within a shmem allocation and 2) currently not in use
by that shmem allocation.
(Was there a point 5. for discussion? I can't find it)
(This is where I ran out of time for these questions, sorry I didn't
get to point 6)
Kind regards,
Matthias van de Meent
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 10:03 Heikki Linnakangas <[email protected]>
parent: Heikki Linnakangas <[email protected]>
0 siblings, 0 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-05 10:03 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 27/03/2026 02:51, Heikki Linnakangas wrote:
> On 25/03/2026 20:37, Robert Haas wrote:
>> On Sat, Mar 21, 2026 at 8:14 PM Heikki Linnakangas <[email protected]>
>> wrote:
>>> Shmem callbacks
>>> ---------------
>>>
>>> I separated the request/init/fn callbacks from the structs. There's now
>>> a concept of "shmem callbacks", which you register in _PG_init(). For
>>> example:
>>>
>>> static void pgss_shmem_request(void *arg);
>>> static void pgss_shmem_init(void *arg);
>>>
>>> static const ShmemCallbacks pgss_shmem_callbacks = {
>>> .request_fn = pgss_shmem_request,
>>> .init_fn = pgss_shmem_init,
>>> .attach_fn = NULL, /* no special attach actions needed */
>>> };
>>
>> What's the advantage of coupling the functions together this way, vs.
>> just registering each callback individually?
>
> One reason is to support allocations after postmaster startup. The
> RegisterShmemCallbacks() call ties together all the resources requested
> by the request_fn callback, with the init_fn or attach_fn callbacks
> that will later initialize/attach them. The init_fn/attach_fn callbacks
> are called only after *all* the resources requested by the request_fn
> callback have been initialized, and it holds a lock while doing all that.
>
> If the callbacks were registered separately, shmem.c wouldn't know when
> to call the init_fn/attach_fn. There's no problem during postmaster or
> backend startup, because we run all init_fn or attach_fn callbacks in
> the whole system, after requesting all the resources, but after startup,
> you must only call the callbacks related to the newly-requested resources.
>
> Aside from that after-startup allocation issue, though, IMO the
> ShmemCallbacks struct makes it more clear that the callbacks are meant
> to work together on the same resources.
>
> One way to think of this is that all the resources requested by the
> request_fn callback are implicitly part of the same "subsystem", and
> need to be initialized/attached to together. We discussed that before,
> and I still wonder if we should make that concept of a subsystem more
> explicit. If we just renamed ShmemCallbacks to ShmemSubsystem, and give
> each subsystem a name, it'd look like this:
>
> static void pgss_shmem_request(void *arg);
> static void pgss_shmem_init(void *arg);
>
> static const ShmemSubsystem pgss_shmem_subsystem = {
> .name = "pg_stat_statements",
> .request_fn = pgss_shmem_request,
> .init_fn = pgss_shmem_init,
> .attach_fn = NULL, /* no special attach actions needed */
> };
>
> static void
> pgss_shmem_request(void *arg)
> {
> ShmemRequestStruct(&pgssSharedStateDesc, &(ShmemRequestStructOpts) {
> /*
> * name is optional in this design, subsystem's name is used if
> * not given
> */
> .name = "pg_stat_statements",
> .size = sizeof(pgssSharedState),
> .ptr = (void **) &pgss,
> });
> }
>
> static void
> pgss_shmem_init(void *arg)
> {
> /* initialize contents of pgss */
> ...
> }
>
> void
> _PG_init(void)
> {
> RegisterShmemSubsystem(&pgss_shmem_subsystem);
> }
>
>
>
> Thinking how this might work without such a struct, registering the
> callbacks separately, here's an alternative design:
>
> static void pgss_shmem_request(void *arg);
> static void pgss_shmem_init(void *arg);
>
> static void
> pgss_shmem_request(void *arg)
> {
> ShmemRequestStruct(&pgssSharedStateDesc, &(ShmemRequestStructOpts) {
> .name = "pg_stat_statements",
> .size = sizeof(pgssSharedState),
> .ptr = (void **) &pgss,
> });
>
> ShmemRegisterInitCallback(&pgss_shmem_init);
> /* no attach callback needed, but for illustration: */
> ShmemRegisterAttachCallback(&pgss_shmem_attach);
> }
>
> static void
> pgss_shmem_init(void *arg)
> {
> /* initialize contents of pgss */
> ...
> }
>
> void
> _PG_init(void)
> {
> ShmemRegisterRequestCallback(&pgss_shmem_request);
> }
>
> In this design, the ShmemRegisterRequestCallback() call still ties
> together all the related resources. All the resources requested in the
> request-callback are initialized together, and the fact that the init/
> attach callbacks are registered within the request callback associates
> them with the resources. This feels a little Rube Goldbergian, with one
> callback registering more callbacks, but would also work.
Thinking about this some more, we could also just pass the callback
functions directly as arguments to the ShmemRegisterCallback() function,
without the ShmemCallbacks struct. If they're all passed in one call,
that still ties them together. The shmem.c implementation would probably
still need the ShmemCallbacks struct, but that would be a detail
internal to shmem.c. It would look like this:
static void pgss_shmem_request(void *arg);
static void pgss_shmem_init(void *arg);
static void
pgss_shmem_request(void *arg)
{
ShmemRequestStruct(
.name = "pg_stat_statements",
.size = sizeof(pgssSharedState),
.ptr = (void **) &pgss,
);
}
static void
pgss_shmem_init(void *arg)
{
/* initialize contents of pgss */
...
}
void
_PG_init(void)
{
RegisterShmemSubsystem(pgss_shmem_request,
pgss_shmem_init,
NULL, /* no attach fn needed */
0 /* flags */);
}
This is pretty much the same as what's in the latest patch version, but
a little less boilerplate as you don't need the ShmemCallbacks struct.
The struct would be useful if we needed to add lots of optional
settings in the future, but I don't think we do.
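To illustrate how a single registration call can still tie the callbacks together while keeping the struct internal, here is a small standalone mock. Everything in it (CallbackSet, register_shmem_callbacks, run_shmem_callbacks, the demo_* functions) is hypothetical and only models the idea; it is not the shmem.c implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*shmem_callback_fn) (void *arg);

/* Internal record; callers of the registration function never see it. */
typedef struct CallbackSet
{
	shmem_callback_fn request_fn;
	shmem_callback_fn init_fn;		/* may be NULL */
	shmem_callback_fn attach_fn;	/* may be NULL */
} CallbackSet;

#define MAX_CALLBACK_SETS 16
static CallbackSet callback_sets[MAX_CALLBACK_SETS];
static int	num_callback_sets = 0;

/* Register all callbacks of one subsystem in one call; returns its index. */
static int
register_shmem_callbacks(shmem_callback_fn request_fn,
						 shmem_callback_fn init_fn,
						 shmem_callback_fn attach_fn)
{
	assert(request_fn != NULL);
	assert(num_callback_sets < MAX_CALLBACK_SETS);

	callback_sets[num_callback_sets].request_fn = request_fn;
	callback_sets[num_callback_sets].init_fn = init_fn;
	callback_sets[num_callback_sets].attach_fn = attach_fn;
	return num_callback_sets++;
}

/*
 * Run the callbacks of one set only: request first, then init or attach.
 * This is what an after-startup allocation would need.
 */
static void
run_shmem_callbacks(int set, bool attaching)
{
	const CallbackSet *cs;
	shmem_callback_fn second;

	assert(set >= 0 && set < num_callback_sets);
	cs = &callback_sets[set];
	second = attaching ? cs->attach_fn : cs->init_fn;

	cs->request_fn(NULL);
	if (second != NULL)
		second(NULL);
}

/* Tiny demo subsystem that records the order in which it is called. */
static char demo_trace[8];
static int	demo_trace_len = 0;

static void
demo_request(void *arg)
{
	(void) arg;
	demo_trace[demo_trace_len++] = 'R';
}

static void
demo_init(void *arg)
{
	(void) arg;
	demo_trace[demo_trace_len++] = 'I';
}
```

Because the three function pointers arrive together, the registry knows which init or attach callback belongs to which request callback without any extra bookkeeping on the caller's side.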
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 11:20 Ashutosh Bapat <[email protected]>
parent: Matthias van de Meent <[email protected]>
0 siblings, 2 replies; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-05 11:20 UTC (permalink / raw)
To: Matthias van de Meent <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sun, Apr 5, 2026 at 2:36 PM Matthias van de Meent
<[email protected]> wrote:
>
> On Sun, 5 Apr 2026, 07:59 Ashutosh Bapat, <[email protected]> wrote:
> >
> > On Sun, Apr 5, 2026 at 11:18 AM Ashutosh Bapat
> > <[email protected]> wrote:
> > >
>
> I'm not opposed to HAVE_RESIZABLE_SHMEM, but is it universal enough on
> its platforms to make it part of the exposed ABI for Shmem? I think
> that we should expose the same functions and structs, and just have
> the shmem internals throw an error if the configuration used by the
> user implies the user wants to update shmem sizing when the system
> doesn't support it. That would avoid extensions having to recompile
> between have/have not systems that have an otherwise compatible ABI;
> especially when those extensions don't actually need the resizeable
> part of the shmem system.
>
I don't think I understand this fully. An extension may want to
support a structure in both modes - fixed as well as resizable -
depending upon whether the latter is supported. If the structure
always has maximum_size, the extension code needs to set it to 0 when
resizable shared structures are not supported and to the actual
maximum size when they are. Without a macro or some flag it can not do
that. The flag/macro then becomes part of the ABI for shmem. Am I
correct? Since extension binaries need to be built on different
platforms anyway, that would automatically take care of building with
or without HAVE_RESIZABLE_SHMEM. I feel it makes testing simpler since
the run time behaviour is fixed. Maybe I am missing something. Maybe a
code diff or some example platform might make it clearer for me.
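For discussion, one possible reading of the proposal quoted above is sketched below: the fields are always present and the check happens at run time. All names here (resizable_shmem_supported, RequestResult, request_struct) are hypothetical and not from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical: decided when the server is built or started, not when the
 * extension is compiled.
 */
static bool resizable_shmem_supported = false;

typedef enum RequestResult
{
	REQ_OK,
	REQ_RESIZE_UNSUPPORTED
} RequestResult;

/*
 * The same fields exist on every platform.  A request is rejected only if
 * it actually asks for resizing on a system that cannot provide it.
 */
static RequestResult
request_struct(size_t size, size_t maximum_size)
{
	bool		wants_resize = (maximum_size != 0 && maximum_size != size);

	if (wants_resize && !resizable_shmem_supported)
		return REQ_RESIZE_UNSUPPORTED;
	return REQ_OK;
}
```

Under this reading, an extension that never sets maximum_size behaves identically with or without resizing support, while one that wants both modes would still need some way to ask which mode is available.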
> > Following points are up for discussion
> > =============================
>
> Nit: I think "reserved"/"space_reserved" is a better descriptor than
> "allocated_space", as "allocated_space" could reasonably imply the
> memory isn't available to the OS.
Renamed it to reserved_space. The new name is also less easily
confused with allocated_size.
>
> Note that currently, your patch rejects the case where resizeable
> structs are initialized at their maximum size:
>
> > +++ b/src/backend/storage/ipc/shmem.c
>
> > +#ifdef HAVE_RESIZABLE_SHMEM
> > + if (options->maximum_size > 0 && options->size >= options->maximum_size)
> > + elog(ERROR, "resizable shared memory structure \"%s\" should have maximum size (%zd) greater than size (%zd)",
> > + options->name, options->maximum_size, options->size);
>
> It'd need to check 'options->size > options->maximum_size' to allow
> max-sized initialization to succeed here without erroring.
Good catch. Fixed in the attached patch.
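For reference, the corrected boundary condition can be written as a standalone predicate. This is a sketch only; the function name is made up, and it assumes the patch's convention that maximum_size == 0 denotes a fixed-size structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if the requested sizes are acceptable.  A resizable
 * structure may start out at exactly its maximum size.
 */
static bool
resizable_sizes_are_valid(size_t size, size_t maximum_size)
{
	if (maximum_size == 0)
		return true;			/* fixed-size structure, nothing to check */
	return size <= maximum_size;
}
```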
>
> > But if we do so, we need another member in
> > ShmemStructDesc and ShmemIndexEnt to indicate whether the structure is
> > resizable or not. Instead the patches set maximum_size to 0 for
> > fixed-size structures and non-zero for resizable structures. This way
> > we can check whether a structure is resizable or not by checking
> > whether its maximum_size is zero or not. pg_shmem_allocations view
> > also has a maximum_size column which has the similar characteristics.
> > I would like to know what others think.
>
> I think that shmem allocations can set
>
> .size for the initial size, and
> .minimum_size/.maximum_size for configuring resizeability;
>
> The latter fields can then be initialized with .size if they're 0.
>
>
> > 3. allocated_space member in various structures and in pg_shmem_allocations view
> > -------------------------------------------------------------------------------
> > The patch adds a new member allocated_space to ShmemIndexEnt and
> > pg_shmem_allocations view. allocated_space to maximum_size is what
> > allocated_size is to size - it's the type aligned value of
> > maximum_size. But it also highlights the difference between the
> > address space allocation and the actual memory allocation. This
> > difference is crucial to resizable structures. However, unlike
> > maximum_size, we set it to a non-zero value, allocated_size, for
> > fixed-size structures as well since they are allocated the same amount
> > of space as their allocated_size. While this seems logically correct
> > to me, some may find maximum_size to be zero but allocated_space to be
> > non-zero for fixed-size structures a bit weird. I would like to know
> > what others think.
>
> I'd prefer to have consistent values; constant-sized structs are no
> different from resizable structs whose min/max size equal their
> current size. The only alternative that I think could be considered
> correct is returning NULL for those, but zero is definitely wrong.
>
> Note that returning min/max=size would also allow for better
> aggregations on pg_shmem_allocations columns.
>
> Note: if we expose minimum_size, we may also want to expose
> min_allocated_size (i.e., the full reservation minus the size of
> MADV_REMOVEd pages when the shmem allocation is min-sized).
>
> > As a side question, do we want to allow users to specify minimum_size
> > in ShmemStructOpts for resizable structures? Resizing memory lower
> > than that would be prohibited. For fixed sized structures,
> > minimum_size would be same as size and also maximum_size.
>
> I think it would be useful, if only to inform users and developers
> about this in e.g. pg_shmem_allocations.
>
> > For now, it
> > seems only for the sanity checks, but it could be seen as a useful
> > safety feature. A difference in maximum_size and minimum_size would
> > indicate that the structure is resizable.
>
> I think that's the right approach.
I also think that introducing minimum_size is useful. Let's hear from
Heikki before implementing it, in case he has a different opinion. I
am not sure about min_allocated_space though - what use do you see for
it? reserved_space is useful in the pg_shmem_allocations() C function
itself and indicates the impact of the fully grown structure. What
would min_allocated_space give us? If anything, it would be
min_allocated_size, not space, since the reserved space never changes.
But even that I am not sure about.
>
> > 4. to mprotect or not to mprotect
> > ---------------------------------
> > If memory beyond the current size of a resizable structure is
> > accessed, it won't cause any segfault or bus error. When writing
> > memory will be simply allocated and when reading, it will return
> > zeroes if memory is not allocated yet. mprotect'ing the memory beyond
> > the current size of a resizable structure to PROT_NONE can prevent
> > accidental access to unallocated memory (sans page boundaries), but it
> > needs to be done in every backend process which requires a
> > synchronization mechanism beyond the scope of shmem.c. Hence the patch
> > does not use mprotect.
>
> It seems to me that the synchronization is a crucial component of
> resizing; isn't it bad if shmem structs can suddenly without
> synchronization contain zeroes?
>
> > A subsystem will require some higher level
> > synchronization mechanism between users of the structure and the
> > process which resizes it. That synchronization mechanism can be used
> > to mprotect the memory, if required. I have documented this, but I
> > would like to know whether we should provide an API in shmem.c to
> > mprotect.
>
> I think we should; I think it would simplify and deduplicate external
> code that needs to mark the pages PROT_NONE, and centralize OS page
> calculations to within the shmem subsystem.
> It'd also allow checks that validate that the pages marked with
> PROT_NONE are 1) within a shmem allocation and 2) currently not in use
> by that shmem allocation.
Reasonable. Let's wait for Heikki's opinion on this as well before
implementing it.
>
> (Was there a point 5. for discussion? I can't find it)
There is no point 5, just bad numbering.
>
> (This is where I ran out of time for these questions, sorry I didn't
> get to point 6)
CFBot did show some failures.
1. The Makefile didn't define PGXS; fixed it.
2. The Windows compiler didn't like an #ifdef in the middle of a
function-like macro's argument list (C5101 and C2059). Used a
conditional macro instead.
3. The test fails on one machine because RssShmem is consistently 8MB
higher than the allocated_size in all cases. I guess it is because of
the huge page setting. I have added huge_pages = off to the test
configuration. I think the test can not rely on huge pages anyway since
allocated_size isn't aligned to the huge page size.
--
Best Wishes,
Ashutosh Bapat
Attachments:
[text/x-patch] v20260405_2-0001-refactor-Move-ShmemInitHash-to-separate-fi.patch (11.1K, 2-v20260405_2-0001-refactor-Move-ShmemInitHash-to-separate-fi.patch)
From f08f4d2cc196a210dca1edd7195656c62481fea3 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 13:07:28 +0200
Subject: [PATCH v20260405 01/15] refactor: Move ShmemInitHash to separate file
In preparation for next commits
---
src/backend/storage/ipc/Makefile | 1 +
src/backend/storage/ipc/meson.build | 1 +
src/backend/storage/ipc/shmem.c | 108 ----------------------
src/backend/storage/ipc/shmem_hash.c | 130 +++++++++++++++++++++++++++
src/include/storage/shmem.h | 9 +-
5 files changed, 139 insertions(+), 110 deletions(-)
create mode 100644 src/backend/storage/ipc/shmem_hash.c
diff --git a/src/backend/storage/ipc/Makefile b/src/backend/storage/ipc/Makefile
index 9a07f6e1d92..f71653bbe48 100644
--- a/src/backend/storage/ipc/Makefile
+++ b/src/backend/storage/ipc/Makefile
@@ -22,6 +22,7 @@ OBJS = \
shm_mq.o \
shm_toc.o \
shmem.o \
+ shmem_hash.o \
signalfuncs.o \
sinval.o \
sinvaladt.o \
diff --git a/src/backend/storage/ipc/meson.build b/src/backend/storage/ipc/meson.build
index 9c1ca954d9d..b8c31e29967 100644
--- a/src/backend/storage/ipc/meson.build
+++ b/src/backend/storage/ipc/meson.build
@@ -14,6 +14,7 @@ backend_sources += files(
'shm_mq.c',
'shm_toc.c',
'shmem.c',
+ 'shmem_hash.c',
'signalfuncs.c',
'sinval.c',
'sinvaladt.c',
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 3cb51ad62f8..c994f7674ec 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -96,9 +96,6 @@ typedef struct ShmemAllocatorData
#define ShmemIndexLock (&ShmemAllocator->index_lock)
-static HTAB *shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
-static void *ShmemHashAlloc(Size size, void *alloc_arg);
static void *ShmemAllocRaw(Size size, Size *allocated_size);
/* shared memory global variables */
@@ -257,29 +254,6 @@ ShmemAllocNoError(Size size)
return ShmemAllocRaw(size, &allocated_size);
}
-/*
- * ShmemHashAlloc -- alloc callback for shared memory hash tables
- *
- * Carve out the allocation from a pre-allocated region. All shared memory
- * hash tables are initialized with HASH_FIXED_SIZE, so all the allocations
- * happen upfront during initialization and no locking is required.
- */
-static void *
-ShmemHashAlloc(Size size, void *alloc_arg)
-{
- shmem_hash_allocator *allocator = (shmem_hash_allocator *) alloc_arg;
- void *result;
-
- size = MAXALIGN(size);
-
- if (allocator->end - allocator->next < size)
- return NULL;
- result = allocator->next;
- allocator->next += size;
-
- return result;
-}
-
/*
* ShmemAllocRaw -- allocate align chunk and return allocated size
*
@@ -341,88 +315,6 @@ ShmemAddrIsValid(const void *addr)
return (addr >= ShmemBase) && (addr < ShmemEnd);
}
-/*
- * ShmemInitHash -- Create and initialize, or attach to, a
- * shared memory hash table.
- *
- * We assume caller is doing some kind of synchronization
- * so that two processes don't try to create/initialize the same
- * table at once. (In practice, all creations are done in the postmaster
- * process; child processes should always be attaching to existing tables.)
- *
- * nelems is the maximum number of hashtable entries.
- *
- * *infoP and hash_flags must specify at least the entry sizes and key
- * comparison semantics (see hash_create()). Flag bits and values specific
- * to shared-memory hash tables are added here, except that callers may
- * choose to specify HASH_PARTITION.
- *
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
- */
-HTAB *
-ShmemInitHash(const char *name, /* table string name for shmem index */
- int64 nelems, /* size of the table */
- HASHCTL *infoP, /* info about key and bucket size */
- int hash_flags) /* info about infoP */
-{
- bool found;
- size_t size;
- void *location;
-
- size = hash_estimate_size(nelems, infoP->entrysize);
-
- /* look it up in the shmem index or allocate */
- location = ShmemInitStruct(name, size, &found);
-
- return shmem_hash_create(location, size, found,
- name, nelems, infoP, hash_flags);
-}
-
-/*
- * Initialize or attach to a shared hash table in the given shmem region.
- *
- * This is extracted from ShmemInitHash() to allow InitShmemAllocator() to
- * share the logic for bootstrapping the ShmemIndex hash table.
- */
-static HTAB *
-shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
-{
- shmem_hash_allocator allocator;
-
- /*
- * Hash tables allocated in shared memory have a fixed directory and have
- * all elements allocated upfront. We don't support growing because we'd
- * need to grow the underlying shmem region with it.
- *
- * The shared memory allocator must be specified too.
- */
- infoP->alloc = ShmemHashAlloc;
- infoP->alloc_arg = NULL;
- hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_FIXED_SIZE;
-
- /*
- * if it already exists, attach to it rather than allocate and initialize
- * new space
- */
- if (!found)
- {
- allocator.next = (char *) location;
- allocator.end = (char *) location + size;
- infoP->alloc_arg = &allocator;
- }
- else
- {
- /* Pass location of hashtable header to hash_create */
- infoP->hctl = (HASHHDR *) location;
- hash_flags |= HASH_ATTACH;
- }
-
- return hash_create(name, nelems, infoP, hash_flags);
-}
-
/*
* ShmemInitStruct -- Create/attach to a structure in shared memory.
*
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
new file mode 100644
index 00000000000..0b05730129e
--- /dev/null
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -0,0 +1,130 @@
+/*-------------------------------------------------------------------------
+ *
+ * shmem_hash.c
+ * hash table implementation in shared memory
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * A shared memory hash table implementation on top of the named, fixed-size
+ * shared memory areas managed by shmem.c. Hash tables have a fixed maximum
+ * size, but their actual size can vary dynamically. When entries are added
+ * to the table, more space is allocated. Each shared data structure and hash
+ * has a string name to identify it.
+ *
+ * IDENTIFICATION
+ * src/backend/storage/ipc/shmem_hash.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "storage/shmem.h"
+
+static void *ShmemHashAlloc(Size size, void *alloc_arg);
+
+/*
+ * ShmemInitHash -- Create and initialize, or attach to, a
+ * shared memory hash table.
+ *
+ * We assume caller is doing some kind of synchronization
+ * so that two processes don't try to create/initialize the same
+ * table at once. (In practice, all creations are done in the postmaster
+ * process; child processes should always be attaching to existing tables.)
+ *
+ * nelems is the maximum number of hashtable entries.
+ *
+ * *infoP and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values specific
+ * to shared-memory hash tables are added here, except that callers may
+ * choose to specify HASH_PARTITION.
+ *
+ * Note: before Postgres 9.0, this function returned NULL for some failure
+ * cases. Now, it always throws error instead, so callers need not check
+ * for NULL.
+ */
+HTAB *
+ShmemInitHash(const char *name, /* table string name for shmem index */
+ int64 nelems, /* size of the table */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ bool found;
+ size_t size;
+ void *location;
+
+ size = hash_estimate_size(nelems, infoP->entrysize);
+
+ /* look it up in the shmem index or allocate */
+ location = ShmemInitStruct(name, size, &found);
+
+ return shmem_hash_create(location, size, found,
+ name, nelems, infoP, hash_flags);
+}
+
+/*
+ * Initialize or attach to a shared hash table in the given shmem region.
+ *
+ * This is extracted from ShmemInitHash() to allow InitShmemAllocator() to
+ * share the logic for bootstrapping the ShmemIndex hash table.
+ */
+HTAB *
+shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
+{
+ shmem_hash_allocator allocator;
+
+ /*
+ * Hash tables allocated in shared memory have a fixed directory and have
+ * all elements allocated upfront. We don't support growing because we'd
+ * need to grow the underlying shmem region with it.
+ *
+ * The shared memory allocator must be specified too.
+ */
+ infoP->alloc = ShmemHashAlloc;
+ infoP->alloc_arg = NULL;
+ hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_FIXED_SIZE;
+
+ /*
+ * if it already exists, attach to it rather than allocate and initialize
+ * new space
+ */
+ if (!found)
+ {
+ allocator.next = (char *) location;
+ allocator.end = (char *) location + size;
+ infoP->alloc_arg = &allocator;
+ }
+ else
+ {
+ /* Pass location of hashtable header to hash_create */
+ infoP->hctl = (HASHHDR *) location;
+ hash_flags |= HASH_ATTACH;
+ }
+
+ return hash_create(name, nelems, infoP, hash_flags);
+}
+
+/*
+ * ShmemHashAlloc -- alloc callback for shared memory hash tables
+ *
+ * Carve out the allocation from a pre-allocated region. All shared memory
+ * hash tables are initialized with HASH_FIXED_SIZE, so all the allocations
+ * happen upfront during initialization and no locking is required.
+ */
+static void *
+ShmemHashAlloc(Size size, void *alloc_arg)
+{
+ shmem_hash_allocator *allocator = (shmem_hash_allocator *) alloc_arg;
+ void *result;
+
+ size = MAXALIGN(size);
+
+ if (allocator->end - allocator->next < size)
+ return NULL;
+ result = allocator->next;
+ allocator->next += size;
+
+ return result;
+}
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index a2eb499d63c..82f5403c952 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -30,15 +30,20 @@ typedef struct PGShmemHeader PGShmemHeader; /* avoid including
extern void InitShmemAllocator(PGShmemHeader *seghdr);
extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
+extern void *ShmemHashAlloc(Size size, void *alloc_arg);
extern bool ShmemAddrIsValid(const void *addr);
-extern HTAB *ShmemInitHash(const char *name, int64 nelems,
- HASHCTL *infoP, int hash_flags);
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
+/* shmem_hash.c */
+extern HTAB *ShmemInitHash(const char *name, int64 nelems,
+ HASHCTL *infoP, int hash_flags);
+extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
+
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
base-commit: de28140ded8d4ba00faf905ec3530ffeb8a34a53
--
2.34.1
[text/x-patch] v20260405_2-0004-Convert-pg_stat_statements-to-use-the-new-.patch (11.3K, 3-v20260405_2-0004-Convert-pg_stat_statements-to-use-the-new-.patch)
From a82e6c6c64f19fe60a9aca00e243142d77ea974e Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:21:24 +0300
Subject: [PATCH v20260405 04/15] Convert pg_stat_statements to use the new
interface
As part of this, embed the LWLock it needs in the shared memory struct
itself, so that we don't need to use RequestNamedLWLockTranche()
anymore. LWLockNewTrancheId+LWLockInitialize is more convenient to use
in extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
.../pg_stat_statements/pg_stat_statements.c | 173 ++++++++----------
1 file changed, 77 insertions(+), 96 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 5494d41dca1..f078b4fe71b 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -249,7 +249,7 @@ typedef struct pgssEntry
*/
typedef struct pgssSharedState
{
- LWLock *lock; /* protects hashtable search/modification */
+ LWLockPadded lock; /* protects hashtable search/modification */
double cur_median_usage; /* current median usage in hashtable */
Size mean_query_len; /* current mean entry text length */
slock_t mutex; /* protects following fields only: */
@@ -259,14 +259,24 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+/* Links to shared memory state */
+static pgssSharedState *pgss;
+static HTAB *pgss_hash;
+
+static void pgss_shmem_request(void *arg);
+static void pgss_shmem_init(void *arg);
+
+static const ShmemCallbacks pgss_shmem_callbacks = {
+ .request_fn = pgss_shmem_request,
+ .init_fn = pgss_shmem_init,
+};
+
/*---- Local variables ----*/
/* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
static int nesting_level = 0;
/* Saved hook values */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
static planner_hook_type prev_planner_hook = NULL;
static ExecutorStart_hook_type prev_ExecutorStart = NULL;
@@ -275,10 +285,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static ProcessUtility_hook_type prev_ProcessUtility = NULL;
-/* Links to shared memory state */
-static pgssSharedState *pgss = NULL;
-static HTAB *pgss_hash = NULL;
-
/*---- GUC variables ----*/
typedef enum
@@ -331,8 +337,6 @@ PG_FUNCTION_INFO_V1(pg_stat_statements_1_13);
PG_FUNCTION_INFO_V1(pg_stat_statements);
PG_FUNCTION_INFO_V1(pg_stat_statements_info);
-static void pgss_shmem_request(void);
-static void pgss_shmem_startup(void);
static void pgss_shmem_shutdown(int code, Datum arg);
static void pgss_post_parse_analyze(ParseState *pstate, Query *query,
JumbleState *jstate);
@@ -366,7 +370,6 @@ static void pgss_store(const char *query, int64 queryId,
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
bool showtext);
-static Size pgss_memsize(void);
static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
int encoding, bool sticky);
static void entry_dealloc(void);
@@ -471,13 +474,14 @@ _PG_init(void)
MarkGUCPrefixReserved("pg_stat_statements");
+ /*
+ * Register our shared memory needs.
+ */
+ RegisterShmemCallbacks(&pgss_shmem_callbacks);
+
/*
* Install hooks.
*/
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = pgss_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = pgss_shmem_startup;
prev_post_parse_analyze_hook = post_parse_analyze_hook;
post_parse_analyze_hook = pgss_post_parse_analyze;
prev_planner_hook = planner_hook;
@@ -495,30 +499,42 @@ _PG_init(void)
}
/*
- * shmem_request hook: request additional shared resources. We'll allocate or
- * attach to the shared resources in pgss_shmem_startup().
+ * shmem request callback: Request shared memory resources.
+ *
+ * This is called at postmaster startup. Note that the shared memory isn't
+ * allocated here yet; this merely registers our needs.
+ *
+ * In EXEC_BACKEND mode, this is also called in each backend, to re-attach to
+ * the shared memory area that was already initialized.
*/
static void
-pgss_shmem_request(void)
+pgss_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- RequestAddinShmemSpace(pgss_memsize());
- RequestNamedLWLockTranche("pg_stat_statements", 1);
+ ShmemRequestHash(.name = "pg_stat_statements hash",
+ .nelems = pgss_max,
+ .hash_info.keysize = sizeof(pgssHashKey),
+ .hash_info.entrysize = sizeof(pgssEntry),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ .ptr = &pgss_hash,
+ );
+ ShmemRequestStruct(.name = "pg_stat_statements",
+ .size = sizeof(pgssSharedState),
+ .ptr = (void **) &pgss,
+ );
}
/*
- * shmem_startup hook: allocate or attach to shared memory,
- * then load any pre-existing statistics from file.
- * Also create and load the query-texts file, which is expected to exist
- * (even if empty) while the module is enabled.
+ * shmem init callback: Initialize our shared memory data structures at
+ * postmaster startup.
+ *
+ * Load any pre-existing statistics from file. Also create and load the
+ * query-texts file, which is expected to exist (even if empty) while the
+ * module is enabled.
*/
static void
-pgss_shmem_startup(void)
+pgss_shmem_init(void *arg)
{
- bool found;
- HASHCTL info;
+ int tranche_id;
FILE *file = NULL;
FILE *qfile = NULL;
uint32 header;
@@ -528,59 +544,38 @@ pgss_shmem_startup(void)
int buffer_size;
char *buffer = NULL;
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* reset in case this is a restart within the postmaster */
- pgss = NULL;
- pgss_hash = NULL;
-
/*
- * Create or attach to the shared memory state, including hash table
+ * We already checked that we're loaded from shared_preload_libraries in
+ * _PG_init(), so we should not get here after postmaster startup.
*/
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- pgss = ShmemInitStruct("pg_stat_statements",
- sizeof(pgssSharedState),
- &found);
-
- if (!found)
- {
- /* First time through ... */
- pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
- pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
- SpinLockInit(&pgss->mutex);
- pgss->extent = 0;
- pgss->n_writers = 0;
- pgss->gc_count = 0;
- pgss->stats.dealloc = 0;
- pgss->stats.stats_reset = GetCurrentTimestamp();
- }
-
- info.keysize = sizeof(pgssHashKey);
- info.entrysize = sizeof(pgssEntry);
- pgss_hash = ShmemInitHash("pg_stat_statements hash",
- pgss_max,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- LWLockRelease(AddinShmemInitLock);
+ Assert(!IsUnderPostmaster);
/*
- * If we're in the postmaster (or a standalone backend...), set up a shmem
- * exit hook to dump the statistics to disk.
+ * Initialize the shmem area with no statistics.
*/
- if (!IsUnderPostmaster)
- on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
+ tranche_id = LWLockNewTrancheId("pg_stat_statements");
+ LWLockInitialize(&pgss->lock.lock, tranche_id);
+ pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
+ pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ SpinLockInit(&pgss->mutex);
+ pgss->extent = 0;
+ pgss->n_writers = 0;
+ pgss->gc_count = 0;
+ pgss->stats.dealloc = 0;
+ pgss->stats.stats_reset = GetCurrentTimestamp();
+
+ /* The hash table must've also been initialized by now */
+ Assert(pgss_hash != NULL);
/*
- * Done if some other process already completed our initialization.
+ * Set up a shmem exit hook to dump the statistics to disk on postmaster
+ * (or standalone backend) exit.
*/
- if (found)
- return;
+ on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
/*
+ * Load any pre-existing statistics from file.
+ *
* Note: we don't bother with locks here, because there should be no other
* processes running when this code is reached.
*/
@@ -1339,7 +1334,7 @@ pgss_store(const char *query, int64 queryId,
key.toplevel = (nesting_level == 0);
/* Lookup the hash table entry with shared lock. */
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
@@ -1360,11 +1355,11 @@ pgss_store(const char *query, int64 queryId,
*/
if (jstate)
{
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
norm_query = generate_normalized_query(jstate, query,
query_location,
&query_len);
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
}
/* Append new query text to file with only shared lock held */
@@ -1379,8 +1374,8 @@ pgss_store(const char *query, int64 queryId,
do_gc = need_gc_qtexts();
/* Need exclusive lock to make a new hashtable entry - promote */
- LWLockRelease(pgss->lock);
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockRelease(&pgss->lock.lock);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
/*
* A garbage collection may have occurred while we weren't holding the
@@ -1519,7 +1514,7 @@ pgss_store(const char *query, int64 queryId,
}
done:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
/* We postpone this clean-up until we're out of the lock */
if (norm_query)
@@ -1808,7 +1803,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
* we need to partition the hash table to limit the time spent holding any
* one lock.
*/
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
if (showtext)
{
@@ -2046,7 +2041,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
}
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
if (qbuffer)
pfree(qbuffer);
@@ -2086,20 +2081,6 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
-/*
- * Estimate shared memory space needed.
- */
-static Size
-pgss_memsize(void)
-{
- Size size;
-
- size = MAXALIGN(sizeof(pgssSharedState));
- size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
-
- return size;
-}
-
/*
* Allocate a new hashtable entry.
* caller must hold an exclusive lock on pgss->lock
@@ -2730,7 +2711,7 @@ entry_reset(Oid userid, Oid dbid, int64 queryid, bool minmax_only)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("pg_stat_statements must be loaded via \"shared_preload_libraries\"")));
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
num_entries = hash_get_num_entries(pgss_hash);
stats_reset = GetCurrentTimestamp();
@@ -2824,7 +2805,7 @@ done:
record_gc_qtexts();
release_lock:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
return stats_reset;
}
--
2.34.1
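
For readers following the pg_stat_statements conversion above, the request-then-init flow can be modeled outside PostgreSQL roughly as follows. This is only a standalone sketch under stated assumptions, not the patch's implementation: every demo_*/Demo* name is hypothetical, calloc() stands in for the shared memory segment, and the real ShmemRequestStruct() takes named arguments rather than the positional ones used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for ShmemCallbacks: a request hook and an init hook. */
typedef struct DemoCallbacks
{
    void (*request_fn) (void *arg);
    void (*init_fn) (void *arg);
} DemoCallbacks;

/* One pending request: how many bytes, and where to store the address. */
typedef struct DemoRequest
{
    size_t size;
    void **ptr;
} DemoRequest;

#define DEMO_MAX 8

static const DemoCallbacks *demo_callbacks[DEMO_MAX];
static int demo_ncallbacks = 0;
static DemoRequest demo_requests[DEMO_MAX];
static int demo_nrequests = 0;

/* Model of RegisterShmemCallbacks(): remember the callbacks for later. */
static void
demo_register_callbacks(const DemoCallbacks *cb)
{
    assert(demo_ncallbacks < DEMO_MAX);
    demo_callbacks[demo_ncallbacks++] = cb;
}

/* Model of ShmemRequestStruct(): record the need, allocate nothing yet. */
static void
demo_request_struct(size_t size, void **ptr)
{
    assert(demo_nrequests < DEMO_MAX);
    demo_requests[demo_nrequests].size = size;
    demo_requests[demo_nrequests].ptr = ptr;
    demo_nrequests++;
}

/* Round a size up to the strictest fundamental alignment. */
static size_t
demo_align(size_t n)
{
    size_t a = _Alignof(max_align_t);

    return (n + a - 1) / a * a;
}

/*
 * Startup: collect requests, allocate one zeroed segment, hand out pointers,
 * then run the init callbacks.  The caller owns the returned segment.
 */
static void *
demo_startup(void)
{
    size_t total = 0;
    size_t offset = 0;
    char *segment;
    int i;

    for (i = 0; i < demo_ncallbacks; i++)
        demo_callbacks[i]->request_fn(NULL);

    for (i = 0; i < demo_nrequests; i++)
        total += demo_align(demo_requests[i].size);

    segment = calloc(1, total > 0 ? total : 1);
    assert(segment != NULL);

    for (i = 0; i < demo_nrequests; i++)
    {
        *demo_requests[i].ptr = segment + offset;
        offset += demo_align(demo_requests[i].size);
    }

    for (i = 0; i < demo_ncallbacks; i++)
        if (demo_callbacks[i]->init_fn != NULL)
            demo_callbacks[i]->init_fn(NULL);

    return segment;
}

/* A sample "subsystem" built on the model. */
typedef struct DemoState
{
    int counter;
} DemoState;

static DemoState *demo_state;

static void
demo_state_request(void *arg)
{
    (void) arg;
    demo_request_struct(sizeof(DemoState), (void **) &demo_state);
}

static void
demo_state_init(void *arg)
{
    (void) arg;
    assert(demo_state != NULL); /* pointer was filled in before init ran */
    demo_state->counter = 42;
}

static const DemoCallbacks demo_state_callbacks = {
    .request_fn = demo_state_request,
    .init_fn = demo_state_init,
};
```

A caller would invoke demo_register_callbacks(&demo_state_callbacks) and then demo_startup(), after which demo_state points into the zeroed segment and the init callback has run. The point of the shape is the same as in the patch: the size is stated once, at request time, and the init callback no longer needs a "found" flag or AddinShmemInitLock.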
[text/x-patch] v20260405_2-0003-Add-test-module-to-test-after-startup-shme.patch (10.1K, 4-v20260405_2-0003-Add-test-module-to-test-after-startup-shme.patch)
From ab74d146fd33f858d7e61967dc038d476f5af946 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:10:31 +0300
Subject: [PATCH v20260405 03/15] Add test module to test after-startup shmem
allocations
None of the existing modules could make use of the lazy shmem
allocation after postmaster startup:
- pg_stat_statements needs to load and dump stats file on startup and
shutdown, which doesn't really work if the library is not loaded into
postmaster
- test_aio registers injection points, which reference the library
itself, which creates a weird initialization loop if you try to do
that directly from _PG_init() in a backend. The initialization
really needs to happen after _PG_init()
- injection_points would be a candidate, but it already knows to use
DSM when it's not loaded from shared_preload_libraries.
---
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_shmem/Makefile | 24 +++++
src/test/modules/test_shmem/meson.build | 33 ++++++
.../test_shmem/t/001_late_shmem_alloc.pl | 49 +++++++++
.../modules/test_shmem/test_shmem--1.0.sql | 9 ++
src/test/modules/test_shmem/test_shmem.c | 101 ++++++++++++++++++
.../modules/test_shmem/test_shmem.control | 3 +
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 222 insertions(+)
create mode 100644 src/test/modules/test_shmem/Makefile
create mode 100644 src/test/modules/test_shmem/meson.build
create mode 100644 src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
create mode 100644 src/test/modules/test_shmem/test_shmem--1.0.sql
create mode 100644 src/test/modules/test_shmem/test_shmem.c
create mode 100644 src/test/modules/test_shmem/test_shmem.control
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 864b407abcf..f1b04c99969 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -48,6 +48,7 @@ SUBDIRS = \
test_resowner \
test_rls_hooks \
test_saslprep \
+ test_shmem \
test_shm_mq \
test_slru \
test_tidstore \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index e5acacd5083..fc99552d9ab 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -49,6 +49,7 @@ subdir('test_regex')
subdir('test_resowner')
subdir('test_rls_hooks')
subdir('test_saslprep')
+subdir('test_shmem')
subdir('test_shm_mq')
subdir('test_slru')
subdir('test_tidstore')
diff --git a/src/test/modules/test_shmem/Makefile b/src/test/modules/test_shmem/Makefile
new file mode 100644
index 00000000000..2407f7462fe
--- /dev/null
+++ b/src/test/modules/test_shmem/Makefile
@@ -0,0 +1,24 @@
+# src/test/modules/test_shmem/Makefile
+
+PGFILEDESC = "test_shmem - test code for shmem allocations"
+
+MODULE_big = test_shmem
+OBJS = \
+ $(WIN32RES) \
+ test_shmem.o
+
+EXTENSION = test_shmem
+DATA = test_shmem--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_shmem/meson.build b/src/test/modules/test_shmem/meson.build
new file mode 100644
index 00000000000..fb4bf328b8f
--- /dev/null
+++ b/src/test/modules/test_shmem/meson.build
@@ -0,0 +1,33 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+test_shmem_sources = files(
+ 'test_shmem.c',
+)
+
+if host_system == 'windows'
+ test_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_shmem',
+ '--FILEDESC', 'test_shmem - test code for shmem allocations',])
+endif
+
+test_shmem = shared_module('test_shmem',
+ test_shmem_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_shmem
+
+test_install_data += files(
+ 'test_shmem.control',
+ 'test_shmem--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_shmem',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_late_shmem_alloc.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
new file mode 100644
index 00000000000..c154f57682a
--- /dev/null
+++ b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
@@ -0,0 +1,49 @@
+# Copyright (c) 2025-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+###
+# Test allocating memory after startup, i.e. when the library is not
+# in shared_preload_libraries
+###
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+
+$node->safe_psql("postgres", "CREATE EXTENSION test_shmem;");
+
+# Check that the attach counter is incremented on a new connection
+my $attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+my $attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+cmp_ok($attach_count2, '>', $attach_count1, "attach callback is called in each backend");
+$node->stop;
+
+###
+# Test that loading via shared_preload_libraries also works
+###
+$node->append_conf('postgresql.conf', "shared_preload_libraries = 'test_shmem'");
+$node->start;
+
+# When loaded via shared_preload_libraries, the attach callback is
+# called or not, depending on whether this is an EXEC_BACKEND build.
+my $exec_backend = $node->safe_psql("postgres", "SHOW debug_exec_backend;") eq 'on';
+$attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+$attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+
+if ($exec_backend)
+{
+ cmp_ok($attach_count2, '>', $attach_count1, "attach callback is called in each backend when loaded via shared_preload_libraries");
+}
+else
+{
+ ok($attach_count1 == 0 && $attach_count2 == 0, "attach callback is not called when loaded via shared_preload_libraries");
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_shmem/test_shmem--1.0.sql b/src/test/modules/test_shmem/test_shmem--1.0.sql
new file mode 100644
index 00000000000..2d01fd9256c
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem--1.0.sql
@@ -0,0 +1,9 @@
+/* src/test/modules/test_shmem/test_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_shmem" to load this file. \quit
+
+
+CREATE FUNCTION get_test_shmem_attach_count()
+RETURNS pg_catalog.int4 STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
new file mode 100644
index 00000000000..9bd4012b435
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -0,0 +1,101 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_shmem.c
+ * Helpers to test shmem allocation routines
+ *
+ * Test basic memory allocation in an extension module. One notable feature
+ * that is not exercised by any other module in the repository is
+ * allocating (non-DSM) shared memory after postmaster startup.
+ *
+ * Copyright (c) 2020-2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_shmem/test_shmem.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+
+
+PG_MODULE_MAGIC;
+
+typedef struct TestShmemData
+{
+ int value;
+ bool initialized;
+ int attach_count;
+} TestShmemData;
+
+static TestShmemData *TestShmem;
+
+static bool attached_or_initialized = false;
+
+static void test_shmem_request(void *arg);
+static void test_shmem_init(void *arg);
+static void test_shmem_attach(void *arg);
+
+static const ShmemCallbacks TestShmemCallbacks = {
+ .flags = SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP,
+ .request_fn = test_shmem_request,
+ .init_fn = test_shmem_init,
+ .attach_fn = test_shmem_attach,
+};
+
+static void
+test_shmem_request(void *arg)
+{
+ elog(LOG, "test_shmem_request callback called");
+
+ ShmemRequestStruct(.name = "test_shmem area",
+ .size = sizeof(TestShmemData),
+ .ptr = (void **) &TestShmem);
+}
+
+static void
+test_shmem_init(void *arg)
+{
+ elog(LOG, "init callback called");
+ if (TestShmem->initialized)
+ elog(ERROR, "shmem area already initialized");
+ TestShmem->initialized = true;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+static void
+test_shmem_attach(void *arg)
+{
+ elog(LOG, "test_shmem_attach callback called");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ TestShmem->attach_count++;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+void
+_PG_init(void)
+{
+ elog(LOG, "test_shmem module's _PG_init called");
+ RegisterShmemCallbacks(&TestShmemCallbacks);
+}
+
+PG_FUNCTION_INFO_V1(get_test_shmem_attach_count);
+Datum
+get_test_shmem_attach_count(PG_FUNCTION_ARGS)
+{
+ if (!attached_or_initialized)
+ elog(ERROR, "shmem area not attached or initialized in this process");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ PG_RETURN_INT32(TestShmem->attach_count);
+}
diff --git a/src/test/modules/test_shmem/test_shmem.control b/src/test/modules/test_shmem/test_shmem.control
new file mode 100644
index 00000000000..f2f26f4537a
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.control
@@ -0,0 +1,3 @@
+comment = 'Test code for shmem allocations'
+default_version = '1.0'
+module_pathname = '$libdir/test_shmem'
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b84167741fb..63c0b3a9465 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3146,6 +3146,7 @@ TestDSMRegistryHashEntry
TestDSMRegistryStruct
TestDecodingData
TestDecodingTxnData
+TestShmemData
TestSpec
TestValueType
TextFreq
--
2.34.1
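
The expectations encoded in 001_late_shmem_alloc.pl above can be restated as a small standalone model. This is a sketch inferred from the test's comments and the test module's error checks, not from shmem.c itself; the demo_* names are hypothetical, and "backend" here is just a function call rather than a real process.

```c
#include <assert.h>
#include <stdbool.h>

/* Shared state, mirroring the fields the test module inspects. */
typedef struct DemoShared
{
    bool initialized;
    int attach_count;
} DemoShared;

/* Model of init_fn: must run exactly once per shared area. */
static void
demo_init(DemoShared *shm)
{
    assert(!shm->initialized);
    shm->initialized = true;
}

/* Model of attach_fn: runs in a process that finds the area already set up. */
static void
demo_attach(DemoShared *shm)
{
    assert(shm->initialized);
    shm->attach_count++;
}

/* Model of postmaster startup: init runs here only if the library is preloaded. */
static void
demo_postmaster_start(DemoShared *shm, bool preloaded)
{
    shm->initialized = false;
    shm->attach_count = 0;
    if (preloaded)
        demo_init(shm);
}

/* Model of one new backend using the module; returns the attach count it sees. */
static int
demo_backend(DemoShared *shm, bool preloaded, bool exec_backend)
{
    if (!preloaded)
    {
        /* late load: the first backend initializes, later ones attach */
        if (!shm->initialized)
            demo_init(shm);
        else
            demo_attach(shm);
    }
    else if (exec_backend)
    {
        /* nothing is inherited across exec, so each backend re-attaches */
        demo_attach(shm);
    }
    /* else: forked from the postmaster, state is inherited, no callback runs */

    return shm->attach_count;
}
```

Calling demo_backend() twice after demo_postmaster_start(&shm, false) yields an increasing count, matching the first cmp_ok() in the TAP test; with preloaded = true the count stays at zero unless exec_backend is set. Note that the increment is unsynchronized, as in the test module, which is only safe because the test opens its connections one after another.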
[text/x-patch] v20260405_2-0005-Introduce-registry-of-built-in-subsystems.patch (7.3K, 5-v20260405_2-0005-Introduce-registry-of-built-in-subsystems.patch)
From 293914fb41161bac855afdb0074334eac5cbdace Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 18:21:02 +0300
Subject: [PATCH v20260405 05/15] Introduce registry of built-in subsystems
To add a new built-in subsystem, add it to subsystemlist.h. That
hooks up its callbacks so that they get called at the right times
during postmaster startup. For now this is unused, but it will replace
the current SubsystemShmemSize() and SubsystemShmemInit() calls in
the next commits.
---
src/backend/bootstrap/bootstrap.c | 2 ++
src/backend/postmaster/launch_backend.c | 2 ++
src/backend/postmaster/postmaster.c | 5 +++++
src/backend/storage/ipc/ipci.c | 21 +++++++++++++++++
src/backend/tcop/postgres.c | 3 +++
src/include/storage/ipc.h | 1 +
src/include/storage/subsystemlist.h | 23 +++++++++++++++++++
src/include/storage/subsystems.h | 30 +++++++++++++++++++++++++
src/tools/pginclude/headerscheck | 1 +
9 files changed, 88 insertions(+)
create mode 100644 src/include/storage/subsystemlist.h
create mode 100644 src/include/storage/subsystems.h
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index c707ccfa563..49d88a1b6dd 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -363,6 +363,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
SetProcessingMode(BootstrapProcessing);
IgnoreSystemIndexes = true;
+ RegisterBuiltinShmemCallbacks();
+
InitializeMaxBackends();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 0973010b7dc..ed0f4f2d234 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -664,6 +664,8 @@ SubPostmasterMain(int argc, char *argv[])
*/
LocalProcessControlFile(false);
+ RegisterBuiltinShmemCallbacks();
+
/*
* Reload any libraries that were preloaded by the postmaster. Since we
* exec'd this process, those libraries didn't come along with us; but we
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 693475014fe..b2010bce186 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -922,6 +922,11 @@ PostmasterMain(int argc, char *argv[])
*/
ApplyLauncherRegister();
+ /*
+ * Register the shared memory needs of all core subsystems.
+ */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 24422a80ab3..e4a6a52f12d 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -52,6 +52,7 @@
#include "storage/procsignal.h"
#include "storage/shmem_internal.h"
#include "storage/sinvaladt.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
#include "utils/wait_event.h"
@@ -252,6 +253,26 @@ CreateSharedMemoryAndSemaphores(void)
shmem_startup_hook();
}
+/*
+ * Early initialization of various subsystems, giving them a chance to
+ * register their shared memory needs before the shared memory segment is
+ * allocated.
+ */
+void
+RegisterBuiltinShmemCallbacks(void)
+{
+ /*
+ * Call RegisterShmemCallbacks(...) on each subsystem listed in
+ * subsystemlist.h
+ */
+#define PG_SHMEM_SUBSYSTEM(subsystem_callbacks) \
+ RegisterShmemCallbacks(&(subsystem_callbacks));
+
+#include "storage/subsystemlist.h"
+
+#undef PG_SHMEM_SUBSYSTEM
+}
+
/*
* Initialize various subsystems, setting up their data structures in
* shared memory.
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 93851269e43..6a9ff3ad225 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4138,6 +4138,9 @@ PostgresSingleUserMain(int argc, char *argv[],
/* read control file (error checking and contains config ) */
LocalProcessControlFile(false);
+ /* Register the shared memory needs of all core subsystems. */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index da32787ab51..b205b00e7a1 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
/* ipci.c */
extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
+extern void RegisterBuiltinShmemCallbacks(void);
extern Size CalculateShmemSize(void);
extern void CreateSharedMemoryAndSemaphores(void);
#ifdef EXEC_BACKEND
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
new file mode 100644
index 00000000000..ed43c90bcc3
--- /dev/null
+++ b/src/include/storage/subsystemlist.h
@@ -0,0 +1,23 @@
+/*---------------------------------------------------------------------------
+ * subsystemlist.h
+ *
+ * List of initialization callbacks of built-in subsystems. This is kept in
+ * its own source file for possible use by automatic tools.
+ * PG_SHMEM_SUBSYSTEM is defined in the callers depending on how the list is
+ * used.
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystemlist.h
+ *---------------------------------------------------------------------------
+ */
+
+/* there is deliberately not an #ifndef SUBSYSTEMLIST_H here */
+
+/*
+ * Note: there are some inter-dependencies between these, so the order of some
+ * of these matters.
+ */
+
+/* TODO: empty for now */
diff --git a/src/include/storage/subsystems.h b/src/include/storage/subsystems.h
new file mode 100644
index 00000000000..38b735bec67
--- /dev/null
+++ b/src/include/storage/subsystems.h
@@ -0,0 +1,30 @@
+/*-------------------------------------------------------------------------
+ *
+ * subsystems.h
+ * Provide extern declarations for all the built-in subsystem callbacks
+ *
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystems.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SUBSYSTEMS_H
+#define SUBSYSTEMS_H
+
+#include "storage/shmem.h"
+
+/*
+ * Extern declarations of all the built-in subsystem callbacks
+ *
+ * The actual list is in subsystemlist.h, so that the same list can be used
+ * for other purposes.
+ */
+#define PG_SHMEM_SUBSYSTEM(callbacks) \
+ extern const ShmemCallbacks callbacks;
+#include "storage/subsystemlist.h"
+#undef PG_SHMEM_SUBSYSTEM
+
+#endif /* SUBSYSTEMS_H */
diff --git a/src/tools/pginclude/headerscheck b/src/tools/pginclude/headerscheck
index 14c466cc237..24f7416185e 100755
--- a/src/tools/pginclude/headerscheck
+++ b/src/tools/pginclude/headerscheck
@@ -131,6 +131,7 @@ do
test "$f" = src/include/postmaster/proctypelist.h && continue
test "$f" = src/include/regex/regerrs.h && continue
test "$f" = src/include/storage/lwlocklist.h && continue
+ test "$f" = src/include/storage/subsystemlist.h && continue
test "$f" = src/include/tcop/cmdtaglist.h && continue
test "$f" = src/interfaces/ecpg/preproc/c_kwlist.h && continue
test "$f" = src/interfaces/ecpg/preproc/ecpg_kwlist.h && continue
--
2.34.1
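
The list-header technique used by subsystemlist.h and subsystems.h in the patch above is the usual "X-macro" idiom: one list, expanded several times under different definitions of the same macro. Below is a self-contained sketch of that idiom. It is an illustration only: the Demo* names and the two example subsystems are hypothetical, and a list macro is used in place of a re-included header so that the example fits in a single file.

```c
#include <assert.h>

/* Hypothetical, much-reduced stand-in for ShmemCallbacks. */
typedef struct DemoCallbacks
{
    const char *name;
} DemoCallbacks;

/*
 * Stand-in for the contents of subsystemlist.h.  The patch keeps the list in
 * its own header and re-includes it; a list macro has the same effect here.
 */
#define DEMO_SUBSYSTEM_LIST \
    DEMO_SUBSYSTEM(DemoBufferCallbacks) \
    DEMO_SUBSYSTEM(DemoLockCallbacks)

/* First expansion: extern declarations, as subsystems.h does. */
#define DEMO_SUBSYSTEM(callbacks) \
    extern const DemoCallbacks callbacks;
DEMO_SUBSYSTEM_LIST
#undef DEMO_SUBSYSTEM

/* Each subsystem defines its own callbacks struct in its own source file. */
const DemoCallbacks DemoBufferCallbacks = {.name = "buffers"};
const DemoCallbacks DemoLockCallbacks = {.name = "locks"};

#define DEMO_MAX 8

static const DemoCallbacks *demo_registered[DEMO_MAX];
static int demo_nregistered = 0;

/* Model of RegisterShmemCallbacks(): append to a registry. */
static void
demo_register(const DemoCallbacks *cb)
{
    assert(demo_nregistered < DEMO_MAX);
    demo_registered[demo_nregistered++] = cb;
}

/* Second expansion: one registration call per entry, in list order. */
static void
demo_register_builtin(void)
{
#define DEMO_SUBSYSTEM(callbacks) \
    demo_register(&(callbacks));
    DEMO_SUBSYSTEM_LIST
#undef DEMO_SUBSYSTEM
}
```

After demo_register_builtin() runs, the registry holds the two entries in the order they appear in the list, which is why the header comment in subsystemlist.h warns that ordering can matter. Adding a subsystem means touching the list in exactly one place; the declarations and the registration calls follow automatically.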
[text/x-patch] v20260405_2-0002-Introduce-a-new-mechanism-for-registering-.patch (62.3K, 6-v20260405_2-0002-Introduce-a-new-mechanism-for-registering-.patch)
From 70d13ccc6942d5f94082b4b581ed7418a809ba80 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Wed, 1 Apr 2026 20:01:39 +0300
Subject: [PATCH v20260405 02/15] Introduce a new mechanism for registering
shared memory areas
This merges the separate [Subsystem]ShmemSize() and
[Subsystem]ShmemInit() phases at postmaster startup. Each subsystem is
now called into just once, before the shared memory segment has been
allocated, to register or "request" the subsystem's shared memory
needs. This is more ergonomic, as you only need to calculate the size
once.
This replaces ShmemInitStruct() and ShmemInitHash(), which become just
backwards-compatibility wrappers around the new functions. In future
commits, I plan to replace all ShmemInitStruct() and ShmemInitHash()
calls with the new functions, although we'll still need to keep them
around for extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 4 +-
doc/src/sgml/xfunc.sgml | 162 +++--
src/backend/bootstrap/bootstrap.c | 2 +
src/backend/postmaster/launch_backend.c | 5 +
src/backend/postmaster/postmaster.c | 19 +-
src/backend/storage/ipc/ipci.c | 30 +-
src/backend/storage/ipc/shmem.c | 832 ++++++++++++++++++++----
src/backend/storage/ipc/shmem_hash.c | 86 ++-
src/backend/storage/lmgr/proc.c | 3 +
src/backend/tcop/postgres.c | 10 +-
src/include/storage/shmem.h | 183 +++++-
src/include/storage/shmem_internal.h | 52 ++
src/tools/pgindent/typedefs.list | 9 +-
13 files changed, 1190 insertions(+), 207 deletions(-)
create mode 100644 src/include/storage/shmem_internal.h
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9ee1a2bfc6a..2ebec6928d5 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
Anonymous allocations are allocations that have been made
with <literal>ShmemAlloc()</literal> directly, rather than via
- <literal>ShmemInitStruct()</literal> or
- <literal>ShmemInitHash()</literal>.
+ <literal>ShmemRequestStruct()</literal> or
+ <literal>ShmemRequestHash()</literal>.
</para>
<para>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 70e815b8a2c..aed3f2f0071 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3628,71 +3628,132 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
Add-ins can reserve shared memory on server startup. To do so, the
add-in's shared library must be preloaded by specifying it in
<xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
- The shared library should also register a
- <literal>shmem_request_hook</literal> in its
- <function>_PG_init</function> function. This
- <literal>shmem_request_hook</literal> can reserve shared memory by
- calling:
+ The shared library should register callbacks in
+ its <function>_PG_init</function> function, which then get called at the
+ right stages of the system startup to initialize the shared memory.
+ Here is an example:
<programlisting>
-void RequestAddinShmemSpace(Size size)
-</programlisting>
- Each backend should obtain a pointer to the reserved shared memory by
- calling:
-<programlisting>
-void *ShmemInitStruct(const char *name, Size size, bool *foundPtr)
-</programlisting>
- If this function sets <literal>foundPtr</literal> to
- <literal>false</literal>, the caller should proceed to initialize the
- contents of the reserved shared memory. If <literal>foundPtr</literal>
- is set to <literal>true</literal>, the shared memory was already
- initialized by another backend, and the caller need not initialize
- further.
- </para>
+typedef struct MyShmemData {
+ LWLock lock; /* protects the fields below */
- <para>
- To avoid race conditions, each backend should use the LWLock
- <function>AddinShmemInitLock</function> when initializing its allocation
- of shared memory, as shown here:
-<programlisting>
-static mystruct *ptr = NULL;
-bool found;
+ ... shared memory contents ...
+} MyShmemData;
+
+static MyShmemData *MyShmem; /* pointer to the struct in shared memory */
+
+static void my_shmem_request(void *arg);
+static void my_shmem_init(void *arg);
+
+const ShmemCallbacks my_shmem_callbacks = {
+ .request_fn = my_shmem_request,
+ .init_fn = my_shmem_init,
+};
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+ /*
+ * In order to create our shared memory area, we have to be loaded via
+ * shared_preload_libraries.
+ */
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ /* Register our shared memory needs */
+ RegisterShmemCallbacks(&my_shmem_callbacks);
+}
+
+/* callback to request the shared memory area */
+static void
+my_shmem_request(void *arg)
+{
+ /* A persistent handle to the shared memory area in this backend */
+ static ShmemStructDesc MyShmemDesc;
+
+ ShmemRequestStruct(&MyShmemDesc,
+ .name = "My shmem area",
+ .size = sizeof(MyShmemData),
+ .ptr = (void **) &MyShmem,
+ );
+}
-LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-ptr = ShmemInitStruct("my struct name", size, &found);
-if (!found)
+/* callback to initialize the contents of the MyShmem area at startup */
+static void
+my_shmem_init(void *arg)
{
- ... initialize contents of shared memory ...
- ptr->locks = GetNamedLWLockTranche("my tranche name");
+ int tranche_id;
+
+ /* Initialize the lock */
+ tranche_id = LWLockNewTrancheId("my tranche name");
+ LWLockInitialize(&MyShmem->lock, tranche_id);
+
+ ... initialize the rest of MyShmem fields ...
}
-LWLockRelease(AddinShmemInitLock);
+
</programlisting>
- <literal>shmem_startup_hook</literal> provides a convenient place for the
- initialization code, but it is not strictly required that all such code
- be placed in this hook. On Windows (and anywhere else where
- <literal>EXEC_BACKEND</literal> is defined), each backend executes the
- registered <literal>shmem_startup_hook</literal> shortly after it
- attaches to shared memory, so add-ins should still acquire
- <function>AddinShmemInitLock</function> within this hook, as shown in the
- example above. On other platforms, only the postmaster process executes
- the <literal>shmem_startup_hook</literal>, and each backend automatically
- inherits the pointers to shared memory.
+ The <function>request_fn</function> callback is called during system
+ startup, before the shared memory has been allocated. It should call
+ <function>ShmemRequestStruct()</function> to register the add-in's
+ shared memory needs. Note that <function>ShmemRequestStruct()</function>
+ doesn't immediately allocate or initialize the memory; it merely
+ registers the space to be allocated later in the startup sequence. When
+ the memory is allocated, it is initialized to zero. For any more
+ complex initialization, set the <function>init_fn()</function> callback,
+ which will be called after the memory has been allocated and initialized
+ to zero, but before any other processes are running, and thus no locking
+ is required.
</para>
-
<para>
- An example of a <literal>shmem_request_hook</literal> and
- <literal>shmem_startup_hook</literal> can be found in
+ On Windows, the <function>attach_fn</function> callback, if any, is
+ additionally called at every backend startup. It can be used to
+ initialize additional per-backend state related to the shared memory
+ area that is inherited via <function>fork()</function> on other systems.
+ </para>
+ <para>
+ An example of allocating shared memory can be found in
<filename>contrib/pg_stat_statements/pg_stat_statements.c</filename> in
the <productname>PostgreSQL</productname> source tree.
</para>
</sect3>
<sect3 id="xfunc-shared-addin-after-startup">
- <title>Requesting Shared Memory After Startup</title>
+ <title>Requesting Shared Memory After Startup with <function>ShmemRequestStruct</function></title>
+
+ <para>
+ <function>ShmemRequestStruct()</function> can also be called after
+ system startup, which is useful to allow small allocations in add-in
+ libraries that are not specified in
+ <xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
+ However, after startup the allocation can fail if there is not enough
+ shared memory available. The system reserves some memory for allocations
+ after startup, but that reservation is small.
+ </para>
+ <para>
+ By default, <function>RegisterShmemCallbacks()</function> fails with an
+ error if called after system startup. To use it after startup, you must
+ set the <literal>SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP</literal> flag in
+ the argument <structname>ShmemCallbacks</structname> struct to
+ acknowledge the risk.
+ </para>
+ <para>
+ When <function>RegisterShmemCallbacks()</function> is called after
+ startup, it will immediately call the appropriate callbacks, depending
+ on whether the requested memory areas were already initialized by
+ another backend. The callbacks will be called while holding an internal
+ lock, which prevents two backends from initializing the memory area
+ concurrently.
+ </para>
+ </sect3>
+
+ <sect3 id="xfunc-shared-addin-dynamic">
+ <title>Allocating Dynamic Shared Memory After Startup</title>
<para>
There is another, more flexible method of reserving shared memory that
- can be done after server startup and outside a
- <literal>shmem_request_hook</literal>. To do so, each backend that will
+ can be done after server startup. To do so, each backend that will
use the shared memory should obtain a pointer to it by calling:
<programlisting>
void *GetNamedDSMSegment(const char *name, size_t size,
@@ -3711,10 +3772,7 @@ void *GetNamedDSMSegment(const char *name, size_t size,
</para>
<para>
- Unlike shared memory reserved at server startup, there is no need to
- acquire <function>AddinShmemInitLock</function> or otherwise take action
- to avoid race conditions when reserving shared memory with
- <function>GetNamedDSMSegment</function>. This function ensures that only
+ <function>GetNamedDSMSegment</function> ensures that only
one backend allocates and initializes the segment and that all other
backends receive a pointer to the fully allocated and initialized
segment.
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index c52c0a6023d..c707ccfa563 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -39,6 +39,7 @@
#include "storage/fd.h"
#include "storage/ipc.h"
#include "storage/proc.h"
+#include "storage/shmem_internal.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/guc.h"
@@ -373,6 +374,7 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
InitializeFastPathLocks();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 434e0643022..0973010b7dc 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -49,7 +49,9 @@
#include "replication/walreceiver.h"
#include "storage/dsm.h"
#include "storage/io_worker.h"
+#include "storage/ipc.h"
#include "storage/pg_shmem.h"
+#include "storage/shmem_internal.h"
#include "tcop/backend_startup.h"
#include "utils/memutils.h"
@@ -672,7 +674,10 @@ SubPostmasterMain(int argc, char *argv[])
/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ ShmemCallRequestCallbacks();
+ }
/*
* Run the appropriate Main function
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index eb4f3eb72d4..693475014fe 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -115,6 +115,7 @@
#include "storage/ipc.h"
#include "storage/pmsignal.h"
#include "storage/proc.h"
+#include "storage/shmem_internal.h"
#include "tcop/backend_startup.h"
#include "tcop/tcopprot.h"
#include "utils/datetime.h"
@@ -951,7 +952,14 @@ PostmasterMain(int argc, char *argv[])
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Ask all subsystems, including preloaded libraries, to register their
+ * shared memory needs.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might have been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
@@ -3232,7 +3240,14 @@ PostmasterStateMachine(void)
/* re-read control file into local memory */
LocalProcessControlFile(true);
- /* re-create shared memory and semaphores */
+ /*
+ * Re-initialize shared memory and semaphores. Note: We don't call
+ * RegisterBuiltinShmemCallbacks(), we keep the old registrations. In
+ * order to re-register structs in extensions, we'd need to reload
+ * shared preload libraries, and we don't want to do that.
+ */
+ ResetShmemAllocator();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
UpdatePMState(PM_STARTUP);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 7aab5da3386..24422a80ab3 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -50,6 +50,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/shmem_internal.h"
#include "storage/sinvaladt.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
@@ -100,8 +101,9 @@ CalculateShmemSize(void)
* during the actual allocation phase.
*/
size = 100000;
- size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
- sizeof(ShmemIndexEnt)));
+ size = add_size(size, ShmemGetRequestedSize());
+
+ /* legacy subsystems */
size = add_size(size, dsm_estimate_size());
size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
@@ -176,6 +178,13 @@ AttachSharedMemoryStructs(void)
*/
InitializeFastPathLocks();
+ /*
+ * Attach to LWLocks first. They are needed by most other subsystems.
+ */
+ LWLockShmemInit();
+
+ /* Establish pointers to all shared memory areas in this backend */
+ ShmemAttachRequested();
CreateOrAttachShmemStructs();
/*
@@ -220,7 +229,17 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
- /* Initialize subsystems */
+ /*
+ * Initialize LWLocks first, in case any of the shmem init functions use
+ * LWLocks. (Nothing else can be running during startup, so they don't
+ * need to do any locking yet, but we nevertheless allow it.)
+ */
+ LWLockShmemInit();
+
+ /* Initialize all shmem areas */
+ ShmemInitRequested();
+
+ /* Initialize legacy subsystems */
CreateOrAttachShmemStructs();
/* Initialize dynamic shared memory facilities. */
@@ -251,11 +270,6 @@ CreateSharedMemoryAndSemaphores(void)
static void
CreateOrAttachShmemStructs(void)
{
- /*
- * Set up LWLocks. They are needed by most other subsystems.
- */
- LWLockShmemInit();
-
dsm_shmem_init();
DSMRegistryShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index c994f7674ec..29ff6065dda 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,43 +19,115 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * NOTES:
- * (a) There are three kinds of shared memory data structures
- * available to POSTGRES: fixed-size structures, queues and hash
- * tables. Fixed-size structures contain things like global variables
- * for a module and should never be allocated after the shared memory
- * initialization phase. Hash tables have a fixed maximum size and
- * cannot grow beyond that. Queues link data structures
- * that have been allocated either within fixed-size structures or as hash
- * buckets. Each shared data structure has a string name to identify
- * it (assigned in the module that declares it).
- *
- * (b) During initialization, each module looks for its
- * shared data structures in a hash table called the "Shmem Index".
- * If the data structure is not present, the caller can allocate
- * a new one and initialize it. If the data structure is present,
- * the caller "attaches" to the structure by initializing a pointer
- * in the local address space.
- * The shmem index has two purposes: first, it gives us
- * a simple model of how the world looks when a backend process
- * initializes. If something is present in the shmem index,
- * it is initialized. If it is not, it is uninitialized. Second,
- * the shmem index allows us to allocate shared memory on demand
- * instead of trying to preallocate structures and hard-wire the
- * sizes and locations in header files. If you are using a lot
- * of shared memory in a lot of different places (and changing
- * things during development), this is important.
- *
- * (c) In standard Unix-ish environments, individual backends do not
- * need to re-establish their local pointers into shared memory, because
- * they inherit correct values of those variables via fork() from the
- * postmaster. However, this does not work in the EXEC_BACKEND case.
- * In ports using EXEC_BACKEND, new backends have to set up their local
- * pointers using the method described in (b) above.
- *
- * (d) memory allocation model: shared memory can never be
- * freed, once allocated. Each hash table has its own free list,
- * so hash buckets can be reused when an item is deleted.
+ * This module provides facilities to allocate fixed-size structures in shared
+ * memory, for things like variables shared between all backend processes.
+ * Each such structure has a string name to identify it, specified when it is
+ * requested. shmem_hash.c provides a shared hash table implementation on top
+ * of that.
+ *
+ * Shared memory areas should usually not be allocated after postmaster
+ * startup, although we do allow small allocations later for the benefit of
+ * extension modules that are loaded after startup. Despite that allowance,
+ * extensions that need shared memory should be added in
+ * shared_preload_libraries, because the allowance is quite small and there is
+ * no guarantee that any memory is available after startup.
+ *
+ * Nowadays, there is also a third way to allocate shared memory called
+ * Dynamic Shared Memory. See dsm.c for that facility. One big difference
+ * between traditional shared memory handled by shmem.c and dynamic shared
+ * memory is that traditional shared memory areas are mapped to the same
+ * address in all processes, so you can use normal pointers in shared memory
+ * structs. With Dynamic Shared Memory, you must use offsets or DSA pointers
+ * instead.
+ *
+ * Shared memory managed by shmem.c can never be freed, once allocated. Each
+ * hash table has its own free list, so hash buckets can be reused when an
+ * item is deleted. However, if one hash table grows very large and then
+ * shrinks, its space cannot be redistributed to other tables. We could build
+ * a simple hash bucket garbage collector if need be. Right now, it seems
+ * unnecessary.
+ *
+ * Usage
+ * -----
+ *
+ * To allocate shared memory, you need to register a set of callback functions
+ * which handle the lifecycle of the allocation. In the request_fn callback,
+ * fill in a ShmemRequestStructOpts struct with the name, size, and any other
+ * options, and call ShmemRequestStruct(). Leave any unused fields as zeros.
+ *
+ * typedef struct MyShmemData {
+ * ...
+ * } MyShmemData;
+ *
+ * static MyShmemData *MyShmem;
+ *
+ * static void my_shmem_request(void *arg);
+ * static void my_shmem_init(void *arg);
+ *
+ * const ShmemCallbacks MyShmemCallbacks = {
+ * .request_fn = my_shmem_request,
+ * .init_fn = my_shmem_init,
+ * };
+ *
+ * static void
+ * my_shmem_request(void *arg)
+ * {
+ * static ShmemStructDesc MyShmemDesc;
+ *
+ * ShmemRequestStruct(&MyShmemDesc, &(ShmemRequestStructOpts) {
+ * .name = "My shmem area",
+ * .size = sizeof(MyShmemData),
+ * .ptr = (void **) &MyShmem,
+ * });
+ * }
+ *
+ * In builtin PostgreSQL code, add the callbacks to the list in
+ * src/include/storage/subsystemlist.h. In an add-in module, you can register
+ * the callbacks by calling RegisterShmemCallbacks(&MyShmemCallbacks) in the
+ * extension's _PG_init() function.
+ *
+ * Lifecycle
+ * ---------
+ *
+ * Initializing shared memory happens in multiple phases. In the first phase,
+ * during postmaster startup, all the request_fn callbacks are called. Only
+ * after all the request_fn callbacks have been called and all the shmem areas
+ * have been requested by the ShmemRequestStruct() calls do we know how much
+ * shared memory we need in total. After that, the postmaster allocates the
+ * global shared memory segment, and calls all the init_fn callbacks to
+ * initialize all the requested shmem areas.
+ *
+ * In standard Unix-ish environments, individual backends do not need to
+ * re-establish their local pointers into shared memory, because they inherit
+ * correct values of those variables via fork() from the postmaster. However,
+ * this does not work in the EXEC_BACKEND case. In ports using EXEC_BACKEND,
+ * backend startup also calls the shmem_request callbacks to re-establish the
+ * knowledge about each shared memory area, sets the pointer variables
+ * (*ShmemStructDesc->ptr), and calls the attach_fn callback, if any, for
+ * additional per-backend setup.
+ *
+ * Legacy ShmemInitStruct()/ShmemInitHash() functions
+ * --------------------------------------------------
+ *
+ * ShmemInitStruct()/ShmemInitHash() is another way of registering shmem
+ * areas. It pre-dates the ShmemRequestStruct()/ShmemRequestHash() functions,
+ * and should not be used in new code, but as of this writing it is still
+ * widely used in extensions.
+ *
+ * To allocate a shmem area with ShmemInitStruct(), you need to separately
+ * register the size needed for the area by calling RequestAddinShmemSpace()
+ * from the extension's shmem_request_hook, and allocate the area by calling
+ * ShmemInitStruct() from the extension's shmem_startup_hook. There are no
+ * init/attach callbacks. Instead, the caller of ShmemInitStruct() must check
+ * the return status of ShmemInitStruct() and initialize the struct if it was
+ * not previously initialized.
+ *
+ * Calling ShmemAlloc() directly
+ * -----------------------------
+ *
+ * There's a more low-level way of allocating shared memory too: you can call
+ * ShmemAlloc() directly. It's used to implement the higher level mechanisms,
+ * and should generally not be called directly.
*/
#include "postgres.h"
@@ -70,10 +142,80 @@
#include "storage/lwlock.h"
#include "storage/pg_shmem.h"
#include "storage/shmem.h"
+#include "storage/shmem_internal.h"
#include "storage/spin.h"
#include "utils/builtins.h"
#include "utils/tuplestore.h"
+/*
+ * Registered callbacks.
+ *
+ * During postmaster startup, we accumulate the callbacks from all subsystems
+ * in this list.
+ *
+ * This is in process private memory, although on Unix-like systems, we expect
+ * all the registrations to happen at postmaster startup time and be inherited
+ * by all the child processes via fork().
+ */
+static List *registered_shmem_callbacks;
+
+/*
+ * In the shmem request phase, all the shmem areas requested with the
+ * ShmemRequest*() functions are accumulated here.
+ */
+typedef struct
+{
+ ShmemStructOpts *options;
+ ShmemRequestKind kind;
+} ShmemRequest;
+
+static List *pending_shmem_requests;
+
+/*
+ * Per-process state machine, for sanity checking that we do things in the
+ * right order.
+ *
+ * Postmaster:
+ * INITIAL -> REQUESTING -> INITIALIZING -> DONE
+ *
+ * Backends in EXEC_BACKEND mode:
+ * INITIAL -> REQUESTING -> ATTACHING -> DONE
+ *
+ * Late request:
+ * DONE -> REQUESTING -> AFTER_STARTUP_ATTACH_OR_INIT -> DONE
+ */
+enum shmem_request_state
+{
+ /* Initial state */
+ SRS_INITIAL,
+
+ /*
+ * When we start calling the shmem_request callbacks, we enter the
+ * SRS_REQUESTING phase. All ShmemRequestStruct calls happen in this
+ * state.
+ */
+ SRS_REQUESTING,
+
+ /*
+ * Postmaster has finished all shmem requests, and is now initializing the
+ * shared memory segment. init_fn callbacks are called in this state.
+ */
+ SRS_INITIALIZING,
+
+ /*
+ * A postmaster child process is starting up. attach_fn callbacks are
+ * called in this state.
+ */
+ SRS_ATTACHING,
+
+ /* An after-startup allocation or attachment is in progress. */
+ SRS_AFTER_STARTUP_ATTACH_OR_INIT,
+
+ /* Normal state after shmem initialization / attachment */
+ SRS_DONE,
+};
+static enum shmem_request_state shmem_request_state = SRS_INITIAL;
+
/*
* This is the first data structure stored in the shared memory segment, at
* the offset that PGShmemHeader->content_offset points to. Allocations by
@@ -105,35 +247,379 @@ static void *ShmemBase; /* start address of shared memory */
static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
-static HTAB *ShmemIndex = NULL; /* primary index hashtable for shmem */
+
+/*
+ * ShmemIndex is a global directory of shmem areas, itself also stored in the
+ * shared memory.
+ */
+static HTAB *ShmemIndex;
+
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE (48)
+
+/*
+ * # of additional entries to reserve in the shmem index table, for
+ * allocations after postmaster startup. (This is not a hard limit, the hash
+ * table can grow larger than that if there is shared memory available)
+ */
+#define SHMEM_INDEX_ADDITIONAL_SIZE (128)
+
+/* this is a hash bucket in the shmem index table */
+typedef struct
+{
+ char key[SHMEM_INDEX_KEYSIZE]; /* string name */
+ void *location; /* location in shared mem */
+ Size size; /* # bytes requested for the structure */
+ Size allocated_size; /* # bytes actually allocated */
+} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
+static void CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks);
+static void InitShmemIndexEntry(ShmemRequest *request);
+static bool AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok);
+
Datum pg_numa_available(PG_FUNCTION_ARGS);
/*
- * A very simple allocator used to carve out different parts of a hash table
- * from a previously allocated contiguous shared memory area.
+ * ShmemRequestStruct() --- request a named shared memory area
+ *
+ * Subsystems call this to register their shared memory needs. This is
+ * usually done early in postmaster startup, before the shared memory segment
+ * has been created, so that the size can be included in the estimate for
+ * total amount of shared memory needed. We set aside a small amount of
+ * memory for allocations that happen later, for the benefit of non-preloaded
+ * extensions, but that should not be relied upon.
+ *
+ * This does not yet allocate the memory, but merely registers the need for it.
+ * The actual allocation happens later in the postmaster startup sequence.
+ *
+ * This must be called from a shmem_request callback function, registered with
+ * RegisterShmemCallbacks(). This enforces a coding pattern that works the
+ * same in normal Unix systems and with EXEC_BACKEND. On Unix systems, the
+ * shmem_request callbacks are called once, early in postmaster startup, and
+ * the child processes inherit the struct descriptors and any other
+ * per-process state from the postmaster. In EXEC_BACKEND mode, shmem_request
+ * callbacks are *also* called in each backend, at backend startup, to
+ * re-establish the struct descriptors. By calling the same function in both
+ * cases, we ensure that all the shmem areas are registered the same way in
+ * all processes.
+ *
+ * 'desc' is a backend-private handle for the shared memory area.
+ *
+ * 'options' defines the name and size of the area, and any other optional
+ * features. Leave unused options as zeros. The options are copied to
+ * longer-lived memory, so the struct doesn't need to outlive the
+ * ShmemRequestStruct() call and can be a local variable in the calling
+ * function. The 'name' must point to a long-lived string though, because
+ * only the pointer to it is copied.
+ */
+void
+ShmemRequestStructWithOpts(const ShmemStructOpts *options)
+{
+ ShmemStructOpts *options_copy;
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemStructOpts));
+ memcpy(options_copy, options, sizeof(ShmemStructOpts));
+
+ ShmemRequestInternal(options_copy, SHMEM_KIND_STRUCT);
+}
+
+/*
+ * Internal workhorse of ShmemRequestStruct() and ShmemRequestHash().
+ *
+ * Note: 'desc' and 'options' must live until the init/attach callbacks have
+ * been called. Unlike in the public ShmemRequestStruct() and
+ * ShmemRequestHash() functions, 'options' is *not* copied. This allows
+ * ShmemRequestHash() to pass a pointer to the extended ShmemRequestHashOpts
+ * struct instead.
+ */
+void
+ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
+{
+ ShmemRequest *request;
+
+ if (options->name == NULL)
+ elog(ERROR, "shared memory request is missing 'name' option");
+
+ if (IsUnderPostmaster)
+ {
+ if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+ else
+ {
+ if (options->size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->size <= 0)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+
+ if (shmem_request_state != SRS_REQUESTING)
+ elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
+
+ /* Check that it's not already registered in this process */
+ foreach_ptr(ShmemRequest, existing, pending_shmem_requests)
+ {
+ if (strcmp(existing->options->name, options->name) == 0)
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" is already registered",
+ options->name)));
+ }
+
+ request = palloc(sizeof(ShmemRequest));
+ request->options = options;
+ request->kind = kind;
+ pending_shmem_requests = lappend(pending_shmem_requests, request);
+}
+
+/*
+ * ShmemGetRequestedSize() --- estimate the total size of all registered shared
+ * memory structures.
+ *
+ * This is called once at postmaster startup, before the shared memory segment
+ * has been created.
+ */
+size_t
+ShmemGetRequestedSize(void)
+{
+ size_t size;
+
+ /* memory needed for the ShmemIndex */
+ size = hash_estimate_size(list_length(pending_shmem_requests) + SHMEM_INDEX_ADDITIONAL_SIZE,
+ sizeof(ShmemIndexEnt));
+ size = CACHELINEALIGN(size);
+
+ /* memory needed for all the requested areas */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ size = add_size(size, request->options->size);
+ /* calculate alignment padding like ShmemAllocRaw() does */
+ size = CACHELINEALIGN(size);
+ }
+
+ return size;
+}
+
+/*
+ * ShmemInitRequested() --- allocate and initialize requested shared memory
+ * structures.
+ *
+ * This is called once at postmaster startup, after the shared memory segment
+ * has been created.
+ */
+void
+ShmemInitRequested(void)
+{
+ /* Should be called only by the postmaster or a standalone backend. */
+ Assert(!IsUnderPostmaster);
+ Assert(shmem_request_state == SRS_INITIALIZING);
+
+ /*
+ * Initialize the ShmemIndex entries and perform basic initialization of
+ * all the requested memory areas. There are no concurrent processes yet,
+ * so no need for locking.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ InitShmemIndexEntry(request);
+ }
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
+
+ /*
+ * Call the subsystem-specific init callbacks to finish initialization of
+ * all the areas.
+ */
+ foreach_ptr(const ShmemCallbacks, callbacks, registered_shmem_callbacks)
+ {
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->init_fn_arg);
+ }
+
+ shmem_request_state = SRS_DONE;
+}
+
+/*
+ * Re-establish process private state related to shmem areas.
+ *
+ * This is called at backend startup in EXEC_BACKEND mode, in every backend.
+ */
+#ifdef EXEC_BACKEND
+void
+ShmemAttachRequested(void)
+{
+ ListCell *lc;
+
+ /* Must be initializing a (non-standalone) backend */
+ Assert(IsUnderPostmaster);
+ Assert(ShmemAllocator->index != NULL);
+ Assert(shmem_request_state == SRS_REQUESTING);
+ shmem_request_state = SRS_ATTACHING;
+
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+
+ /*
+ * Attach to all the requested memory areas.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ AttachShmemIndexEntry(request, false);
+ }
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
+
+ /* Call attach callbacks */
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
+
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->attach_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+
+ shmem_request_state = SRS_DONE;
+}
+#endif
+
+/*
+ * Insert requested shmem area into the shared memory index and initialize it.
+ *
+ * Note that this only performs basic initialization depending on
+ * ShmemRequestKind, like setting the global pointer variable to the area for
+ * SHMEM_KIND_STRUCT or setting up the backend-private HTAB control struct.
+ * This does *not* call the subsystem-specific init callbacks. That's done
+ * later after all the shmem areas have been initialized or attached to.
*/
-typedef struct shmem_hash_allocator
+static void
+InitShmemIndexEntry(ShmemRequest *request)
{
- char *next; /* start of free space in the area */
- char *end; /* end of the shmem area */
-} shmem_hash_allocator;
+ const char *name = request->options->name;
+ ShmemIndexEnt *index_entry;
+ bool found;
+ size_t allocated_size;
+ void *structPtr;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, name, HASH_ENTER_NULL, &found);
+ if (found)
+ elog(ERROR, "shared memory struct \"%s\" is already initialized", name);
+ if (!index_entry)
+ {
+ /* tried to add it to the hash table, but there was no space */
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ name)));
+ }
+
+ /*
+ * We inserted the entry to the shared memory index. Allocate requested
+ * amount of shared memory for it, and initialize the index entry.
+ */
+ structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ name, request->options->size)));
+ }
+ index_entry->size = request->options->size;
+ index_entry->allocated_size = allocated_size;
+ index_entry->location = structPtr;
+
+ /* Initialize depending on the kind of shmem area it is */
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_init(structPtr, request->options);
+ break;
+ }
+}
+
+/*
+ * Look up a named shmem area in the shared memory index and attach to it.
+ *
+ * Note that this only performs the basic attachment actions depending on
+ * ShmemRequestKind, like setting the global pointer variable to the area for
+ * SHMEM_KIND_STRUCT or setting up the backend-private HTAB control struct.
+ * This does *not* call the subsystem-specific attach callbacks. That's done
+ * later after all the shmem areas have been initialized or attached to.
+ */
+static bool
+AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
+{
+ const char *name = request->options->name;
+ ShmemIndexEnt *index_entry;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, name, HASH_FIND, NULL);
+ if (!index_entry)
+ {
+ if (!missing_ok)
+ ereport(ERROR,
+ (errmsg("could not find ShmemIndex entry for data structure \"%s\"",
+ request->options->name)));
+ return false;
+ }
+
+ /* Check that the size in the index matches the request. */
+ if (index_entry->size != request->options->size &&
+ request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" was created with" \
+ " different size: existing %zu, requested %zu",
+ name, index_entry->size, request->options->size)));
+ }
+
+ /*
+ * Re-establish the caller's pointer variable, or do other actions to
+ * attach depending on the kind of shmem area it is.
+ */
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_attach(index_entry->location, request->options);
+ break;
+ }
+
+ return true;
+}
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
* Called at postmaster or stand-alone backend startup, to initialize the
* allocator's data structure in the shared memory segment. In EXEC_BACKEND,
- * this is also called at backend startup, to set up pointers to the shared
- * memory areas.
+ * this is also called at backend startup, to set up pointers to the
+ * already-initialized data structure.
*/
void
InitShmemAllocator(PGShmemHeader *seghdr)
{
Size offset;
+ int64 hash_nelems;
HASHCTL info;
int hash_flags;
@@ -142,6 +628,16 @@ InitShmemAllocator(PGShmemHeader *seghdr)
#endif
Assert(seghdr != NULL);
+ if (IsUnderPostmaster)
+ {
+ Assert(shmem_request_state == SRS_INITIAL);
+ }
+ else
+ {
+ Assert(shmem_request_state == SRS_REQUESTING);
+ shmem_request_state = SRS_INITIALIZING;
+ }
+
/*
* We assume the pointer and offset are MAXALIGN. Not a hard requirement,
* but it's true today and keeps the math below simpler.
@@ -186,19 +682,21 @@ InitShmemAllocator(PGShmemHeader *seghdr)
* use ShmemInitHash() here because it relies on ShmemIndex being already
* initialized.
*/
+ hash_nelems = list_length(pending_shmem_requests) + SHMEM_INDEX_ADDITIONAL_SIZE;
+
info.keysize = SHMEM_INDEX_KEYSIZE;
info.entrysize = sizeof(ShmemIndexEnt);
hash_flags = HASH_ELEM | HASH_STRINGS | HASH_FIXED_SIZE;
if (!IsUnderPostmaster)
{
- ShmemAllocator->index_size = hash_estimate_size(SHMEM_INDEX_SIZE, info.entrysize);
+ ShmemAllocator->index_size = hash_estimate_size(hash_nelems, info.entrysize);
ShmemAllocator->index = (HASHHDR *) ShmemAlloc(ShmemAllocator->index_size);
}
ShmemIndex = shmem_hash_create(ShmemAllocator->index,
ShmemAllocator->index_size,
IsUnderPostmaster,
- "ShmemIndex", SHMEM_INDEX_SIZE,
+ "ShmemIndex", hash_nelems,
&info, hash_flags);
Assert(ShmemIndex != NULL);
@@ -219,6 +717,23 @@ InitShmemAllocator(PGShmemHeader *seghdr)
}
}
+/*
+ * Reset state on postmaster crash restart.
+ */
+void
+ResetShmemAllocator(void)
+{
+ Assert(!IsUnderPostmaster);
+ shmem_request_state = SRS_INITIAL;
+
+ pending_shmem_requests = NIL;
+
+ /*
+ * Note that we don't clear the registered callbacks. We will need to
+ * call them again as we restart.
+ */
+}
+
/*
* ShmemAlloc -- allocate max-aligned chunk from shared memory
*
@@ -316,92 +831,191 @@ ShmemAddrIsValid(const void *addr)
}
/*
- * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ * Register callbacks that define a shared memory area (or multiple areas).
*
- * This is called during initialization to find or allocate
- * a data structure in shared memory. If no other process
- * has created the structure, this routine allocates space
- * for it. If it exists already, a pointer to the existing
- * structure is returned.
+ * The system will call the callbacks at different stages of postmaster or
+ * backend startup, to allocate and initialize the area.
*
- * Returns: pointer to the object. *foundPtr is set true if the object was
- * already in the shmem index (hence, already initialized).
+ * This is normally called early during postmaster startup, but if the
+ * SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP flag is set, this can also be used after
+ * startup, although after startup there's no guarantee that there's enough
+ * shared memory available. When called after startup, this immediately calls
+ * the right callbacks depending on whether another backend had already
+ * initialized the area.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: In EXEC_BACKEND mode, this needs to be called in every backend
+ * process. That's needed because we cannot pass down the callback function
+ * pointers from the postmaster process, because different processes may have
+ * loaded libraries to different addresses.
*/
-void *
-ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+void
+RegisterShmemCallbacks(const ShmemCallbacks *callbacks)
{
- ShmemIndexEnt *result;
- void *structPtr;
+ if (shmem_request_state == SRS_DONE && IsUnderPostmaster)
+ {
+ /*
+ * After-startup initialization or attachment. Call the appropriate
+ * callbacks immediately.
+ */
+ if ((callbacks->flags & SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP) == 0)
+ elog(ERROR, "cannot request shared memory at this time");
- Assert(ShmemIndex != NULL);
+ CallShmemCallbacksAfterStartup(callbacks);
+ }
+ else
+ {
+ /* Remember the callbacks for later */
+ registered_shmem_callbacks = lappend(registered_shmem_callbacks,
+ (void *) callbacks);
+ }
+}
+
+/*
+ * Register a shmem area (or multiple areas) after startup.
+ */
+static void
+CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks)
+{
+ bool found_any;
+ bool notfound_any;
+
+ Assert(shmem_request_state == SRS_DONE);
+ shmem_request_state = SRS_REQUESTING;
+
+ /*
+ * Call the request callback first. The callback makes ShmemRequest*()
+ * calls for each shmem area, adding them to pending_shmem_requests.
+ */
+ Assert(pending_shmem_requests == NIL);
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->request_fn_arg);
+ shmem_request_state = SRS_AFTER_STARTUP_ATTACH_OR_INIT;
+
+ if (pending_shmem_requests == NIL)
+ {
+ shmem_request_state = SRS_DONE;
+ return;
+ }
+ /* Hold ShmemIndexLock while we allocate all the shmem entries */
LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
- /* look it up in the shmem index */
- result = (ShmemIndexEnt *)
- hash_search(ShmemIndex, name, HASH_ENTER_NULL, foundPtr);
+ /*
+ * Check if the requested shared memory areas have already been
+ * initialized. We assume that all the areas requested by the request
+ * callback form a coherent unit, such that either all of them are already
+ * initialized or none are. Otherwise it would be ambiguous which callback,
+ * init or attach, to call afterwards.
+ */
+ found_any = notfound_any = false;
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ if (hash_search(ShmemIndex, request->options->name, HASH_FIND, NULL))
+ found_any = true;
+ else
+ notfound_any = true;
+ }
+ if (found_any && notfound_any)
+ elog(ERROR, "found some but not all");
- if (!result)
+ /*
+ * Allocate or attach all the shmem areas requested by the request_fn
+ * callback.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
{
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("could not create ShmemIndex entry for data structure \"%s\"",
- name)));
+ if (found_any)
+ AttachShmemIndexEntry(request, false);
+ else
+ InitShmemIndexEntry(request);
}
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
- if (*foundPtr)
+ /* Finish by calling the appropriate subsystem-specific callback */
+ if (found_any)
{
- /*
- * Structure is in the shmem index so someone else has allocated it
- * already. The size better be the same as the size we are trying to
- * initialize to, or there is a name conflict (or worse).
- */
- if (result->size != size)
- {
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errmsg("ShmemIndex entry size is wrong for data structure"
- " \"%s\": expected %zu, actual %zu",
- name, size, result->size)));
- }
- structPtr = result->location;
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->attach_fn_arg);
}
else
{
- Size allocated_size;
-
- /* It isn't in the table yet. allocate and initialize it */
- structPtr = ShmemAllocRaw(size, &allocated_size);
- if (structPtr == NULL)
- {
- /* out of memory; remove the failed ShmemIndex entry */
- hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
- " \"%s\" (%zu bytes requested)",
- name, size)));
- }
- result->size = size;
- result->allocated_size = allocated_size;
- result->location = structPtr;
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->init_fn_arg);
}
LWLockRelease(ShmemIndexLock);
+ shmem_request_state = SRS_DONE;
+}
- Assert(ShmemAddrIsValid(structPtr));
+/*
+ * Call all shmem request callbacks.
+ */
+void
+ShmemCallRequestCallbacks(void)
+{
+ ListCell *lc;
- Assert(structPtr == (void *) CACHELINEALIGN(structPtr));
+ Assert(shmem_request_state == SRS_INITIAL);
+ shmem_request_state = SRS_REQUESTING;
+
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
- return structPtr;
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->request_fn_arg);
+ }
}
+/*
+ * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ *
+ * This is called during initialization to find or allocate
+ * a data structure in shared memory. If no other process
+ * has created the structure, this routine allocates space
+ * for it. If it exists already, a pointer to the existing
+ * structure is returned.
+ *
+ * Returns: pointer to the object. *foundPtr is set true if the object was
+ * already in the shmem index (hence, already initialized).
+ *
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestStruct() in new code!
+ */
+void *
+ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+{
+ void *ptr = NULL;
+ ShmemStructOpts options = {
+ .name = name,
+ .size = size,
+ .ptr = &ptr,
+ };
+ ShmemRequest request = {&options, SHMEM_KIND_STRUCT};
+
+ Assert(shmem_request_state == SRS_DONE ||
+ shmem_request_state == SRS_INITIALIZING ||
+ shmem_request_state == SRS_REQUESTING);
+
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+
+ /*
+ * During postmaster startup, look up the existing entry if any.
+ */
+ *foundPtr = false;
+ if (IsUnderPostmaster)
+ *foundPtr = AttachShmemIndexEntry(&request, true);
+
+ /* Initialize it if not found */
+ if (!*foundPtr)
+ InitShmemIndexEntry(&request);
+
+ LWLockRelease(ShmemIndexLock);
+
+ Assert(ptr != NULL);
+ return ptr;
+}
/*
* Add two Size values, checking for overflow
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
index 0b05730129e..ab30461f247 100644
--- a/src/backend/storage/ipc/shmem_hash.c
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -21,9 +21,81 @@
#include "postgres.h"
#include "storage/shmem.h"
+#include "storage/shmem_internal.h"
+#include "utils/memutils.h"
+
+/*
+ * A very simple allocator used to carve out different parts of a hash table
+ * from a previously allocated contiguous shared memory area.
+ */
+typedef struct shmem_hash_allocator
+{
+ char *next; /* start of free space in the area */
+ char *end; /* end of the shmem area */
+} shmem_hash_allocator;
static void *ShmemHashAlloc(Size size, void *alloc_arg);
+/*
+ * ShmemRequestHash -- Request a shared memory hash table.
+ *
+ * Similar to ShmemRequestStruct(), but requests a hash table instead of an
+ * opaque area.
+ */
+void
+ShmemRequestHashWithOpts(const ShmemHashOpts *options)
+{
+ ShmemHashOpts *options_copy;
+
+ Assert(options->name != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemHashOpts));
+ memcpy(options_copy, options, sizeof(ShmemHashOpts));
+
+ /* Set options for the fixed-size area holding the hash table */
+ options_copy->base.name = options->name;
+ options_copy->base.size = hash_estimate_size(options_copy->nelems,
+ options_copy->hash_info.entrysize);
+
+ ShmemRequestInternal(&options_copy->base, SHMEM_KIND_HASH);
+}
+
+void
+shmem_hash_init(void *location, ShmemStructOpts *base_options)
+{
+ ShmemHashOpts *options = (ShmemHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+ HTAB *htab;
+
+ options->hash_info.hctl = location;
+ htab = shmem_hash_create(location, options->base.size, false,
+ options->name,
+ options->nelems, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = htab;
+}
+
+void
+shmem_hash_attach(void *location, ShmemStructOpts *base_options)
+{
+ ShmemHashOpts *options = (ShmemHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+ HTAB *htab;
+
+ /* attach to it rather than allocate and initialize new space */
+ hash_flags |= HASH_ATTACH;
+ options->hash_info.hctl = location;
+ Assert(options->hash_info.hctl != NULL);
+ htab = shmem_hash_create(location, options->base.size, true,
+ options->name,
+ options->nelems, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = htab;
+}
+
/*
* ShmemInitHash -- Create and initialize, or attach to, a
* shared memory hash table.
@@ -40,9 +112,8 @@ static void *ShmemHashAlloc(Size size, void *alloc_arg);
* to shared-memory hash tables are added here, except that callers may
* choose to specify HASH_PARTITION.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestHash() in new code!
*/
HTAB *
ShmemInitHash(const char *name, /* table string name for shmem index */
@@ -56,7 +127,14 @@ ShmemInitHash(const char *name, /* table string name for shmem index */
size = hash_estimate_size(nelems, infoP->entrysize);
- /* look it up in the shmem index or allocate */
+ /*
+ * Look it up in the shmem index or allocate.
+ *
+ * NOTE: The area is requested internally as SHMEM_KIND_STRUCT instead of
+ * SHMEM_KIND_HASH. That's correct because we do the hash table
+ * initialization by calling shmem_hash_create() ourselves. (We don't
+ * expose the request kind to users; if we did, that would be confusing.)
+ */
location = ShmemInitStruct(name, size, &found);
return shmem_hash_create(location, size, found,
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 5c47cf13473..9b880a6af65 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -121,6 +121,9 @@ FastPathLockShmemSize(void)
size = add_size(size, mul_size(TotalProcs, (fpLockBitsSize + fpRelIdSize)));
+ Assert(TotalProcs > 0);
+ Assert(size > 0);
+
return size;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 10be60011ad..93851269e43 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -67,6 +67,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procsignal.h"
+#include "storage/shmem_internal.h"
#include "storage/sinval.h"
#include "storage/standby.h"
#include "tcop/backend_startup.h"
@@ -4155,7 +4156,14 @@ PostgresSingleUserMain(int argc, char *argv[],
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Before computing the total size needed, give all subsystems, including
+ * add-ins, a chance to adjust their requested shmem sizes.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might have been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 82f5403c952..147a6915f7e 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -3,6 +3,11 @@
* shmem.h
* shared memory management structures
*
+ * This file contains public functions for other core subsystems and
+ * extensions to allocate shared memory. Internal functions for the shmem
+ * allocator itself and hooking it to the rest of the system are in
+ * shmem_internal.h
+ *
* Historical note:
* A long time ago, Postgres' shared memory region was allowed to be mapped
* at a different address in each process, and shared memory "pointers" were
@@ -23,43 +28,165 @@
#include "utils/hsearch.h"
+/*
+ * Options for ShmemRequestStruct()
+ *
+ * 'name' and 'size' are required. Initialize any optional fields that you
+ * don't use to zeros.
+ *
+ * After registration, the shmem machinery reserves memory for the area, sets
+ * '*ptr' to point to the allocation, and calls the callbacks at the right
+ * moments.
+ */
+typedef struct ShmemStructOpts
+{
+ const char *name;
-/* shmem.c */
-typedef struct PGShmemHeader PGShmemHeader; /* avoid including
- * storage/pg_shmem.h here */
-extern void InitShmemAllocator(PGShmemHeader *seghdr);
-extern void *ShmemAlloc(Size size);
-extern void *ShmemAllocNoError(Size size);
-extern void *ShmemHashAlloc(Size size, void *alloc_arg);
+ /*
+ * Requested size of the shmem allocation.
+ *
+ * When attaching to an existing allocation, the size must match the size
+ * given when the shmem region was allocated. This cross-check can be
+ * disabled by specifying SHMEM_ATTACH_UNKNOWN_SIZE.
+ */
+ ssize_t size;
+
+ /*
+ * When the shmem area is initialized or attached to, a pointer to it is
+ * stored in *ptr. It usually points to a global variable, used to access
+ * the shared memory area later. *ptr is set before the init_fn or
+ * attach_fn callback is called.
+ */
+ void **ptr;
+} ShmemStructOpts;
+
+#define SHMEM_ATTACH_UNKNOWN_SIZE (-1)
+
+/*
+ * Options for ShmemRequestHash()
+ *
+ * Each hash table is backed by an allocated area, but if 'max_size' is
+ * greater than 'init_size', it can also grow beyond the initial allocated
+ * area by allocating more hash entries from the global unreserved space.
+ */
+typedef struct ShmemHashOpts
+{
+ ShmemStructOpts base;
+
+ /*
+ * Name of the shared memory area. Required. Must be unique across the
+ * system.
+ */
+ const char *name;
+
+ /*
+ * 'nelems' is the max number of elements for the hash table.
+ */
+ int64 nelems;
+
+ /*
+ * Hash table options passed to hash_create()
+ *
+ * hash_info and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values
+ * specific to shared-memory hash tables are added implicitly in
+ * ShmemRequestHash(), except that callers may choose to specify
+ * HASH_PARTITION and/or HASH_FIXED_SIZE.
+ */
+ HASHCTL hash_info;
+ int hash_flags;
+
+ /*
+ * When the hash table is initialized or attached to, a pointer to its
+ * backend-private handle is stored in *ptr. It usually points to a
+ * global variable, used to access the hash table later.
+ */
+ HTAB **ptr;
+} ShmemHashOpts;
+
+typedef void (*ShmemRequestCallback) (void *arg);
+typedef void (*ShmemInitCallback) (void *arg);
+typedef void (*ShmemAttachCallback) (void *arg);
+
+/*
+ * Shared memory is reserved and allocated in stages at postmaster startup,
+ * and in EXEC_BACKEND mode, there's some extra work done to "attach" to them
+ * at backend startup. ShmemCallbacks holds callback functions that are
+ * called at different stages.
+ */
+typedef struct ShmemCallbacks
+{
+ /* SHMEM_CALLBACKS_* flags */
+ int flags;
+
+ /*
+ * 'request_fn' is called during postmaster startup, before the shared
+ * memory has been allocated. The function should call
+ * ShmemRequestStruct() and ShmemRequestHash() to register the subsystem's
+ * shared memory needs.
+ */
+ ShmemRequestCallback request_fn;
+ void *request_fn_arg;
+
+ /*
+ * Initialization callback function. This is called when the shared
+ * memory area is allocated, usually at postmaster startup.
+ */
+ ShmemInitCallback init_fn;
+ void *init_fn_arg;
+
+ /*
+ * Attachment callback function. In EXEC_BACKEND mode, this is called at
+ * startup of each backend. In !EXEC_BACKEND mode, this is only called if
+ * the shared memory area is registered after postmaster startup (see
+ * SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP).
+ */
+ ShmemAttachCallback attach_fn;
+ void *attach_fn_arg;
+} ShmemCallbacks;
+
+/*
+ * Flags to control the behavior of RegisterShmemCallbacks().
+ *
+ * ALLOW_AFTER_STARTUP: Allow these shared memory usages to be registered
+ * after postmaster startup. Normally, registering a shared memory system
+ * after postmaster startup is not allowed, e.g. in an add-in library loaded
+ * on demand in a backend. If a subsystem sets this flag, the callbacks are
+ * called immediately after registration, to initialize or attach to the
+ * requested shared memory areas. This is not used by any built-in
+ * subsystems, but extensions may find it useful.
+ */
+#define SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP 0x00000001
+
+extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
extern bool ShmemAddrIsValid(const void *addr);
+
+/*
+ * These macros provide syntactic sugar for calling the underlying functions
+ * with a named-argument-like syntax.
+ */
+#define ShmemRequestStruct(...) \
+ ShmemRequestStructWithOpts(&(ShmemStructOpts){__VA_ARGS__})
+
+#define ShmemRequestHash(...) \
+ ShmemRequestHashWithOpts(&(ShmemHashOpts){__VA_ARGS__})
+
+extern void ShmemRequestStructWithOpts(const ShmemStructOpts *options);
+extern void ShmemRequestHashWithOpts(const ShmemHashOpts *options);
+
+/* legacy shmem allocation functions */
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
+extern HTAB *ShmemInitHash(const char *name, int64 nelems,
+ HASHCTL *infoP, int hash_flags);
+extern void *ShmemAlloc(Size size);
+extern void *ShmemAllocNoError(Size size);
+
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
-/* shmem_hash.c */
-extern HTAB *ShmemInitHash(const char *name, int64 nelems,
- HASHCTL *infoP, int hash_flags);
-extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
-
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
-/* size constants for the shmem index table */
- /* max size of data structure string name */
-#define SHMEM_INDEX_KEYSIZE (48)
- /* max number of named shmem structures and hash tables */
-#define SHMEM_INDEX_SIZE (256)
-
-/* this is a hash bucket in the shmem index table */
-typedef struct
-{
- char key[SHMEM_INDEX_KEYSIZE]; /* string name */
- void *location; /* location in shared mem */
- Size size; /* # bytes requested for the structure */
- Size allocated_size; /* # bytes actually allocated */
-} ShmemIndexEnt;
-
#endif /* SHMEM_H */
diff --git a/src/include/storage/shmem_internal.h b/src/include/storage/shmem_internal.h
new file mode 100644
index 00000000000..fe12bf33439
--- /dev/null
+++ b/src/include/storage/shmem_internal.h
@@ -0,0 +1,52 @@
+/*-------------------------------------------------------------------------
+ *
+ * shmem_internal.h
+ * Internal functions related to shmem allocation
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/shmem_internal.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SHMEM_INTERNAL_H
+#define SHMEM_INTERNAL_H
+
+#include "storage/shmem.h"
+#include "utils/hsearch.h"
+
+/* Different kinds of shmem areas. */
+typedef enum
+{
+ SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
+ SHMEM_KIND_HASH, /* a hash table */
+} ShmemRequestKind;
+
+/* shmem.c */
+typedef struct PGShmemHeader PGShmemHeader; /* avoid including
+ * storage/pg_shmem.h here */
+extern void ShmemCallRequestCallbacks(void);
+extern void InitShmemAllocator(PGShmemHeader *seghdr);
+#ifdef EXEC_BACKEND
+extern void AttachShmemAllocator(PGShmemHeader *seghdr);
+#endif
+extern void ResetShmemAllocator(void);
+
+extern void ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind);
+
+extern size_t ShmemGetRequestedSize(void);
+extern void ShmemInitRequested(void);
+#ifdef EXEC_BACKEND
+extern void ShmemAttachRequested(void);
+#endif
+
+extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
+
+/* shmem_hash.c */
+extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
+extern void shmem_hash_init(void *location, ShmemStructOpts *options);
+extern void shmem_hash_attach(void *location, ShmemStructOpts *options);
+
+#endif /* SHMEM_INTERNAL_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index c72f6c59573..b84167741fb 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2863,9 +2863,16 @@ SharedTypmodTableEntry
Sharedsort
ShellTypeInfo
ShippableCacheEntry
-ShmemAllocatorData
ShippableCacheKey
+ShmemAllocatorData
+ShmemCallbacks
ShmemIndexEnt
+ShmemHashDesc
+ShmemHashOpts
+ShmemRequest
+ShmemRequestKind
+ShmemStructDesc
+ShmemStructOpts
ShutdownForeignScan_function
ShutdownInformation
ShutdownMode
--
2.34.1
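
The after-startup path in the patch above (request, then an all-or-nothing lookup, then exactly one of init or attach) can be sketched in isolation. This is a minimal, self-contained model and not PostgreSQL code: every `demo_*` and `Demo*` name is hypothetical, the shmem index is mocked with a small array of names, there is no locking, and the "found some but not all" error is modelled as a return value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the patch's types; this is not PostgreSQL code. */
typedef void (*DemoCallback) (void *arg);

typedef struct DemoCallbacks
{
    DemoCallback request_fn;    /* queues the names of the areas it needs */
    DemoCallback init_fn;       /* runs when the areas were newly created */
    DemoCallback attach_fn;     /* runs when the areas already existed */
} DemoCallbacks;

typedef enum
{
    DEMO_NOTHING_REQUESTED,
    DEMO_INITIALIZED,
    DEMO_ATTACHED,
    DEMO_MIXED                  /* the patch raises an error in this case */
} DemoOutcome;

#define DEMO_MAX_AREAS 8

static const char *demo_index[DEMO_MAX_AREAS];      /* mock of ShmemIndex */
static int demo_index_len = 0;
static const char *demo_pending[DEMO_MAX_AREAS];    /* mock of the pending list */
static int demo_pending_len = 0;
static int demo_init_calls = 0;
static int demo_attach_calls = 0;

/* Mock of a request call: it only remembers the area name. */
static void
demo_request(const char *name)
{
    if (demo_pending_len < DEMO_MAX_AREAS)
        demo_pending[demo_pending_len++] = name;
}

static bool
demo_index_contains(const char *name)
{
    for (int i = 0; i < demo_index_len; i++)
        if (strcmp(demo_index[i], name) == 0)
            return true;
    return false;
}

static DemoOutcome
demo_register_after_startup(const DemoCallbacks *callbacks)
{
    bool found_any = false;
    bool notfound_any = false;

    /* Stage 1: let the subsystem queue its requests. */
    demo_pending_len = 0;
    if (callbacks->request_fn)
        callbacks->request_fn(NULL);
    if (demo_pending_len == 0)
        return DEMO_NOTHING_REQUESTED;

    /* Stage 2: the requested areas must all exist already, or none of them. */
    for (int i = 0; i < demo_pending_len; i++)
    {
        if (demo_index_contains(demo_pending[i]))
            found_any = true;
        else
            notfound_any = true;
    }
    if (found_any && notfound_any)
    {
        demo_pending_len = 0;
        return DEMO_MIXED;
    }

    /* Stage 3: create the index entries if they did not exist yet. */
    for (int i = 0; !found_any && i < demo_pending_len; i++)
        if (demo_index_len < DEMO_MAX_AREAS)
            demo_index[demo_index_len++] = demo_pending[i];
    demo_pending_len = 0;

    /* Stage 4: exactly one of attach_fn or init_fn runs. */
    if (found_any)
    {
        if (callbacks->attach_fn)
            callbacks->attach_fn(NULL);
        return DEMO_ATTACHED;
    }
    if (callbacks->init_fn)
        callbacks->init_fn(NULL);
    return DEMO_INITIALIZED;
}

/* Sample callbacks used to exercise the sketch. */
static void demo_request_two(void *arg) { (void) arg; demo_request("demo state"); demo_request("demo table"); }
static void demo_request_overlap(void *arg) { (void) arg; demo_request("demo state"); demo_request("demo other"); }
static void demo_count_init(void *arg) { (void) arg; demo_init_calls++; }
static void demo_count_attach(void *arg) { (void) arg; demo_attach_calls++; }

static const DemoCallbacks demo_callbacks = {demo_request_two, demo_count_init, demo_count_attach};
static const DemoCallbacks demo_overlap_callbacks = {demo_request_overlap, demo_count_init, demo_count_attach};
```

In this model the first registration of `demo_callbacks` takes the init branch and a second registration takes the attach branch, which mirrors a first backend creating the areas and a later backend attaching to them. A request set that overlaps only partly with existing entries is rejected, because there would be no single right callback to run.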
[text/x-patch] v20260405_2-0006-Convert-lwlock.c-to-use-the-new-interface.patch (6.4K, 7-v20260405_2-0006-Convert-lwlock.c-to-use-the-new-interface.patch)
From 7fcabd638027dab38497227a8caf1a9256f769eb Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:18:05 +0300
Subject: [PATCH v20260405 06/15] Convert lwlock.c to use the new interface
It seems like a good candidate to convert first because it needs to
be initialized before any other subsystem, but other than that it's
nothing special.
---
src/backend/storage/ipc/ipci.c | 13 ------
src/backend/storage/lmgr/lwlock.c | 71 +++++++++++++++--------------
src/include/storage/lwlock.h | 2 -
src/include/storage/subsystemlist.h | 9 +++-
4 files changed, 45 insertions(+), 50 deletions(-)
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index e4a6a52f12d..de65a9ef33c 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -121,7 +121,6 @@ CalculateShmemSize(void)
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
- size = add_size(size, LWLockShmemSize());
size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
size = add_size(size, SharedInvalShmemSize());
@@ -179,11 +178,6 @@ AttachSharedMemoryStructs(void)
*/
InitializeFastPathLocks();
- /*
- * Attach to LWLocks first. They are needed by most other subsystems.
- */
- LWLockShmemInit();
-
/* Establish pointers to all shared memory areas in this backend */
ShmemAttachRequested();
CreateOrAttachShmemStructs();
@@ -230,13 +224,6 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
- /*
- * Initialize LWLocks first, in case any of the shmem init function use
- * LWLocks. (Nothing else can be running during startup, so they don't
- * need to do any locking yet, but we nevertheless allow it.)
- */
- LWLockShmemInit();
-
/* Initialize all shmem areas */
ShmemInitRequested();
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 5cb696490d6..30b715ab051 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -84,6 +84,7 @@
#include "storage/proclist.h"
#include "storage/procnumber.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -212,6 +213,15 @@ typedef struct NamedLWLockTrancheRequest
static List *NamedLWLockTrancheRequests = NIL;
+static void LWLockShmemRequest(void *arg);
+static void LWLockShmemInit(void *arg);
+
+const ShmemCallbacks LWLockCallbacks = {
+ .request_fn = LWLockShmemRequest,
+ .init_fn = LWLockShmemInit,
+};
+
+
static void InitializeLWLocks(int numLocks);
static inline void LWLockReportWaitStart(LWLock *lock);
static inline void LWLockReportWaitEnd(void);
@@ -401,58 +411,51 @@ NumLWLocksForNamedTranches(void)
}
/*
- * Compute shmem space needed for user-defined tranches and the main LWLock
- * array.
+ * Request shmem space for user-defined tranches and the main LWLock array.
*/
-Size
-LWLockShmemSize(void)
+static void
+LWLockShmemRequest(void *arg)
{
- Size size;
int numLocks;
+ Size size;
+
+ numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
/* Space for user-defined tranches */
size = sizeof(LWLockTrancheShmemData);
-
- /* Space for the LWLock array */
- numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
size = add_size(size, mul_size(numLocks, sizeof(LWLockPadded)));
+ ShmemRequestStruct(.name = "LWLock tranches",
+ .size = size,
+ .ptr = (void **) &LWLockTranches,
+ );
- return size;
+ /* Space for the LWLock array */
+ ShmemRequestStruct(.name = "Main LWLock array",
+ .size = numLocks * sizeof(LWLockPadded),
+ .ptr = (void **) &MainLWLockArray,
+ );
}
/*
- * Allocate shmem space for user-defined tranches and the main LWLock array,
- * and initialize it.
+ * Initialize shmem space for user-defined tranches and the main LWLock array.
*/
-void
-LWLockShmemInit(void)
+static void
+LWLockShmemInit(void *arg)
{
int numLocks;
- bool found;
- LWLockTranches = (LWLockTrancheShmemData *)
- ShmemInitStruct("LWLock tranches", sizeof(LWLockTrancheShmemData), &found);
- if (!found)
- {
- /* Calculate total number of locks needed in the main array */
- LWLockTranches->num_main_array_locks =
- NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
+ numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
- /* Initialize the dynamic-allocation counter for tranches */
- LWLockTranches->num_user_defined = 0;
+ /* Remember total number of locks needed in the main array */
+ LWLockTranches->num_main_array_locks = numLocks;
- SpinLockInit(&LWLockTranches->lock);
- }
+ /* Initialize the dynamic-allocation counter for tranches */
+ LWLockTranches->num_user_defined = 0;
- /* Allocate and initialize the main array */
- numLocks = LWLockTranches->num_main_array_locks;
- MainLWLockArray = (LWLockPadded *)
- ShmemInitStruct("Main LWLock array", numLocks * sizeof(LWLockPadded), &found);
- if (!found)
- {
- /* Initialize all LWLocks */
- InitializeLWLocks(numLocks);
- }
+ SpinLockInit(&LWLockTranches->lock);
+
+ /* Allocate and initialize all LWLocks in the main array */
+ InitializeLWLocks(numLocks);
}
/*
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index 61f0dbe749a..efa5b427e9f 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -126,8 +126,6 @@ extern bool LWLockHeldByMeInMode(LWLock *lock, LWLockMode mode);
extern bool LWLockWaitForVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 oldval, uint64 *newval);
extern void LWLockUpdateVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val);
-extern Size LWLockShmemSize(void);
-extern void LWLockShmemInit(void);
extern void InitLWLockAccess(void);
extern const char *GetLWLockIdentifier(uint32 classId, uint16 eventId);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index ed43c90bcc3..f0cf01f5a85 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -20,4 +20,11 @@
* of these matter.
*/
-/* TODO: empty for now */
+/*
+ * LWLocks first, in case any of the other shmem init functions use LWLocks.
+ * (Nothing else can be running during startup, so they don't need to do any
+ * locking yet, but we nevertheless allow it.)
+ */
+PG_SHMEM_SUBSYSTEM(LWLockCallbacks)
+
+/* TODO: nothing else for now */
--
2.34.1
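
The `ShmemRequestStruct(.name = ..., .size = ..., .ptr = ..., )` calls in the patch above rely on a variadic macro that wraps its arguments in a C99 compound literal with designated initializers. The following stand-alone sketch shows that mechanism only; `DemoOpts`, `demo_request_struct` and the other `demo_*` names are hypothetical and merely modelled on the macros in shmem.h.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct, loosely modelled on ShmemStructOpts. */
typedef struct DemoOpts
{
    const char *name;
    size_t      size;
    void      **ptr;
} DemoOpts;

/* Records the most recent request so the effect of the macro can be seen. */
static DemoOpts demo_last;

static void
demo_request_struct_with_opts(const DemoOpts *options)
{
    /*
     * Copy by value: the compound literal built by the macro only lives
     * until the end of the caller's enclosing block.
     */
    demo_last = *options;
}

/*
 * The macro arguments become the initializer list of an anonymous DemoOpts.
 * Fields that are not mentioned are zero-initialized, and a trailing comma
 * is accepted because initializer lists allow one.
 */
#define demo_request_struct(...) \
    demo_request_struct_with_opts(&(DemoOpts){__VA_ARGS__})

static void *demo_area;

/* All three fields given, with the trailing comma style used in the patch. */
static void
demo_request_full(void)
{
    demo_request_struct(.name = "demo area",
                        .size = 128,
                        .ptr = &demo_area,
        );
}

/* 'ptr' omitted: it is implicitly NULL in the resulting struct. */
static void
demo_request_without_ptr(void)
{
    demo_request_struct(.name = "demo scratch", .size = 16);
}
```

The design keeps call sites readable as the options struct grows, since new optional fields default to zero without touching existing callers.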
[text/x-patch] v20260405_2-0007-Use-the-new-mechanism-in-a-few-core-subsys.patch (46.4K, 8-v20260405_2-0007-Use-the-new-mechanism-in-a-few-core-subsys.patch)
From cd78b19913b9dfa0caadfa2bb78798af4666fe5b Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:21:17 +0300
Subject: [PATCH v20260405 07/15] Use the new mechanism in a few core
subsystems
I chose these subsystems specifically because they have some
complicating properties, making them slightly harder to convert than
most:
- The initialization callbacks of some of these subsystems have
dependencies, i.e. they need to be initialized in the right order.
- The ProcGlobal pointer still needs to be inherited by the
BackendParameters mechanism on EXEC_BACKEND builds, because
ProcGlobal is required by InitProcess() to get a PGPROC entry, and
the PGPROC entry is required to use LWLocks, and usually attaching
to shared memory areas requires the use of LWLocks.
- Similarly, the ProcSignal pointer still needs to be handled by
BackendParameters, because query cancellation connections access it
without calling InitProcess.
I believe converting the rest of the subsystems after this will
be pretty mechanical.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/backend/access/transam/twophase.c | 2 +-
src/backend/access/transam/varsup.c | 35 ++---
src/backend/port/posix_sema.c | 22 ++-
src/backend/port/sysv_sema.c | 21 ++-
src/backend/port/win32_sema.c | 11 +-
src/backend/storage/ipc/dsm.c | 64 +++++----
src/backend/storage/ipc/dsm_registry.c | 36 ++---
src/backend/storage/ipc/ipci.c | 28 ----
src/backend/storage/ipc/latch.c | 8 +-
src/backend/storage/ipc/pmsignal.c | 51 ++++---
src/backend/storage/ipc/procarray.c | 110 +++++++-------
src/backend/storage/ipc/procsignal.c | 64 ++++-----
src/backend/storage/ipc/sinvaladt.c | 38 ++---
src/backend/storage/lmgr/proc.c | 191 +++++++++++++------------
src/backend/utils/hash/dynahash.c | 3 +-
src/include/access/transam.h | 2 -
src/include/storage/dsm.h | 3 -
src/include/storage/dsm_registry.h | 2 -
src/include/storage/pg_sema.h | 6 +-
src/include/storage/pmsignal.h | 2 -
src/include/storage/proc.h | 2 -
src/include/storage/procarray.h | 2 -
src/include/storage/procsignal.h | 3 -
src/include/storage/sinvaladt.h | 2 -
src/include/storage/subsystemlist.h | 17 ++-
25 files changed, 344 insertions(+), 381 deletions(-)
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index d468c9774b3..ab1cbd67bac 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -282,7 +282,7 @@ TwoPhaseShmemInit(void)
gxacts[i].next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = &gxacts[i];
- /* associate it with a PGPROC assigned by InitProcGlobal */
+ /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
}
}
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 1441a051773..dc5e32d86f3 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -23,6 +23,7 @@
#include "postmaster/autovacuum.h"
#include "storage/pmsignal.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
@@ -30,35 +31,25 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+static void VarsupShmemRequest(void *arg);
+
/* pointer to variables struct in shared memory */
TransamVariablesData *TransamVariables = NULL;
+const ShmemCallbacks VarsupShmemCallbacks = {
+ .request_fn = VarsupShmemRequest,
+};
/*
- * Initialization of shared memory for TransamVariables.
+ * Request shared memory for TransamVariables.
*/
-Size
-VarsupShmemSize(void)
-{
- return sizeof(TransamVariablesData);
-}
-
-void
-VarsupShmemInit(void)
+static void
+VarsupShmemRequest(void *arg)
{
- bool found;
-
- /* Initialize our shared state struct */
- TransamVariables = ShmemInitStruct("TransamVariables",
- sizeof(TransamVariablesData),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- memset(TransamVariables, 0, sizeof(TransamVariablesData));
- }
- else
- Assert(found);
+ ShmemRequestStruct(.name = "TransamVariables",
+ .size = sizeof(TransamVariablesData),
+ .ptr = (void **) &TransamVariables,
+ );
}
/*
diff --git a/src/backend/port/posix_sema.c b/src/backend/port/posix_sema.c
index 40205b7d400..53e4a7a5c38 100644
--- a/src/backend/port/posix_sema.c
+++ b/src/backend/port/posix_sema.c
@@ -159,22 +159,24 @@ PosixSemaphoreKill(sem_t *sem)
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
#ifdef USE_NAMED_POSIX_SEMAPHORES
/* No shared memory needed in this case */
- return 0;
#else
/* Need a PGSemaphoreData per semaphore */
- return mul_size(maxSemas, sizeof(PGSemaphoreData));
+ ShmemRequestStruct(.name = "Semaphores",
+ .size = mul_size(maxSemas, sizeof(PGSemaphoreData)),
+ .ptr = (void **) &sharedSemas,
+ );
#endif
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* This is called during postmaster start or shared memory reinitialization.
* It should do whatever is needed to be able to support up to maxSemas
@@ -193,10 +195,9 @@ PGSemaphoreShmemSize(int maxSemas)
* we don't have to expose the counters to other processes.)
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
struct stat statbuf;
- bool found;
/*
* We use the data directory's inode number to seed the search for free
@@ -214,11 +215,6 @@ PGReserveSemaphores(int maxSemas)
mySemPointers = (sem_t **) malloc(maxSemas * sizeof(sem_t *));
if (mySemPointers == NULL)
elog(PANIC, "out of memory");
-#else
-
- sharedSemas = (PGSemaphore)
- ShmemInitStruct("Semaphores", PGSemaphoreShmemSize(maxSemas), &found);
- Assert(!found);
#endif
numSems = 0;
diff --git a/src/backend/port/sysv_sema.c b/src/backend/port/sysv_sema.c
index 4b2bf84072f..98d99515043 100644
--- a/src/backend/port/sysv_sema.c
+++ b/src/backend/port/sysv_sema.c
@@ -301,16 +301,20 @@ IpcSemaphoreCreate(int numSems)
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
- return mul_size(maxSemas, sizeof(PGSemaphoreData));
+ /* Need a PGSemaphoreData per semaphore */
+ ShmemRequestStruct(.name = "Semaphores",
+ .size = mul_size(maxSemas, sizeof(PGSemaphoreData)),
+ .ptr = (void **) &sharedSemas,
+ );
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* This is called during postmaster start or shared memory reinitialization.
* It should do whatever is needed to be able to support up to maxSemas
@@ -327,10 +331,9 @@ PGSemaphoreShmemSize(int maxSemas)
* have clobbered.)
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
struct stat statbuf;
- bool found;
/*
* We use the data directory's inode number to seed the search for free
@@ -344,10 +347,6 @@ PGReserveSemaphores(int maxSemas)
errmsg("could not stat data directory \"%s\": %m",
DataDir)));
- sharedSemas = (PGSemaphore)
- ShmemInitStruct("Semaphores", PGSemaphoreShmemSize(maxSemas), &found);
- Assert(!found);
-
numSharedSemas = 0;
maxSharedSemas = maxSemas;
diff --git a/src/backend/port/win32_sema.c b/src/backend/port/win32_sema.c
index ba97c9b2d64..a3202554769 100644
--- a/src/backend/port/win32_sema.c
+++ b/src/backend/port/win32_sema.c
@@ -25,17 +25,16 @@ static void ReleaseSemaphores(int code, Datum arg);
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
/* No shared memory needed on Windows */
- return 0;
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* In the Win32 implementation, we acquire semaphores on-demand; the
* maxSemas parameter is just used to size the array that keeps track of
@@ -44,7 +43,7 @@ PGSemaphoreShmemSize(int maxSemas)
* process exits.
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
mySemSet = (HANDLE *) malloc(maxSemas * sizeof(HANDLE));
if (mySemSet == NULL)
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 6a5b16392f7..8b69df4ff26 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -43,6 +43,7 @@
#include "storage/lwlock.h"
#include "storage/pg_shmem.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/freepage.h"
#include "utils/memutils.h"
#include "utils/resowner.h"
@@ -109,6 +110,15 @@ static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
static void *dsm_main_space_begin = NULL;
+static size_t dsm_main_space_size;
+
+static void dsm_main_space_request(void *arg);
+static void dsm_main_space_init(void *arg);
+
+const ShmemCallbacks dsm_shmem_callbacks = {
+ .request_fn = dsm_main_space_request,
+ .init_fn = dsm_main_space_init,
+};
/*
* List of dynamic shared memory segments used by this backend.
@@ -464,42 +474,40 @@ dsm_set_control_handle(dsm_handle h)
#endif
/*
- * Reserve some space in the main shared memory segment for DSM segments.
+ * Reserve space in the main shared memory segment for DSM segments.
*/
-size_t
-dsm_estimate_size(void)
+static void
+dsm_main_space_request(void *arg)
{
- return 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+ dsm_main_space_size = 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+
+ if (dsm_main_space_size == 0)
+ return;
+
+ ShmemRequestStruct(.name = "Preallocated DSM",
+ .size = dsm_main_space_size,
+ .ptr = &dsm_main_space_begin,
+ );
}
-/*
- * Initialize space in the main shared memory segment for DSM segments.
- */
-void
-dsm_shmem_init(void)
+static void
+dsm_main_space_init(void *arg)
{
- size_t size = dsm_estimate_size();
- bool found;
+ FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
+ size_t first_page = 0;
+ size_t pages;
- if (size == 0)
+ if (dsm_main_space_size == 0)
return;
- dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, &found);
- if (!found)
- {
- FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
- size_t first_page = 0;
- size_t pages;
-
- /* Reserve space for the FreePageManager. */
- while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
- ++first_page;
-
- /* Initialize it and give it all the rest of the space. */
- FreePageManagerInitialize(fpm, dsm_main_space_begin);
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ /* Reserve space for the FreePageManager. */
+ while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
+ ++first_page;
+
+ /* Initialize it and give it all the rest of the space. */
+ FreePageManagerInitialize(fpm, dsm_main_space_begin);
+ pages = (dsm_main_space_size / FPM_PAGE_SIZE) - first_page;
+ FreePageManagerPut(fpm, first_page, pages);
}
/*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 9bfcd616827..2b56977659b 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -45,6 +45,7 @@
#include "storage/dsm_registry.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/tuplestore.h"
@@ -57,6 +58,14 @@ typedef struct DSMRegistryCtxStruct
static DSMRegistryCtxStruct *DSMRegistryCtx;
+static void DSMRegistryShmemRequest(void *arg);
+static void DSMRegistryShmemInit(void *arg);
+
+const ShmemCallbacks DSMRegistryShmemCallbacks = {
+ .request_fn = DSMRegistryShmemRequest,
+ .init_fn = DSMRegistryShmemInit,
+};
+
typedef struct NamedDSMState
{
dsm_handle handle;
@@ -114,27 +123,20 @@ static const dshash_parameters dsh_params = {
static dsa_area *dsm_registry_dsa;
static dshash_table *dsm_registry_table;
-Size
-DSMRegistryShmemSize(void)
+static void
+DSMRegistryShmemRequest(void *arg)
{
- return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+ ShmemRequestStruct(.name = "DSM Registry Data",
+ .size = sizeof(DSMRegistryCtxStruct),
+ .ptr = (void **) &DSMRegistryCtx,
+ );
}
-void
-DSMRegistryShmemInit(void)
+static void
+DSMRegistryShmemInit(void *arg)
{
- bool found;
-
- DSMRegistryCtx = (DSMRegistryCtxStruct *)
- ShmemInitStruct("DSM Registry Data",
- DSMRegistryShmemSize(),
- &found);
-
- if (!found)
- {
- DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
- DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
- }
+ DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+ DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index de65a9ef33c..4f707158303 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -20,7 +20,6 @@
#include "access/nbtree.h"
#include "access/subtrans.h"
#include "access/syncscan.h"
-#include "access/transam.h"
#include "access/twophase.h"
#include "access/xlogprefetcher.h"
#include "access/xlogrecovery.h"
@@ -42,16 +41,11 @@
#include "storage/aio_subsys.h"
#include "storage/bufmgr.h"
#include "storage/dsm.h"
-#include "storage/dsm_registry.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
-#include "storage/pmsignal.h"
#include "storage/predicate.h"
#include "storage/proc.h"
-#include "storage/procarray.h"
-#include "storage/procsignal.h"
#include "storage/shmem_internal.h"
-#include "storage/sinvaladt.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
@@ -105,14 +99,10 @@ CalculateShmemSize(void)
size = add_size(size, ShmemGetRequestedSize());
/* legacy subsystems */
- size = add_size(size, dsm_estimate_size());
- size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
- size = add_size(size, ProcGlobalShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, VarsupShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
size = add_size(size, CLOGShmemSize());
@@ -121,11 +111,7 @@ CalculateShmemSize(void)
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
- size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, SharedInvalShmemSize());
- size = add_size(size, PMSignalShmemSize());
- size = add_size(size, ProcSignalShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
size = add_size(size, ReplicationSlotsShmemSize());
@@ -278,13 +264,9 @@ RegisterBuiltinShmemCallbacks(void)
static void
CreateOrAttachShmemStructs(void)
{
- dsm_shmem_init();
- DSMRegistryShmemInit();
-
/*
* Set up xlog, clog, and buffers
*/
- VarsupShmemInit();
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
@@ -307,23 +289,13 @@ CreateOrAttachShmemStructs(void)
/*
* Set up process table
*/
- if (!IsUnderPostmaster)
- InitProcGlobal();
- ProcArrayShmemInit();
BackendStatusShmemInit();
TwoPhaseShmemInit();
BackgroundWorkerShmemInit();
- /*
- * Set up shared-inval messaging
- */
- SharedInvalShmemInit();
-
/*
* Set up interprocess signaling mechanisms
*/
- PMSignalShmemInit();
- ProcSignalShmemInit();
CheckpointerShmemInit();
AutoVacuumShmemInit();
ReplicationSlotsShmemInit();
diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 8537e9fef2d..7d4f4cf32bb 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -80,10 +80,10 @@ InitLatch(Latch *latch)
* current process.
*
* InitSharedLatch needs to be called in postmaster before forking child
- * processes, usually right after allocating the shared memory block
- * containing the latch with ShmemInitStruct. (The Unix implementation
- * doesn't actually require that, but the Windows one does.) Because of
- * this restriction, we have no concurrency issues to worry about here.
+ * processes, usually right after initializing the shared memory block
+ * containing the latch. (The Unix implementation doesn't actually require
+ * that, but the Windows one does.) Because of this restriction, we have no
+ * concurrency issues to worry about here.
*
* Note that other handles created in this module are never marked as
* inheritable. Thus we do not need to worry about cleaning up child
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 4618820b337..bdad5fdd043 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -27,6 +27,7 @@
#include "storage/ipc.h"
#include "storage/pmsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
@@ -83,6 +84,14 @@ struct PMSignalData
/* PMSignalState pointer is valid in both postmaster and child processes */
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+static void PMSignalShmemRequest(void *);
+static void PMSignalShmemInit(void *);
+
+const ShmemCallbacks PMSignalShmemCallbacks = {
+ .request_fn = PMSignalShmemRequest,
+ .init_fn = PMSignalShmemInit,
+};
+
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
* postmaster. Postmaster keeps a local copy so that it doesn't need to
@@ -123,39 +132,29 @@ postmaster_death_handler(SIGNAL_ARGS)
static void MarkPostmasterChildInactive(int code, Datum arg);
/*
- * PMSignalShmemSize
- * Compute space needed for pmsignal.c's shared memory
+ * PMSignalShmemRequest - Register pmsignal.c's shared memory needs
*/
-Size
-PMSignalShmemSize(void)
+static void
+PMSignalShmemRequest(void *arg)
{
- Size size;
+ size_t size;
- size = offsetof(PMSignalData, PMChildFlags);
- size = add_size(size, mul_size(MaxLivePostmasterChildren(),
- sizeof(sig_atomic_t)));
+ num_child_flags = MaxLivePostmasterChildren();
- return size;
+ size = add_size(offsetof(PMSignalData, PMChildFlags),
+ mul_size(num_child_flags, sizeof(sig_atomic_t)));
+ ShmemRequestStruct(.name = "PMSignalState",
+ .size = size,
+ .ptr = (void **) &PMSignalState,
+ );
}
-/*
- * PMSignalShmemInit - initialize during shared-memory creation
- */
-void
-PMSignalShmemInit(void)
+static void
+PMSignalShmemInit(void *arg)
{
- bool found;
-
- PMSignalState = (PMSignalData *)
- ShmemInitStruct("PMSignalState", PMSignalShmemSize(), &found);
-
- if (!found)
- {
- /* initialize all flags to zeroes */
- MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemSize());
- num_child_flags = MaxLivePostmasterChildren();
- PMSignalState->num_child_flags = num_child_flags;
- }
+ Assert(PMSignalState);
+ Assert(num_child_flags > 0);
+ PMSignalState->num_child_flags = num_child_flags;
}
/*
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index cc207cb56e3..f540bb6b23f 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -61,6 +61,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
@@ -103,6 +104,18 @@ typedef struct ProcArrayStruct
int pgprocnos[FLEXIBLE_ARRAY_MEMBER];
} ProcArrayStruct;
+static void ProcArrayShmemRequest(void *arg);
+static void ProcArrayShmemInit(void *arg);
+static void ProcArrayShmemAttach(void *arg);
+
+static ProcArrayStruct *procArray;
+
+const struct ShmemCallbacks ProcArrayShmemCallbacks = {
+ .request_fn = ProcArrayShmemRequest,
+ .init_fn = ProcArrayShmemInit,
+ .attach_fn = ProcArrayShmemAttach,
+};
+
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -269,9 +282,6 @@ typedef enum KAXCompressReason
KAX_STARTUP_PROCESS_IDLE, /* startup process is about to sleep */
} KAXCompressReason;
-
-static ProcArrayStruct *procArray;
-
static PGPROC *allProcs;
/*
@@ -282,8 +292,11 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
+
static TransactionId *KnownAssignedXids;
+
static bool *KnownAssignedXidsValid;
+
static TransactionId latestObservedXid = InvalidTransactionId;
/*
@@ -374,19 +387,13 @@ static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
static void GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons);
/*
- * Report shared-memory space needed by ProcArrayShmemInit
+ * Register the shared PGPROC array during postmaster startup.
*/
-Size
-ProcArrayShmemSize(void)
+static void
+ProcArrayShmemRequest(void *arg)
{
- Size size;
-
- /* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, pgprocnos);
- size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
-
/*
* During Hot Standby processing we have a data structure called
* KnownAssignedXids, created in shared memory. Local data structures are
@@ -405,64 +412,49 @@ ProcArrayShmemSize(void)
if (EnableHotStandby)
{
- size = add_size(size,
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS));
- size = add_size(size,
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS));
+ ShmemRequestStruct(.name = "KnownAssignedXids",
+ .size = mul_size(sizeof(TransactionId), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXids,
+ );
+
+ ShmemRequestStruct(.name = "KnownAssignedXidsValid",
+ .size = mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXidsValid,
+ );
}
- return size;
+ /* Register the ProcArray shared structure */
+ ShmemRequestStruct(.name = "Proc Array",
+ .size = add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int), PROCARRAY_MAXPROCS)),
+ .ptr = (void **) &procArray,
+ );
}
/*
* Initialize the shared PGPROC array during postmaster startup.
*/
-void
-ProcArrayShmemInit(void)
+static void
+ProcArrayShmemInit(void *arg)
{
- bool found;
-
- /* Create or attach to the ProcArray shared structure */
- procArray = (ProcArrayStruct *)
- ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, pgprocnos),
- mul_size(sizeof(int),
- PROCARRAY_MAXPROCS)),
- &found);
-
- if (!found)
- {
- /*
- * We're the first - initialize.
- */
- procArray->numProcs = 0;
- procArray->maxProcs = PROCARRAY_MAXPROCS;
- procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
- procArray->numKnownAssignedXids = 0;
- procArray->tailKnownAssignedXids = 0;
- procArray->headKnownAssignedXids = 0;
- procArray->lastOverflowedXid = InvalidTransactionId;
- procArray->replication_slot_xmin = InvalidTransactionId;
- procArray->replication_slot_catalog_xmin = InvalidTransactionId;
- TransamVariables->xactCompletionCount = 1;
- }
+ procArray->numProcs = 0;
+ procArray->maxProcs = PROCARRAY_MAXPROCS;
+ procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
+ procArray->numKnownAssignedXids = 0;
+ procArray->tailKnownAssignedXids = 0;
+ procArray->headKnownAssignedXids = 0;
+ procArray->lastOverflowedXid = InvalidTransactionId;
+ procArray->replication_slot_xmin = InvalidTransactionId;
+ procArray->replication_slot_catalog_xmin = InvalidTransactionId;
+ TransamVariables->xactCompletionCount = 1;
allProcs = ProcGlobal->allProcs;
+}
- /* Create or attach to the KnownAssignedXids arrays too, if needed */
- if (EnableHotStandby)
- {
- KnownAssignedXids = (TransactionId *)
- ShmemInitStruct("KnownAssignedXids",
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- KnownAssignedXidsValid = (bool *)
- ShmemInitStruct("KnownAssignedXidsValid",
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- }
+static void
+ProcArrayShmemAttach(void *arg)
+{
+ allProcs = ProcGlobal->allProcs;
}
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index f1ab3aa3fe0..adebf0e7898 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -33,6 +33,7 @@
#include "storage/shmem.h"
#include "storage/sinval.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -106,7 +107,16 @@ struct ProcSignalHeader
#define BARRIER_CLEAR_BIT(flags, type) \
((flags) &= ~(((uint32) 1) << (uint32) (type)))
+static void ProcSignalShmemRequest(void *arg);
+static void ProcSignalShmemInit(void *arg);
+
+const ShmemCallbacks ProcSignalShmemCallbacks = {
+ .request_fn = ProcSignalShmemRequest,
+ .init_fn = ProcSignalShmemInit,
+};
+
NON_EXEC_STATIC ProcSignalHeader *ProcSignal = NULL;
+
static ProcSignalSlot *MyProcSignalSlot = NULL;
static bool CheckProcSignal(ProcSignalReason reason);
@@ -114,51 +124,39 @@ static void CleanupProcSignalState(int status, Datum arg);
static void ResetProcSignalBarrierBits(uint32 flags);
/*
- * ProcSignalShmemSize
- * Compute space needed for ProcSignal's shared memory
+ * ProcSignalShmemRequest
+ * Register ProcSignal's shared memory needs at postmaster startup
*/
-Size
-ProcSignalShmemSize(void)
+static void
+ProcSignalShmemRequest(void *arg)
{
Size size;
size = mul_size(NumProcSignalSlots, sizeof(ProcSignalSlot));
size = add_size(size, offsetof(ProcSignalHeader, psh_slot));
- return size;
+
+ ShmemRequestStruct(.name = "ProcSignal",
+ .size = size,
+ .ptr = (void **) &ProcSignal,
+ );
}
-/*
- * ProcSignalShmemInit
- * Allocate and initialize ProcSignal's shared memory
- */
-void
-ProcSignalShmemInit(void)
+static void
+ProcSignalShmemInit(void *arg)
{
- Size size = ProcSignalShmemSize();
- bool found;
+ pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
- ProcSignal = (ProcSignalHeader *)
- ShmemInitStruct("ProcSignal", size, &found);
-
- /* If we're first, initialize. */
- if (!found)
+ for (int i = 0; i < NumProcSignalSlots; ++i)
{
- int i;
-
- pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
+ ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
- for (i = 0; i < NumProcSignalSlots; ++i)
- {
- ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
-
- SpinLockInit(&slot->pss_mutex);
- pg_atomic_init_u32(&slot->pss_pid, 0);
- slot->pss_cancel_key_len = 0;
- MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
- pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
- pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
- ConditionVariableInit(&slot->pss_barrierCV);
- }
+ SpinLockInit(&slot->pss_mutex);
+ pg_atomic_init_u32(&slot->pss_pid, 0);
+ slot->pss_cancel_key_len = 0;
+ MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
+ pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
+ pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
+ ConditionVariableInit(&slot->pss_barrierCV);
}
}
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index a7a7cc4f0a9..37a21ffaf1a 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -25,6 +25,7 @@
#include "storage/shmem.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
/*
* Conceptually, the shared cache invalidation messages are stored in an
@@ -205,6 +206,14 @@ typedef struct SISeg
static SISeg *shmInvalBuffer; /* pointer to the shared inval buffer */
+static void SharedInvalShmemRequest(void *arg);
+static void SharedInvalShmemInit(void *arg);
+
+const ShmemCallbacks SharedInvalShmemCallbacks = {
+ .request_fn = SharedInvalShmemRequest,
+ .init_fn = SharedInvalShmemInit,
+};
+
static LocalTransactionId nextLocalTransactionId;
@@ -212,10 +221,11 @@ static void CleanupInvalidationState(int status, Datum arg);
/*
- * SharedInvalShmemSize --- return shared-memory space needed
+ * SharedInvalShmemRequest
+ * Register shared memory needs for the SI message buffer
*/
-Size
-SharedInvalShmemSize(void)
+static void
+SharedInvalShmemRequest(void *arg)
{
Size size;
@@ -223,26 +233,18 @@ SharedInvalShmemSize(void)
size = add_size(size, mul_size(sizeof(ProcState), NumProcStateSlots)); /* procState */
size = add_size(size, mul_size(sizeof(int), NumProcStateSlots)); /* pgprocnos */
- return size;
+ ShmemRequestStruct(.name = "shmInvalBuffer",
+ .size = size,
+ .ptr = (void **) &shmInvalBuffer,
+ );
}
-/*
- * SharedInvalShmemInit
- * Create and initialize the SI message buffer
- */
-void
-SharedInvalShmemInit(void)
+static void
+SharedInvalShmemInit(void *arg)
{
int i;
- bool found;
-
- /* Allocate space in shared memory */
- shmInvalBuffer = (SISeg *)
- ShmemInitStruct("shmInvalBuffer", SharedInvalShmemSize(), &found);
- if (found)
- return;
- /* Clear message counters, save size of procState array, init spinlock */
+ /* Clear message counters, init spinlock */
shmInvalBuffer->minMsgNum = 0;
shmInvalBuffer->maxMsgNum = 0;
shmInvalBuffer->nextThreshold = CLEANUP_MIN;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 9b880a6af65..a05c55b534e 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -52,6 +52,7 @@
#include "storage/procsignal.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
@@ -70,9 +71,23 @@ PGPROC *MyProc = NULL;
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
+static void *AllProcsShmemPtr;
+static void *FastPathLockArrayShmemPtr;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
PGPROC *PreparedXactProcs = NULL;
+static void ProcGlobalShmemRequest(void *arg);
+static void ProcGlobalShmemInit(void *arg);
+
+const ShmemCallbacks ProcGlobalShmemCallbacks = {
+ .request_fn = ProcGlobalShmemRequest,
+ .init_fn = ProcGlobalShmemInit,
+};
+
+static uint32 TotalProcs;
+static size_t ProcGlobalAllProcsShmemSize;
+static size_t FastPathLockArrayShmemSize;
+
/* Is a deadlock check pending? */
static volatile sig_atomic_t got_deadlock_timeout;
@@ -83,32 +98,12 @@ static DeadLockState CheckDeadLock(void);
/*
- * Report shared-memory space needed by PGPROC.
+ * Calculate shared-memory space needed by Fast-Path locks.
*/
static Size
-PGProcShmemSize(void)
+CalculateFastPathLockShmemSize(void)
{
Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
-
- size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
-
- return size;
-}
-
-/*
- * Report shared-memory space needed by Fast-Path locks.
- */
-static Size
-FastPathLockShmemSize(void)
-{
- Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
Size fpLockBitsSize,
fpRelIdSize;
@@ -128,26 +123,7 @@ FastPathLockShmemSize(void)
}
/*
- * Report shared-memory space needed by InitProcGlobal.
- */
-Size
-ProcGlobalShmemSize(void)
-{
- Size size = 0;
-
- /* ProcGlobal */
- size = add_size(size, sizeof(PROC_HDR));
- size = add_size(size, sizeof(slock_t));
-
- size = add_size(size, PGSemaphoreShmemSize(ProcGlobalSemas()));
- size = add_size(size, PGProcShmemSize());
- size = add_size(size, FastPathLockShmemSize());
-
- return size;
-}
-
-/*
- * Report number of semaphores needed by InitProcGlobal.
+ * Report number of semaphores needed by ProcGlobalShmemInit.
*/
int
ProcGlobalSemas(void)
@@ -160,7 +136,67 @@ ProcGlobalSemas(void)
}
/*
- * InitProcGlobal -
+ * ProcGlobalShmemRequest
+ * Register shared memory needs.
+ *
+ * This is called during postmaster or standalone backend startup, and also
+ * during backend startup in EXEC_BACKEND mode.
+ */
+static void
+ProcGlobalShmemRequest(void *arg)
+{
+ Size size;
+
+ /*
+ * Reserve all the PGPROC structures we'll need. There are six separate
+ * consumers: (1) normal backends, (2) autovacuum workers and special
+ * workers, (3) background workers, (4) walsenders, (5) auxiliary
+ * processes, and (6) prepared transactions. (For largely-historical
+ * reasons, we combine autovacuum and special workers into one category
+ * with a single freelist.) Each PGPROC structure is dedicated to exactly
+ * one of these purposes, and they do not move between groups.
+ */
+ TotalProcs =
+ add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
+
+ size = 0;
+ size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
+ ProcGlobalAllProcsShmemSize = size;
+ ShmemRequestStruct(.name = "PGPROC structures",
+ .size = ProcGlobalAllProcsShmemSize,
+ .ptr = &AllProcsShmemPtr,
+ );
+
+ if (!IsUnderPostmaster)
+ size = FastPathLockArrayShmemSize = CalculateFastPathLockShmemSize();
+ else
+ size = SHMEM_ATTACH_UNKNOWN_SIZE;
+ ShmemRequestStruct(.name = "Fast-Path Lock Array",
+ .size = size,
+ .ptr = &FastPathLockArrayShmemPtr,
+ );
+
+ /*
+ * ProcGlobal is registered here in .ptr as usual, but it needs to be
+ * propagated specially in EXEC_BACKEND mode, because ProcGlobal needs to
+ * be accessed early at backend startup, before ShmemAttachRequested() has
+ * been called.
+ */
+ ShmemRequestStruct(.name = "Proc Header",
+ .size = sizeof(PROC_HDR),
+ .ptr = (void **) &ProcGlobal,
+ );
+
+ /* Let the semaphore implementation register its shared memory needs */
+ PGSemaphoreShmemRequest(ProcGlobalSemas());
+}
+
+
+/*
+ * ProcGlobalShmemInit -
* Initialize the global process table during postmaster or standalone
* backend startup.
*
@@ -179,36 +215,23 @@ ProcGlobalSemas(void)
* Another reason for creating semaphores here is that the semaphore
* implementation typically requires us to create semaphores in the
* postmaster, not in backends.
- *
- * Note: this is NOT called by individual backends under a postmaster,
- * not even in the EXEC_BACKEND case. The ProcGlobal and AuxiliaryProcs
- * pointers must be propagated specially for EXEC_BACKEND operation.
*/
-void
-InitProcGlobal(void)
+static void
+ProcGlobalShmemInit(void *arg)
{
+ char *ptr;
+ size_t requestSize;
PGPROC *procs;
int i,
j;
- bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
/* Used for setup of per-backend fast-path slots. */
char *fpPtr,
*fpEndPtr PG_USED_FOR_ASSERTS_ONLY;
Size fpLockBitsSize,
fpRelIdSize;
- Size requestSize;
- char *ptr;
- /* Create the ProcGlobal shared structure */
- ProcGlobal = (PROC_HDR *)
- ShmemInitStruct("Proc Header", sizeof(PROC_HDR), &found);
- Assert(!found);
-
- /*
- * Initialize the data structures.
- */
+ Assert(ProcGlobal);
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
SpinLockInit(&ProcGlobal->freeProcsLock);
dlist_init(&ProcGlobal->freeProcs);
@@ -221,23 +244,11 @@ InitProcGlobal(void)
pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PROC_NUMBER);
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
- /*
- * Create and initialize all the PGPROC structures we'll need. There are
- * six separate consumers: (1) normal backends, (2) autovacuum workers and
- * special workers, (3) background workers, (4) walsenders, (5) auxiliary
- * processes, and (6) prepared transactions. (For largely-historical
- * reasons, we combine autovacuum and special workers into one category
- * with a single freelist.) Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
- */
- requestSize = PGProcShmemSize();
-
- ptr = ShmemInitStruct("PGPROC structures",
- requestSize,
- &found);
-
+ ptr = AllProcsShmemPtr;
+ requestSize = ProcGlobalAllProcsShmemSize;
MemSet(ptr, 0, requestSize);
+ /* Carve out the allProcs array from the shared memory area */
procs = (PGPROC *) ptr;
ptr = ptr + TotalProcs * sizeof(PGPROC);
@@ -246,7 +257,7 @@ InitProcGlobal(void)
ProcGlobal->allProcCount = MaxBackends + NUM_AUXILIARY_PROCS;
/*
- * Allocate arrays mirroring PGPROC fields in a dense manner. See
+ * Carve out arrays mirroring PGPROC fields in a dense manner. See
* PROC_HDR.
*
* XXX: It might make sense to increase padding for these arrays, given
@@ -261,30 +272,26 @@ InitProcGlobal(void)
ProcGlobal->statusFlags = (uint8 *) ptr;
ptr = ptr + (TotalProcs * sizeof(*ProcGlobal->statusFlags));
- /* make sure wer didn't overflow */
+ /* make sure we didn't overflow */
Assert((ptr > (char *) procs) && (ptr <= (char *) procs + requestSize));
/*
- * Allocate arrays for fast-path locks. Those are variable-length, so
+ * Initialize arrays for fast-path locks. Those are variable-length, so
* can't be included in PGPROC directly. We allocate a separate piece of
* shared memory and then divide that between backends.
*/
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- requestSize = FastPathLockShmemSize();
-
- fpPtr = ShmemInitStruct("Fast-Path Lock Array",
- requestSize,
- &found);
-
- MemSet(fpPtr, 0, requestSize);
+ fpPtr = FastPathLockArrayShmemPtr;
+ requestSize = FastPathLockArrayShmemSize;
+ memset(fpPtr, 0, requestSize);
/* For asserts checking we did not overflow. */
fpEndPtr = fpPtr + requestSize;
- /* Reserve space for semaphores. */
- PGReserveSemaphores(ProcGlobalSemas());
+ /* Initialize semaphores */
+ PGSemaphoreInit(ProcGlobalSemas());
for (i = 0; i < TotalProcs; i++)
{
@@ -405,7 +412,7 @@ InitProcess(void)
/*
* Decide which list should supply our PGPROC. This logic must match the
- * way the freelists were constructed in InitProcGlobal().
+ * way the freelists were constructed in ProcGlobalShmemInit().
*/
if (AmAutoVacuumWorkerProcess() || AmSpecialWorkerProcess())
procgloballist = &ProcGlobal->autovacFreeProcs;
@@ -460,7 +467,7 @@ InitProcess(void)
/*
* Initialize all fields of MyProc, except for those previously
- * initialized by InitProcGlobal.
+ * initialized by ProcGlobalShmemInit.
*/
dlist_node_init(&MyProc->freeProcsLink);
MyProc->waitStatus = PROC_WAIT_STATUS_OK;
@@ -593,7 +600,7 @@ InitProcessPhase2(void)
* This is called by bgwriter and similar processes so that they will have a
* MyProc value that's real enough to let them wait for LWLocks. The PGPROC
* and sema that are assigned are one of the extra ones created during
- * InitProcGlobal.
+ * ProcGlobalShmemInit.
*
* Auxiliary processes are presently not expected to wait for real (lockmgr)
* locks, so we need not set up the deadlock checker. They are never added
@@ -662,7 +669,7 @@ InitAuxiliaryProcess(void)
/*
* Initialize all fields of MyProc, except for those previously
- * initialized by InitProcGlobal.
+ * initialized by ProcGlobalShmemInit.
*/
dlist_node_init(&MyProc->freeProcsLink);
MyProc->waitStatus = PROC_WAIT_STATUS_OK;
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index d49a7a92c64..81199edca86 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -338,7 +338,8 @@ string_compare(const char *key1, const char *key2, Size keysize)
* under info->hcxt rather than under TopMemoryContext; the default
* behavior is only suitable for session-lifespan hash tables.
* Other flags bits are special-purpose and seldom used, except for those
- * associated with shared-memory hash tables, for which see ShmemInitHash().
+ * associated with shared-memory hash tables, for which see
+ * ShmemRequestHash().
*
* Fields in *info are read only when the associated flags bit is set.
* It is not necessary to initialize other fields of *info.
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6fa91bfcdc0..55a4ab26b34 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -345,8 +345,6 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
-extern Size VarsupShmemSize(void);
-extern void VarsupShmemInit(void);
extern FullTransactionId GetNewTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
diff --git a/src/include/storage/dsm.h b/src/include/storage/dsm.h
index 407657df3ff..1bde71b4406 100644
--- a/src/include/storage/dsm.h
+++ b/src/include/storage/dsm.h
@@ -26,9 +26,6 @@ extern void dsm_postmaster_startup(PGShmemHeader *);
extern void dsm_backend_shutdown(void);
extern void dsm_detach_all(void);
-extern size_t dsm_estimate_size(void);
-extern void dsm_shmem_init(void);
-
#ifdef EXEC_BACKEND
extern void dsm_set_control_handle(dsm_handle h);
#endif
diff --git a/src/include/storage/dsm_registry.h b/src/include/storage/dsm_registry.h
index 506fae2c9ca..a2269c89f01 100644
--- a/src/include/storage/dsm_registry.h
+++ b/src/include/storage/dsm_registry.h
@@ -22,7 +22,5 @@ extern dsa_area *GetNamedDSA(const char *name, bool *found);
extern dshash_table *GetNamedDSHash(const char *name,
const dshash_parameters *params,
bool *found);
-extern Size DSMRegistryShmemSize(void);
-extern void DSMRegistryShmemInit(void);
#endif /* DSM_REGISTRY_H */
diff --git a/src/include/storage/pg_sema.h b/src/include/storage/pg_sema.h
index 66facc6907a..fe50ee505ba 100644
--- a/src/include/storage/pg_sema.h
+++ b/src/include/storage/pg_sema.h
@@ -37,11 +37,11 @@ typedef HANDLE PGSemaphore;
#endif
-/* Report amount of shared memory needed */
-extern Size PGSemaphoreShmemSize(int maxSemas);
+/* Request shared memory needed for semaphores */
+extern void PGSemaphoreShmemRequest(int maxSemas);
/* Module initialization (called during postmaster start or shmem reinit) */
-extern void PGReserveSemaphores(int maxSemas);
+extern void PGSemaphoreInit(int maxSemas);
/* Allocate a PGSemaphore structure with initial count 1 */
extern PGSemaphore PGSemaphoreCreate(void);
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index 206fb78f8a5..001e6eea61c 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -66,8 +66,6 @@ extern PGDLLIMPORT volatile PMSignalData *PMSignalState;
/*
* prototypes for functions in pmsignal.c
*/
-extern Size PMSignalShmemSize(void);
-extern void PMSignalShmemInit(void);
extern void SendPostmasterSignal(PMSignalReason reason);
extern bool CheckPostmasterSignal(PMSignalReason reason);
extern void SetQuitSignalReason(QuitSignalReason reason);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 22822fc68d7..3e1d1fad5f9 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -552,8 +552,6 @@ extern PGDLLIMPORT PGPROC *AuxiliaryProcs;
* Function Prototypes
*/
extern int ProcGlobalSemas(void);
-extern Size ProcGlobalShmemSize(void);
-extern void InitProcGlobal(void);
extern void InitProcess(void);
extern void InitProcessPhase2(void);
extern void InitAuxiliaryProcess(void);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index abdf021e66e..d718a5b542f 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -19,8 +19,6 @@
#include "utils/snapshot.h"
-extern Size ProcArrayShmemSize(void);
-extern void ProcArrayShmemInit(void);
extern void ProcArrayAdd(PGPROC *proc);
extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index cc4f26aa33d..7f855971b5a 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -67,9 +67,6 @@ typedef enum
/*
* prototypes for functions in procsignal.c
*/
-extern Size ProcSignalShmemSize(void);
-extern void ProcSignalShmemInit(void);
-
extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
ProcNumber procNumber);
diff --git a/src/include/storage/sinvaladt.h b/src/include/storage/sinvaladt.h
index 122dbcdf19f..208ea9d051e 100644
--- a/src/include/storage/sinvaladt.h
+++ b/src/include/storage/sinvaladt.h
@@ -27,8 +27,6 @@
/*
* prototypes for functions in sinvaladt.c
*/
-extern Size SharedInvalShmemSize(void);
-extern void SharedInvalShmemInit(void);
extern void SharedInvalBackendInit(bool sendOnly);
extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index f0cf01f5a85..d62c29f1361 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -27,4 +27,19 @@
*/
PG_SHMEM_SUBSYSTEM(LWLockCallbacks)
-/* TODO: nothing else for now */
+PG_SHMEM_SUBSYSTEM(dsm_shmem_callbacks)
+PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
+
+/* xlog, clog, and buffers */
+PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+
+/* process table */
+PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+
+/* shared-inval messaging */
+PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
+
+/* interprocess signaling mechanisms */
+PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
--
2.34.1
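A note on the subsystemlist.h hunk in this patch: the PG_SHMEM_SUBSYSTEM(...) entries read like an X-macro list, where a single list of callback structs drives a "request" pass followed by an "init" pass. The following is only a minimal sketch of that two-phase idea, not the patch's implementation. Every name in it (DemoCallbacks, DEMO_SUBSYSTEM_LIST, demo_request_struct, demo_startup) is hypothetical, the 16-byte rounding is an arbitrary choice for the demo, and calloc() stands in for the shared memory segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callbacks struct; it only mirrors the shape used in the diff. */
typedef struct DemoCallbacks
{
	void		(*request_fn) (void *arg);	/* phase 1: declare memory needs */
	void		(*init_fn) (void *arg);		/* phase 2: initialize the memory */
} DemoCallbacks;

/* One pending request: how many bytes, and where to publish the pointer. */
typedef struct DemoRequest
{
	size_t		size;
	void	  **ptr;
} DemoRequest;

#define DEMO_MAX_REQUESTS 8
#define DEMO_ALIGN(sz) (((sz) + 15) & ~(size_t) 15)	/* arbitrary demo alignment */

static DemoRequest demo_requests[DEMO_MAX_REQUESTS];
static int	demo_nrequests = 0;

/* Record a request; nothing is allocated yet. */
static void
demo_request_struct(size_t size, void **ptr)
{
	assert(demo_nrequests < DEMO_MAX_REQUESTS);
	demo_requests[demo_nrequests].size = DEMO_ALIGN(size);
	demo_requests[demo_nrequests].ptr = ptr;
	demo_nrequests++;
}

/* Two toy subsystems, each with a request and an init callback. */
static int *demo_counter;
static char *demo_flags;

static void counter_request(void *arg) { (void) arg; demo_request_struct(sizeof(int), (void **) &demo_counter); }
static void counter_init(void *arg) { (void) arg; *demo_counter = 42; }
static void flags_request(void *arg) { (void) arg; demo_request_struct(16, (void **) &demo_flags); }
static void flags_init(void *arg) { (void) arg; memset(demo_flags, 1, 16); }

static const DemoCallbacks CounterCallbacks = {.request_fn = counter_request, .init_fn = counter_init};
static const DemoCallbacks FlagsCallbacks = {.request_fn = flags_request, .init_fn = flags_init};

/* X-macro: one list, expanded here into an array of callback pointers. */
#define DEMO_SUBSYSTEM_LIST \
	DEMO_SUBSYSTEM(CounterCallbacks) \
	DEMO_SUBSYSTEM(FlagsCallbacks)

#define DEMO_SUBSYSTEM(cb) &cb,
static const DemoCallbacks *const demo_subsystems[] = {DEMO_SUBSYSTEM_LIST};
#undef DEMO_SUBSYSTEM

#define DEMO_NSUBSYSTEMS (sizeof(demo_subsystems) / sizeof(demo_subsystems[0]))

/* Run both phases; returns the total number of bytes allocated. */
static size_t
demo_startup(void)
{
	size_t		total = 0;
	size_t		offset = 0;
	char	   *base;

	demo_nrequests = 0;

	/* Phase 1: every subsystem registers what it needs. */
	for (size_t i = 0; i < DEMO_NSUBSYSTEMS; i++)
		demo_subsystems[i]->request_fn(NULL);
	for (int i = 0; i < demo_nrequests; i++)
		total += demo_requests[i].size;

	/* One allocation for everything (never freed in this sketch). */
	base = calloc(1, total);
	assert(base != NULL);

	/* Publish each pointer, carving the block up in request order. */
	for (int i = 0; i < demo_nrequests; i++)
	{
		*demo_requests[i].ptr = base + offset;
		offset += demo_requests[i].size;
	}

	/* Phase 2: pointers are valid now, so init callbacks can run. */
	for (size_t i = 0; i < DEMO_NSUBSYSTEMS; i++)
		demo_subsystems[i]->init_fn(NULL);

	return total;
}
```

Calling demo_startup() once leaves demo_counter and demo_flags pointing into the same block. The point of the split is that an init callback can assume its pointer was already set, which is roughly what the Assert(ProcGlobal) at the top of ProcGlobalShmemInit() relies on.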
[text/x-patch] v20260405_2-0008-refactor-predicate.c-inline-SerialInit-to-.patch (3.6K, 9-v20260405_2-0008-refactor-predicate.c-inline-SerialInit-to-.patch)
From a2a21524661c43955b799323295bec40e9cd3323 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 19 Mar 2026 17:21:30 +0200
Subject: [PATCH v20260405 08/15] refactor predicate.c: inline SerialInit to
the caller
PredicateLockShmemInit() is currently very complicated. These
refactorings move it in a direction that is more natural with the new
shmem callbacks.
---
src/backend/storage/lmgr/predicate.c | 73 +++++++++++-----------------
1 file changed, 29 insertions(+), 44 deletions(-)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index e003fa5b107..13a6a4b93a6 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -444,7 +444,6 @@ static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
static int serial_errdetail_for_io_error(const void *opaque_data);
-static void SerialInit(void);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
@@ -809,48 +808,6 @@ SerialPagePrecedesLogicallyUnitTests(void)
}
#endif
-/*
- * Initialize for the tracking of old serializable committed xids.
- */
-static void
-SerialInit(void)
-{
- bool found;
-
- /*
- * Set up SLRU management of the pg_serial data.
- */
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
-#ifdef USE_ASSERT_CHECKING
- SerialPagePrecedesLogicallyUnitTests();
-#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
-
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
-
- Assert(found == IsUnderPostmaster);
- if (!found)
- {
- /*
- * Set control information to reflect empty SLRU.
- */
- LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
- serialControl->headPage = -1;
- serialControl->headXid = InvalidTransactionId;
- serialControl->tailXid = InvalidTransactionId;
- LWLockRelease(SerialControlLock);
- }
-}
-
/*
* GUC check_hook for serializable_buffers
*/
@@ -1355,7 +1312,35 @@ PredicateLockShmemInit(void)
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialInit();
+ SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
+ SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
+ SimpleLruInit(SerialSlruCtl, "serializable",
+ serializable_buffers, 0, "pg_serial",
+ LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
+ SYNC_HANDLER_NONE, false);
+#ifdef USE_ASSERT_CHECKING
+ SerialPagePrecedesLogicallyUnitTests();
+#endif
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
+
+ /*
+ * Create or attach to the SerialControl structure.
+ */
+ serialControl = (SerialControl)
+ ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
+
+ Assert(found == IsUnderPostmaster);
+ if (!found)
+ {
+ /*
+ * Set control information to reflect empty SLRU.
+ */
+ LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
+ serialControl->headPage = -1;
+ serialControl->headXid = InvalidTransactionId;
+ serialControl->tailXid = InvalidTransactionId;
+ LWLockRelease(SerialControlLock);
+ }
}
/*
--
2.34.1
[text/x-patch] v20260405_2-0010-Convert-SLRUs-to-use-the-new-interface.patch (84.8K, 10-v20260405_2-0010-Convert-SLRUs-to-use-the-new-interface.patch)
From 61dd62447c3f259dbe291ecde5dccbda39cbb289 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:32:45 +0300
Subject: [PATCH v20260405 10/15] Convert SLRUs to use the new interface
I replaced the old SimpleLruInit() function without a backwards
compatibility wrapper, because few extensions define their own SLRUs.
---
src/backend/access/transam/clog.c | 55 ++--
src/backend/access/transam/commit_ts.c | 85 +++---
src/backend/access/transam/multixact.c | 138 +++++----
src/backend/access/transam/slru.c | 366 ++++++++++++-----------
src/backend/access/transam/subtrans.c | 57 ++--
src/backend/commands/async.c | 115 ++++---
src/backend/storage/ipc/ipci.c | 16 -
src/backend/storage/ipc/shmem.c | 7 +
src/backend/storage/lmgr/predicate.c | 266 +++++++---------
src/backend/utils/activity/pgstat_slru.c | 1 +
src/include/access/clog.h | 2 -
src/include/access/commit_ts.h | 2 -
src/include/access/multixact.h | 2 -
src/include/access/slru.h | 112 ++++---
src/include/access/subtrans.h | 2 -
src/include/commands/async.h | 3 -
src/include/storage/predicate.h | 5 -
src/include/storage/shmem_internal.h | 1 +
src/include/storage/subsystemlist.h | 10 +
src/test/modules/test_slru/test_slru.c | 106 +++----
src/tools/pgindent/typedefs.list | 4 +-
21 files changed, 691 insertions(+), 664 deletions(-)
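A note on the call syntax used throughout this patch: calls such as SimpleLruRequest(.desc = ..., .name = ..., ) look like a variadic macro that wraps its arguments in a C99 compound literal with designated initializers. That is an assumption drawn from the call sites only; the macro definition is not part of this excerpt. The sketch below shows the general technique with hypothetical names (DemoSlruOpts, DemoSlruRequest, demo_slru_request_with_opts) and makes no claim about the real slru.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct; field names loosely echo the call sites. */
typedef struct DemoSlruOpts
{
	const char *name;
	const char *Dir;
	bool		long_segment_names;
	int			nslots;
	int			nlsns;
} DemoSlruOpts;

/* The most recent request, kept so the effect can be inspected. */
static DemoSlruOpts demo_last_request;

/* The real work happens in an ordinary function taking a struct pointer. */
static void
demo_slru_request_with_opts(const DemoSlruOpts *opts)
{
	demo_last_request = *opts;
}

/*
 * "Named argument" wrapper: the arguments become the initializer list of a
 * compound literal.  Fields that are not mentioned are zero-initialized,
 * and a trailing comma before the closing parenthesis is harmless.
 */
#define DemoSlruRequest(...) \
	demo_slru_request_with_opts(&(DemoSlruOpts){__VA_ARGS__})

static void
demo_register(void)
{
	DemoSlruRequest(.name = "transaction",
					.Dir = "pg_xact",
					.nslots = 16,
		);
}
```

After demo_register() runs, nlsns and long_segment_names are zero and false because they were omitted. A macro built this way would let a call site leave out fields it does not care about, as the commit_ts.c call in the diff does when it omits .nlsns. Note that compound literals are a C feature, so this form does not carry over to C++ callers unchanged.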
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index c654e0929b3..7cd1a56201f 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -43,6 +43,7 @@
#include "pg_trace.h"
#include "pgstat.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/wait_event.h"
@@ -106,13 +107,21 @@ TransactionIdToPage(TransactionId xid)
/*
* Link to shared-memory data structures for CLOG control
*/
-static SlruCtlData XactCtlData;
+static void CLOGShmemRequest(void *arg);
+static void CLOGShmemInit(void *arg);
+static bool CLOGPagePrecedes(int64 page1, int64 page2);
+static int clog_errdetail_for_io_error(const void *opaque_data);
-#define XactCtl (&XactCtlData)
+const ShmemCallbacks CLOGShmemCallbacks = {
+ .request_fn = CLOGShmemRequest,
+ .init_fn = CLOGShmemInit,
+};
+
+static SlruDesc XactSlruDesc;
+
+#define XactCtl (&XactSlruDesc)
-static bool CLOGPagePrecedes(int64 page1, int64 page2);
-static int clog_errdetail_for_io_error(const void *opaque_data);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXact,
Oid oldestXactDb);
static void TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
@@ -775,16 +784,10 @@ CLOGShmemBuffers(void)
}
/*
- * Initialization of shared memory for CLOG
+ * Register shared memory for CLOG
*/
-Size
-CLOGShmemSize(void)
-{
- return SimpleLruShmemSize(CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE);
-}
-
-void
-CLOGShmemInit(void)
+static void
+CLOGShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (transaction_buffers == 0)
@@ -806,12 +809,26 @@ CLOGShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(transaction_buffers != 0);
+ SimpleLruRequest(.desc = &XactSlruDesc,
+ .name = "transaction",
+ .Dir = "pg_xact",
+ .long_segment_names = false,
+
+ .nslots = CLOGShmemBuffers(),
+ .nlsns = CLOG_LSNS_PER_PAGE,
+
+ .sync_handler = SYNC_HANDLER_CLOG,
+ .PagePrecedes = CLOGPagePrecedes,
+ .errdetail_for_io_error = clog_errdetail_for_io_error,
- XactCtl->PagePrecedes = CLOGPagePrecedes;
- XactCtl->errdetail_for_io_error = clog_errdetail_for_io_error;
- SimpleLruInit(XactCtl, "transaction", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
- "pg_xact", LWTRANCHE_XACT_BUFFER,
- LWTRANCHE_XACT_SLRU, SYNC_HANDLER_CLOG, false);
+ .buffer_tranche_id = LWTRANCHE_XACT_BUFFER,
+ .bank_tranche_id = LWTRANCHE_XACT_SLRU,
+ );
+}
+
+static void
+CLOGShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(XactCtl, CLOG_XACTS_PER_PAGE);
}
@@ -827,7 +844,7 @@ check_transaction_buffers(int *newval, void **extra, GucSource source)
/*
* This func must be called ONCE on system install. It creates
* the initial CLOG segment. (The CLOG directory is assumed to
- * have been created by initdb, and CLOGShmemInit must have been
+ * have been created by initdb, and CLOGShmemInit must have been XXX
* called already.)
*/
void
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 36219dd13cc..2625cbf93bf 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -30,6 +30,7 @@
#include "funcapi.h"
#include "miscadmin.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/timestamp.h"
@@ -80,9 +81,19 @@ TransactionIdToCTsPage(TransactionId xid)
/*
* Link to shared-memory data structures for CommitTs control
*/
-static SlruCtlData CommitTsCtlData;
+static void CommitTsShmemRequest(void *arg);
+static void CommitTsShmemInit(void *arg);
+static bool CommitTsPagePrecedes(int64 page1, int64 page2);
+static int commit_ts_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks CommitTsShmemCallbacks = {
+ .request_fn = CommitTsShmemRequest,
+ .init_fn = CommitTsShmemInit,
+};
+
+static SlruDesc CommitTsSlruDesc;
-#define CommitTsCtl (&CommitTsCtlData)
+#define CommitTsCtl (&CommitTsSlruDesc)
/*
* We keep a cache of the last value set in shared memory.
@@ -104,6 +115,7 @@ typedef struct CommitTimestampShared
static CommitTimestampShared *commitTsShared;
+static void CommitTsShmemInit(void *arg);
/* GUC variable */
bool track_commit_timestamp;
@@ -114,8 +126,6 @@ static void SetXidCommitTsInPage(TransactionId xid, int nsubxids,
static void TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
ReplOriginId nodeid, int slotno);
static void error_commit_ts_disabled(void);
-static bool CommitTsPagePrecedes(int64 page1, int64 page2);
-static int commit_ts_errdetail_for_io_error(const void *opaque_data);
static void ActivateCommitTs(void);
static void DeactivateCommitTs(void);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXid);
@@ -512,24 +522,12 @@ CommitTsShmemBuffers(void)
}
/*
- * Shared memory sizing for CommitTs
+ * Register CommitTs shared memory needs at system startup (postmaster start
+ * or standalone backend)
*/
-Size
-CommitTsShmemSize(void)
-{
- return SimpleLruShmemSize(CommitTsShmemBuffers(), 0) +
- sizeof(CommitTimestampShared);
-}
-
-/*
- * Initialize CommitTs at system startup (postmaster start or standalone
- * backend)
- */
-void
-CommitTsShmemInit(void)
+static void
+CommitTsShmemRequest(void *arg)
{
- bool found;
-
/* If auto-tuning is requested, now is the time to do it */
if (commit_timestamp_buffers == 0)
{
@@ -550,31 +548,36 @@ CommitTsShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(commit_timestamp_buffers != 0);
+ SimpleLruRequest(.desc = &CommitTsSlruDesc,
+ .name = "commit_timestamp",
+ .Dir = "pg_commit_ts",
+ .long_segment_names = false,
- CommitTsCtl->PagePrecedes = CommitTsPagePrecedes;
- CommitTsCtl->errdetail_for_io_error = commit_ts_errdetail_for_io_error;
- SimpleLruInit(CommitTsCtl, "commit_timestamp", CommitTsShmemBuffers(), 0,
- "pg_commit_ts", LWTRANCHE_COMMITTS_BUFFER,
- LWTRANCHE_COMMITTS_SLRU,
- SYNC_HANDLER_COMMIT_TS,
- false);
- SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
+ .nslots = CommitTsShmemBuffers(),
- commitTsShared = ShmemInitStruct("CommitTs shared",
- sizeof(CommitTimestampShared),
- &found);
+ .PagePrecedes = CommitTsPagePrecedes,
+ .errdetail_for_io_error = commit_ts_errdetail_for_io_error,
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+ .sync_handler = SYNC_HANDLER_COMMIT_TS,
+ .buffer_tranche_id = LWTRANCHE_COMMITTS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_COMMITTS_SLRU,
+ );
- commitTsShared->xidLastCommit = InvalidTransactionId;
- TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
- commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
- commitTsShared->commitTsActive = false;
- }
- else
- Assert(found);
+ ShmemRequestStruct(.name = "CommitTs shared",
+ .size = sizeof(CommitTimestampShared),
+ .ptr = (void **) &commitTsShared,
+ );
+}
+
+static void
+CommitTsShmemInit(void *arg)
+{
+ commitTsShared->xidLastCommit = InvalidTransactionId;
+ TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
+ commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
+ commitTsShared->commitTsActive = false;
+
+ SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
}
/*
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 9f8d542c098..62d58da4abc 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -83,6 +83,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
#include "utils/lsyscache.h"
@@ -113,11 +114,16 @@ PreviousMultiXactId(MultiXactId multi)
/*
* Links to shared-memory data structures for MultiXact control
*/
-static SlruCtlData MultiXactOffsetCtlData;
-static SlruCtlData MultiXactMemberCtlData;
+static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
+static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
+static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
+static int MultiXactMemberIoErrorDetail(const void *opaque_data);
+
+static SlruDesc MultiXactOffsetSlruDesc;
+static SlruDesc MultiXactMemberSlruDesc;
-#define MultiXactOffsetCtl (&MultiXactOffsetCtlData)
-#define MultiXactMemberCtl (&MultiXactMemberCtlData)
+#define MultiXactOffsetCtl (&MultiXactOffsetSlruDesc)
+#define MultiXactMemberCtl (&MultiXactMemberSlruDesc)
/*
* MultiXact state shared across all backends. All this state is protected
@@ -220,6 +226,15 @@ static MultiXactStateData *MultiXactState;
static MultiXactId *OldestMemberMXactId;
static MultiXactId *OldestVisibleMXactId;
+static void MultiXactShmemRequest(void *arg);
+static void MultiXactShmemInit(void *arg);
+static void MultiXactShmemAttach(void *arg);
+
+const ShmemCallbacks MultiXactShmemCallbacks = {
+ .request_fn = MultiXactShmemRequest,
+ .init_fn = MultiXactShmemInit,
+ .attach_fn = MultiXactShmemAttach,
+};
static inline MultiXactId *
MyOldestMemberMXactIdSlot(void)
@@ -321,10 +336,6 @@ typedef struct MultiXactMemberSlruReadContext
MultiXactOffset offset;
} MultiXactMemberSlruReadContext;
-static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
-static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
-static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
-static int MultiXactMemberIoErrorDetail(const void *opaque_data);
static void ExtendMultiXactOffset(MultiXactId multi);
static void ExtendMultiXactMember(MultiXactOffset offset, int nmembers);
static void SetOldestOffset(void);
@@ -1747,80 +1758,81 @@ multixact_twophase_postabort(FullTransactionId fxid, uint16 info,
multixact_twophase_postcommit(fxid, info, recdata, len);
}
+
/*
- * Initialization of shared memory for MultiXact.
- *
- * MultiXactSharedStateShmemSize() calculates the size of the MultiXactState
- * struct, and the two per-backend MultiXactId arrays. They are carved out of
- * the same allocation. MultiXactShmemSize() additionally includes the memory
- * needed for the two SLRU areas.
+ * Register shared memory needs for MultiXact.
*/
-static Size
-MultiXactSharedStateShmemSize(void)
+static void
+MultiXactShmemRequest(void *arg)
{
Size size;
+ /*
+ * Calculate the size of the MultiXactState struct, and the two
+ * per-backend MultiXactId arrays. They are carved out of the same
+ * allocation.
+ */
size = offsetof(MultiXactStateData, perBackendXactIds);
size = add_size(size,
mul_size(sizeof(MultiXactId), NumMemberSlots));
size = add_size(size,
mul_size(sizeof(MultiXactId), NumVisibleSlots));
- return size;
-}
+ ShmemRequestStruct(.name = "Shared MultiXact State",
+ .size = size,
+ .ptr = (void **) &MultiXactState,
+ );
-Size
-MultiXactShmemSize(void)
-{
- Size size;
+ SimpleLruRequest(.desc = &MultiXactOffsetSlruDesc,
+ .name = "multixact_offset",
+ .Dir = "pg_multixact/offsets",
+ .long_segment_names = false,
- size = MultiXactSharedStateShmemSize();
- size = add_size(size, SimpleLruShmemSize(multixact_offset_buffers, 0));
- size = add_size(size, SimpleLruShmemSize(multixact_member_buffers, 0));
+ .nslots = multixact_offset_buffers,
- return size;
-}
+ .sync_handler = SYNC_HANDLER_MULTIXACT_OFFSET,
+ .PagePrecedes = MultiXactOffsetPagePrecedes,
+ .errdetail_for_io_error = MultiXactOffsetIoErrorDetail,
-void
-MultiXactShmemInit(void)
-{
- bool found;
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTOFFSET_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTOFFSET_SLRU,
+ );
- debug_elog2(DEBUG2, "Shared Memory Init for MultiXact");
+ SimpleLruRequest(.desc = &MultiXactMemberSlruDesc,
+ .name = "multixact_member",
+ .Dir = "pg_multixact/members",
+ .long_segment_names = true,
- MultiXactOffsetCtl->PagePrecedes = MultiXactOffsetPagePrecedes;
- MultiXactMemberCtl->PagePrecedes = MultiXactMemberPagePrecedes;
- MultiXactOffsetCtl->errdetail_for_io_error = MultiXactOffsetIoErrorDetail;
- MultiXactMemberCtl->errdetail_for_io_error = MultiXactMemberIoErrorDetail;
+ .nslots = multixact_member_buffers,
- SimpleLruInit(MultiXactOffsetCtl,
- "multixact_offset", multixact_offset_buffers, 0,
- "pg_multixact/offsets", LWTRANCHE_MULTIXACTOFFSET_BUFFER,
- LWTRANCHE_MULTIXACTOFFSET_SLRU,
- SYNC_HANDLER_MULTIXACT_OFFSET,
- false);
- SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
- SimpleLruInit(MultiXactMemberCtl,
- "multixact_member", multixact_member_buffers, 0,
- "pg_multixact/members", LWTRANCHE_MULTIXACTMEMBER_BUFFER,
- LWTRANCHE_MULTIXACTMEMBER_SLRU,
- SYNC_HANDLER_MULTIXACT_MEMBER,
- true);
- /* doesn't call SimpleLruTruncate() or meet criteria for unit tests */
-
- /* Initialize our shared state struct */
- MultiXactState = ShmemInitStruct("Shared MultiXact State",
- MultiXactSharedStateShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+ .sync_handler = SYNC_HANDLER_MULTIXACT_MEMBER,
+ .PagePrecedes = MultiXactMemberPagePrecedes,
+ .errdetail_for_io_error = MultiXactMemberIoErrorDetail,
- /* Make sure we zero out the per-backend state */
- MemSet(MultiXactState, 0, MultiXactSharedStateShmemSize());
- }
- else
- Assert(found);
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTMEMBER_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTMEMBER_SLRU,
+ );
+ /*
+ * members SLRU doesn't call SimpleLruTruncate() or meet criteria for unit
+ * tests
+ */
+}
+
+static void
+MultiXactShmemInit(void *arg)
+{
+ /*
+ * Set up array pointers.
+ */
+ OldestMemberMXactId = MultiXactState->perBackendXactIds;
+ OldestVisibleMXactId = OldestMemberMXactId + NumMemberSlots;
+
+ SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
+}
+
+static void
+MultiXactShmemAttach(void *arg)
+{
/*
* Set up array pointers.
*/
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index a2bb8fa8033..47dd52d6749 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -70,7 +70,9 @@
#include "pgstat.h"
#include "storage/fd.h"
#include "storage/shmem.h"
+#include "storage/shmem_internal.h"
#include "utils/guc.h"
+#include "utils/memutils.h"
#include "utils/wait_event.h"
/*
@@ -89,9 +91,9 @@
* dir/123456 for [2^20, 2^24-1]
*/
static inline int
-SlruFileName(SlruCtl ctl, char *path, int64 segno)
+SlruFileName(SlruDesc *ctl, char *path, int64 segno)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
{
/*
* We could use 16 characters here but the disadvantage would be that
@@ -101,7 +103,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* that in the future we can't decrease SLRU_PAGES_PER_SEGMENT easily.
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFFFFFFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->Dir, segno);
+ return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->options.Dir, segno);
}
else
{
@@ -110,7 +112,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* integers are allowed. See SlruCorrectSegmentFilenameLength()
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
+ return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->options.Dir,
(unsigned int) segno);
}
}
@@ -176,19 +178,19 @@ static SlruErrorCause slru_errcause;
static int slru_errno;
-static void SimpleLruZeroLSNs(SlruCtl ctl, int slotno);
-static void SimpleLruWaitIO(SlruCtl ctl, int slotno);
-static void SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata);
-static bool SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno);
-static bool SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno,
+static void SimpleLruZeroLSNs(SlruDesc *ctl, int slotno);
+static void SimpleLruWaitIO(SlruDesc *ctl, int slotno);
+static void SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata);
+static bool SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno);
+static bool SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno,
SlruWriteAll fdata);
-static void SlruReportIOError(SlruCtl ctl, int64 pageno,
+static void SlruReportIOError(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-static int SlruSelectLRUPage(SlruCtl ctl, int64 pageno);
+static int SlruSelectLRUPage(SlruDesc *ctl, int64 pageno);
-static bool SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename,
+static bool SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-static void SlruInternalDeleteSegment(SlruCtl ctl, int64 segno);
+static void SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno);
static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
@@ -196,7 +198,7 @@ static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
* Initialization of shared memory
*/
-Size
+static Size
SimpleLruShmemSize(int nslots, int nlsns)
{
int nbanks = nslots / SLRU_BANK_SIZE;
@@ -238,120 +240,135 @@ SimpleLruAutotuneBuffers(int divisor, int max)
}
/*
- * Initialize, or attach to, a simple LRU cache in shared memory.
- *
- * ctl: address of local (unshared) control structure.
- * name: name of SLRU. (This is user-visible, pick with care!)
- * nslots: number of page slots to use.
- * nlsns: number of LSN groups per page (set to zero if not relevant).
- * subdir: PGDATA-relative subdirectory that will contain the files.
- * buffer_tranche_id: tranche ID to use for the SLRU's per-buffer LWLocks.
- * bank_tranche_id: tranche ID to use for the bank LWLocks.
- * sync_handler: which set of functions to use to handle sync requests
- * long_segment_names: use short or long segment names
+ * Register a simple LRU cache in shared memory.
*/
void
-SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id, int bank_tranche_id,
- SyncRequestHandler sync_handler, bool long_segment_names)
+SimpleLruRequestWithOpts(const SlruOpts *options)
{
+ SlruOpts *options_copy;
+
+ Assert(options->name != NULL);
+ Assert(options->nslots > 0);
+ Assert(options->PagePrecedes != NULL);
+ Assert(options->errdetail_for_io_error != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(SlruOpts));
+ memcpy(options_copy, options, sizeof(SlruOpts));
+
+ options_copy->base.name = options->name;
+ options_copy->base.size = SimpleLruShmemSize(options_copy->nslots, options_copy->nlsns);
+
+ ShmemRequestInternal(&options_copy->base, SHMEM_KIND_SLRU);
+}
+
+/* Initialize locks and shared memory area */
+void
+shmem_slru_init(void *location, ShmemStructOpts *base_options)
+{
+ SlruOpts *options = (SlruOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) options->desc;
+ char namebuf[NAMEDATALEN];
SlruShared shared;
- bool found;
+ int nslots = options->nslots;
int nbanks = nslots / SLRU_BANK_SIZE;
+ int nlsns = options->nlsns;
+ char *ptr;
+ Size offset;
+
+ shared = (SlruShared) location;
+ desc->shared = shared;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruOpts));
+
+ /* assign new tranche IDs, if not given */
+ if (desc->options.buffer_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s buffer", desc->options.name);
+ desc->options.buffer_tranche_id = LWLockNewTrancheId(namebuf);
+ }
+ if (desc->options.bank_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s bank", desc->options.name);
+ desc->options.bank_tranche_id = LWLockNewTrancheId(namebuf);
+ }
Assert(nslots <= SLRU_MAX_ALLOWED_BUFFERS);
- Assert(ctl->PagePrecedes != NULL);
- Assert(ctl->errdetail_for_io_error != NULL);
+ memset(shared, 0, sizeof(SlruSharedData));
- shared = (SlruShared) ShmemInitStruct(name,
- SimpleLruShmemSize(nslots, nlsns),
- &found);
+ shared->num_slots = nslots;
+ shared->lsn_groups_per_page = nlsns;
- if (!IsUnderPostmaster)
- {
- /* Initialize locks and shared memory area */
- char *ptr;
- Size offset;
-
- Assert(!found);
-
- memset(shared, 0, sizeof(SlruSharedData));
-
- shared->num_slots = nslots;
- shared->lsn_groups_per_page = nlsns;
-
- pg_atomic_init_u64(&shared->latest_page_number, 0);
-
- shared->slru_stats_idx = pgstat_get_slru_index(name);
-
- ptr = (char *) shared;
- offset = MAXALIGN(sizeof(SlruSharedData));
- shared->page_buffer = (char **) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(char *));
- shared->page_status = (SlruPageStatus *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
- shared->page_dirty = (bool *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(bool));
- shared->page_number = (int64 *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int64));
- shared->page_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int));
-
- /* Initialize LWLocks */
- shared->buffer_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(LWLockPadded));
- shared->bank_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
- shared->bank_cur_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(int));
-
- if (nlsns > 0)
- {
- shared->group_lsn = (XLogRecPtr *) (ptr + offset);
- offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
- }
+ pg_atomic_init_u64(&shared->latest_page_number, 0);
- ptr += BUFFERALIGN(offset);
- for (int slotno = 0; slotno < nslots; slotno++)
- {
- LWLockInitialize(&shared->buffer_locks[slotno].lock,
- buffer_tranche_id);
+ shared->slru_stats_idx = pgstat_get_slru_index(desc->options.name);
- shared->page_buffer[slotno] = ptr;
- shared->page_status[slotno] = SLRU_PAGE_EMPTY;
- shared->page_dirty[slotno] = false;
- shared->page_lru_count[slotno] = 0;
- ptr += BLCKSZ;
- }
+ ptr = (char *) shared;
+ offset = MAXALIGN(sizeof(SlruSharedData));
+ shared->page_buffer = (char **) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(char *));
+ shared->page_status = (SlruPageStatus *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
+ shared->page_dirty = (bool *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(bool));
+ shared->page_number = (int64 *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int64));
+ shared->page_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int));
- /* Initialize the slot banks. */
- for (int bankno = 0; bankno < nbanks; bankno++)
- {
- LWLockInitialize(&shared->bank_locks[bankno].lock, bank_tranche_id);
- shared->bank_cur_lru_count[bankno] = 0;
- }
+ /* Initialize LWLocks */
+ shared->buffer_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(LWLockPadded));
+ shared->bank_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
+ shared->bank_cur_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(int));
- /* Should fit to estimated shmem size */
- Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+ if (nlsns > 0)
+ {
+ shared->group_lsn = (XLogRecPtr *) (ptr + offset);
+ offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
}
- else
+
+ ptr += BUFFERALIGN(offset);
+ for (int slotno = 0; slotno < nslots; slotno++)
{
- Assert(found);
- Assert(shared->num_slots == nslots);
+ LWLockInitialize(&shared->buffer_locks[slotno].lock,
+ desc->options.buffer_tranche_id);
+
+ shared->page_buffer[slotno] = ptr;
+ shared->page_status[slotno] = SLRU_PAGE_EMPTY;
+ shared->page_dirty[slotno] = false;
+ shared->page_lru_count[slotno] = 0;
+ ptr += BLCKSZ;
}
- /*
- * Initialize the unshared control struct, including directory path. We
- * assume caller set PagePrecedes.
- */
- ctl->shared = shared;
- ctl->sync_handler = sync_handler;
- ctl->long_segment_names = long_segment_names;
- ctl->nbanks = nbanks;
- strlcpy(ctl->Dir, subdir, sizeof(ctl->Dir));
+ /* Initialize the slot banks. */
+ for (int bankno = 0; bankno < nbanks; bankno++)
+ {
+ LWLockInitialize(&shared->bank_locks[bankno].lock, desc->options.bank_tranche_id);
+ shared->bank_cur_lru_count[bankno] = 0;
+ }
+
+ /* Should fit to estimated shmem size */
+ Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+}
+
+void
+shmem_slru_attach(void *location, ShmemStructOpts *base_options)
+{
+ SlruOpts *options = (SlruOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) options->desc;
+ int nslots = options->nslots;
+ int nbanks = nslots / SLRU_BANK_SIZE;
+
+ desc->shared = (SlruShared) location;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruOpts));
}
+
/*
* Helper function for GUC check_hook to check whether slru buffers are in
* multiples of SLRU_BANK_SIZE.
@@ -377,7 +394,7 @@ check_slru_buffers(const char *name, int *newval)
* Bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
int slotno;
@@ -430,7 +447,7 @@ SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
* This assumes that InvalidXLogRecPtr is bitwise-all-0.
*/
static void
-SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
+SimpleLruZeroLSNs(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
@@ -446,7 +463,7 @@ SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
* SLRU bank lock is acquired and released here.
*/
void
-SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno)
{
int slotno;
LWLock *lock;
@@ -472,7 +489,7 @@ SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SimpleLruWaitIO(SlruCtl ctl, int slotno)
+SimpleLruWaitIO(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
int bankno = SlotGetBankNumber(slotno);
@@ -530,7 +547,7 @@ SimpleLruWaitIO(SlruCtl ctl, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data)
{
SlruShared shared = ctl->shared;
@@ -634,7 +651,7 @@ SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
* It is unspecified whether the lock will be shared or exclusive.
*/
int
-SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
SlruShared shared = ctl->shared;
LWLock *banklock = SimpleLruGetBankLock(ctl, pageno);
@@ -681,7 +698,7 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
+SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 pageno = shared->page_number[slotno];
@@ -761,7 +778,7 @@ SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
* fdata is always passed a NULL here.
*/
void
-SimpleLruWritePage(SlruCtl ctl, int slotno)
+SimpleLruWritePage(SlruDesc *ctl, int slotno)
{
Assert(ctl->shared->page_status[slotno] != SLRU_PAGE_EMPTY);
@@ -775,7 +792,7 @@ SimpleLruWritePage(SlruCtl ctl, int slotno)
* large enough to contain the given page.
*/
bool
-SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
+SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -833,7 +850,7 @@ SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
* read/write operations. We could cache one virtual file pointer ...
*/
static bool
-SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
+SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -905,7 +922,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
* SimpleLruWriteAll.
*/
static bool
-SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
+SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -1037,11 +1054,11 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
pgstat_report_wait_end();
/* Queue up a sync request for the checkpointer. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
if (!RegisterSyncRequest(&tag, SYNC_REQUEST, false))
{
/* No space to enqueue sync request. Do it synchronously. */
@@ -1077,7 +1094,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
* SlruPhysicalWritePage. Call this after cleaning up shared-memory state.
*/
static void
-SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SlruReportIOError(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -1092,14 +1109,14 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open file \"%s\": %m", path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_SEEK_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not seek in file \"%s\" to offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_READ_FAILED:
if (errno)
@@ -1107,12 +1124,12 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("could not read from file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("could not read from file \"%s\" at offset %d: read too few bytes",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_WRITE_FAILED:
if (errno)
@@ -1120,26 +1137,26 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("Could not write to file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("Could not write to file \"%s\" at offset %d: wrote too few bytes.",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_FSYNC_FAILED:
ereport(data_sync_elevel(ERROR),
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_CLOSE_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not close file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
default:
/* can't get here, we trust */
@@ -1199,7 +1216,7 @@ SlruRecentlyUsed(SlruShared shared, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
static int
-SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
+SlruSelectLRUPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
@@ -1291,8 +1308,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_valid_delta ||
(this_delta == best_valid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_valid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_valid_page_number)))
{
bestvalidslot = slotno;
best_valid_delta = this_delta;
@@ -1303,8 +1320,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_invalid_delta ||
(this_delta == best_invalid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_invalid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_invalid_page_number)))
{
bestinvalidslot = slotno;
best_invalid_delta = this_delta;
@@ -1352,7 +1369,7 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
* entries are on disk.
*/
void
-SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
+SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied)
{
SlruShared shared = ctl->shared;
SlruWriteAllData fdata;
@@ -1422,8 +1439,8 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
SlruReportIOError(ctl, pageno, NULL);
/* Ensure that directory entries for new files are on disk. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
- fsync_fname(ctl->Dir, true);
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
+ fsync_fname(ctl->options.Dir, true);
}
/*
@@ -1438,7 +1455,7 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
* after it has accrued freshly-written data.
*/
void
-SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage)
+SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage)
{
SlruShared shared = ctl->shared;
int prevbank;
@@ -1460,12 +1477,12 @@ restart:
* bugs elsewhere in SLRU handling, so we don't care if we read a slightly
* outdated value; therefore we don't add a memory barrier.
*/
- if (ctl->PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
- cutoffPage))
+ if (ctl->options.PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
+ cutoffPage))
{
ereport(LOG,
(errmsg("could not truncate directory \"%s\": apparent wraparound",
- ctl->Dir)));
+ ctl->options.Dir)));
return;
}
@@ -1488,7 +1505,7 @@ restart:
if (shared->page_status[slotno] == SLRU_PAGE_EMPTY)
continue;
- if (!ctl->PagePrecedes(shared->page_number[slotno], cutoffPage))
+ if (!ctl->options.PagePrecedes(shared->page_number[slotno], cutoffPage))
continue;
/*
@@ -1533,16 +1550,16 @@ restart:
* they either can't yet contain anything, or have already been cleaned out.
*/
static void
-SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
+SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno)
{
char path[MAXPGPATH];
/* Forget any fsync requests queued for this segment. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true);
}
@@ -1556,7 +1573,7 @@ SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
* Delete an individual SLRU segment, identified by the segment number.
*/
void
-SlruDeleteSegment(SlruCtl ctl, int64 segno)
+SlruDeleteSegment(SlruDesc *ctl, int64 segno)
{
SlruShared shared = ctl->shared;
int prevbank = SlotGetBankNumber(0);
@@ -1633,19 +1650,19 @@ restart:
* first>=cutoff && last>=cutoff: no; every page of this segment is too young
*/
static bool
-SlruMayDeleteSegment(SlruCtl ctl, int64 segpage, int64 cutoffPage)
+SlruMayDeleteSegment(SlruDesc *ctl, int64 segpage, int64 cutoffPage)
{
int64 seg_last_page = segpage + SLRU_PAGES_PER_SEGMENT - 1;
Assert(segpage % SLRU_PAGES_PER_SEGMENT == 0);
- return (ctl->PagePrecedes(segpage, cutoffPage) &&
- ctl->PagePrecedes(seg_last_page, cutoffPage));
+ return (ctl->options.PagePrecedes(segpage, cutoffPage) &&
+ ctl->options.PagePrecedes(seg_last_page, cutoffPage));
}
#ifdef USE_ASSERT_CHECKING
static void
-SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
+SlruPagePrecedesTestOffset(SlruDesc *ctl, int per_page, uint32 offset)
{
TransactionId lhs,
rhs;
@@ -1654,6 +1671,9 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
TransactionId newestXact,
oldestXact;
+ /* This must be called after the SLRU has been initialized */
+ Assert(ctl->options.PagePrecedes);
+
/*
* Compare an XID pair having undefined order (see RFC 1982), a pair at
* "opposite ends" of the XID space. TransactionIdPrecedes() treats each
@@ -1670,19 +1690,19 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
Assert(!TransactionIdPrecedes(rhs, lhs + 1));
Assert(!TransactionIdFollowsOrEquals(lhs, rhs));
Assert(!TransactionIdFollowsOrEquals(rhs, lhs));
- Assert(!ctl->PagePrecedes(lhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes(lhs / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
|| (1U << 31) % per_page != 0); /* See CommitTsPagePrecedes() */
- Assert(ctl->PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
+ Assert(ctl->options.PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
|| (1U << 31) % per_page != 0);
- Assert(ctl->PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
/*
* GetNewTransactionId() has assigned the last XID it can safely use, and
@@ -1727,7 +1747,7 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
* do not apply to them.)
*/
void
-SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
+SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page)
{
/* Test first, middle and last entries of a page. */
SlruPagePrecedesTestOffset(ctl, per_page, 0);
@@ -1742,7 +1762,7 @@ SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
* one containing the page passed as "data".
*/
bool
-SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1758,7 +1778,7 @@ SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes segments prior to the one passed in as "data".
*/
static bool
-SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1774,7 +1794,7 @@ SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes all segments.
*/
bool
-SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
+SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
SlruInternalDeleteSegment(ctl, segpage / SLRU_PAGES_PER_SEGMENT);
@@ -1788,9 +1808,9 @@ SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
* SLRU segment.
*/
static inline bool
-SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
+SlruCorrectSegmentFilenameLength(SlruDesc *ctl, size_t len)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
return (len == 15); /* see SlruFileName() */
else
@@ -1821,7 +1841,7 @@ SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
* Note that no locking is applied.
*/
bool
-SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
+SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data)
{
bool retval = false;
DIR *cldir;
@@ -1829,8 +1849,8 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
int64 segno;
int64 segpage;
- cldir = AllocateDir(ctl->Dir);
- while ((clde = ReadDir(cldir, ctl->Dir)) != NULL)
+ cldir = AllocateDir(ctl->options.Dir);
+ while ((clde = ReadDir(cldir, ctl->options.Dir)) != NULL)
{
size_t len;
@@ -1843,7 +1863,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
segpage = segno * SLRU_PAGES_PER_SEGMENT;
elog(DEBUG2, "SlruScanDirectory invoking callback on %s/%s",
- ctl->Dir, clde->d_name);
+ ctl->options.Dir, clde->d_name);
retval = callback(ctl, clde->d_name, segpage, data);
if (retval)
break;
@@ -1861,7 +1881,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
* performs the fsync.
*/
int
-SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path)
+SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path)
{
int fd;
int save_errno;
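For readers following the slru.c changes above: shmem_slru_init() carves one contiguous allocation into several arrays by advancing an aligned offset, and SimpleLruShmemSize() has to compute the same total up front. The following is only a minimal standalone sketch of that carving technique, not code from the patch. All names (ToyShared, TOY_ALIGN, toy_size, toy_init) are hypothetical, and TOY_ALIGN assumes 8-byte alignment as a stand-in for MAXALIGN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for MAXALIGN; assumes 8-byte maximum alignment. */
#define TOY_ALIGN(n) (((size_t) (n) + 7) & ~(size_t) 7)

/* Header placed at the start of the block; the arrays follow it. */
typedef struct ToyShared
{
	int			num_slots;
	char	   *status;			/* num_slots entries */
	int64_t    *page_number;	/* num_slots entries */
} ToyShared;

/* Size estimate: the header plus each array, each rounded up. */
static size_t
toy_size(int nslots)
{
	size_t		sz = TOY_ALIGN(sizeof(ToyShared));

	sz += TOY_ALIGN(nslots * sizeof(char));
	sz += TOY_ALIGN(nslots * sizeof(int64_t));
	return sz;
}

/* Carve the arrays out of a block of at least toy_size(nslots) bytes. */
static ToyShared *
toy_init(void *location, int nslots)
{
	ToyShared  *shared = (ToyShared *) location;
	char	   *ptr = (char *) location;
	size_t		offset = TOY_ALIGN(sizeof(ToyShared));

	shared->num_slots = nslots;
	shared->status = ptr + offset;
	offset += TOY_ALIGN(nslots * sizeof(char));
	shared->page_number = (int64_t *) (ptr + offset);
	offset += TOY_ALIGN(nslots * sizeof(int64_t));

	/* The carving must never run past the estimate. */
	assert(offset <= toy_size(nslots));
	return shared;
}
```

The closing assertion plays the same role as the "Should fit to estimated shmem size" check in the patch: the size function and the init function must walk the arrays in the same order with the same rounding.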
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index c6ce71fc703..b79e648b899 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -33,6 +33,7 @@
#include "access/transam.h"
#include "miscadmin.h"
#include "pg_trace.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/snapmgr.h"
@@ -66,16 +67,22 @@ TransactionIdToPage(TransactionId xid)
#define TransactionIdToEntry(xid) ((xid) % (TransactionId) SUBTRANS_XACTS_PER_PAGE)
+static void SUBTRANSShmemRequest(void *arg);
+static void SUBTRANSShmemInit(void *arg);
+static bool SubTransPagePrecedes(int64 page1, int64 page2);
+static int subtrans_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks SUBTRANSShmemCallbacks = {
+ .request_fn = SUBTRANSShmemRequest,
+ .init_fn = SUBTRANSShmemInit,
+};
+
/*
* Link to shared-memory data structures for SUBTRANS control
*/
-static SlruCtlData SubTransCtlData;
-
-#define SubTransCtl (&SubTransCtlData)
+static SlruDesc SubTransSlruDesc;
-
-static bool SubTransPagePrecedes(int64 page1, int64 page2);
-static int subtrans_errdetail_for_io_error(const void *opaque_data);
+#define SubTransCtl (&SubTransSlruDesc)
/*
@@ -207,17 +214,13 @@ SUBTRANSShmemBuffers(void)
return Min(Max(16, subtransaction_buffers), SLRU_MAX_ALLOWED_BUFFERS);
}
+
+
/*
- * Initialization of shared memory for SUBTRANS
+ * Register shared memory for SUBTRANS
*/
-Size
-SUBTRANSShmemSize(void)
-{
- return SimpleLruShmemSize(SUBTRANSShmemBuffers(), 0);
-}
-
-void
-SUBTRANSShmemInit(void)
+static void
+SUBTRANSShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (subtransaction_buffers == 0)
@@ -240,11 +243,25 @@ SUBTRANSShmemInit(void)
}
Assert(subtransaction_buffers != 0);
- SubTransCtl->PagePrecedes = SubTransPagePrecedes;
- SubTransCtl->errdetail_for_io_error = subtrans_errdetail_for_io_error;
- SimpleLruInit(SubTransCtl, "subtransaction", SUBTRANSShmemBuffers(), 0,
- "pg_subtrans", LWTRANCHE_SUBTRANS_BUFFER,
- LWTRANCHE_SUBTRANS_SLRU, SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(.desc = &SubTransSlruDesc,
+ .name = "subtransaction",
+ .Dir = "pg_subtrans",
+ .long_segment_names = false,
+
+ .nslots = SUBTRANSShmemBuffers(),
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SubTransPagePrecedes,
+ .errdetail_for_io_error = subtrans_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SUBTRANS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SUBTRANS_SLRU,
+ );
+}
+
+static void
+SUBTRANSShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(SubTransCtl, SUBTRANS_XACTS_PER_PAGE);
}
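The SimpleLruRequest(.desc = ..., .name = ..., ) call above uses designated-initializer syntax inside what looks like a function call. The header that defines it is not part of this hunk, so the following is only a guess at how such a wrapper could be built: a variadic macro that packs its arguments into a C99 compound literal and passes its address to a function like SimpleLruRequestWithOpts(). Every name here (ToyOpts, toy_request, toy_request_with_opts, toy_last_request) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct; unnamed fields default to zero/NULL. */
typedef struct ToyOpts
{
	const char *name;
	const char *dir;
	int			nslots;
	int			tranche_id;		/* 0 means "assign one later" */
} ToyOpts;

/* Where the sketch keeps its copy of the most recent request. */
static ToyOpts toy_last_request;

static void
toy_request_with_opts(const ToyOpts *options)
{
	assert(options->name != NULL);
	assert(options->nslots > 0);

	/*
	 * Copy the options: the compound literal built by the macro only
	 * lives as long as the caller's enclosing block.
	 */
	memcpy(&toy_last_request, options, sizeof(ToyOpts));
}

/* Wrap the designated initializers in a compound literal. */
#define toy_request(...) \
	toy_request_with_opts(&(ToyOpts){__VA_ARGS__})
```

A call such as `toy_request(.name = "subtransaction", .dir = "pg_subtrans", .nslots = 32, );` expands to a compound literal with a trailing comma, which C99 accepts. Fields left out (here tranche_id) are zero-initialized, which is what would let a zero tranche ID mean "not given", as in shmem_slru_init().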
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index e91a62ff42a..db6a9a6561b 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -179,6 +179,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/dsa.h"
@@ -345,6 +346,15 @@ typedef struct AsyncQueueControl
static AsyncQueueControl *asyncQueueControl;
+static void AsyncShmemRequest(void *arg);
+static void AsyncShmemInit(void *arg);
+
+const ShmemCallbacks AsyncShmemCallbacks = {
+ .request_fn = AsyncShmemRequest,
+ .init_fn = AsyncShmemInit,
+};
+
+
#define QUEUE_HEAD (asyncQueueControl->head)
#define QUEUE_TAIL (asyncQueueControl->tail)
#define QUEUE_STOP_PAGE (asyncQueueControl->stopPage)
@@ -359,9 +369,13 @@ static AsyncQueueControl *asyncQueueControl;
/*
* The SLRU buffer area through which we access the notification queue
*/
-static SlruCtlData NotifyCtlData;
+static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
+static int asyncQueueErrdetailForIoError(const void *opaque_data);
+
+static SlruDesc NotifySlruDesc;
-#define NotifyCtl (&NotifyCtlData)
+
+#define NotifyCtl (&NotifySlruDesc)
#define QUEUE_PAGESIZE BLCKSZ
#define QUEUE_FULL_WARN_INTERVAL 5000 /* warn at most once every 5s */
@@ -570,9 +584,7 @@ bool Trace_notify = false;
int max_notify_queue_pages = 1048576;
/* local function prototypes */
-static int asyncQueueErrdetailForIoError(const void *opaque_data);
static inline int64 asyncQueuePageDiff(int64 p, int64 q);
-static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
static inline void GlobalChannelKeyInit(GlobalChannelKey *key, Oid dboid,
const char *channel);
static dshash_hash globalChannelTableHash(const void *key, size_t size,
@@ -780,78 +792,63 @@ initPendingListenActions(void)
}
/*
- * Report space needed for our shared memory area
+ * Register our shared memory needs
*/
-Size
-AsyncShmemSize(void)
+static void
+AsyncShmemRequest(void *arg)
{
Size size;
- /* This had better match AsyncShmemInit */
size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
size = add_size(size, offsetof(AsyncQueueControl, backend));
- size = add_size(size, SimpleLruShmemSize(notify_buffers, 0));
+ ShmemRequestStruct(.name = "Async Queue Control",
+ .size = size,
+ .ptr = (void **) &asyncQueueControl,
+ );
- return size;
-}
+ SimpleLruRequest(.desc = &NotifySlruDesc,
+ .name = "notify",
+ .Dir = "pg_notify",
-/*
- * Initialize our shared memory area
- */
-void
-AsyncShmemInit(void)
-{
- bool found;
- Size size;
+ /* long segment names are used in order to avoid wraparound */
+ .long_segment_names = true,
- /*
- * Create or attach to the AsyncQueueControl structure.
- */
- size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
- size = add_size(size, offsetof(AsyncQueueControl, backend));
+ .nslots = notify_buffers,
- asyncQueueControl = (AsyncQueueControl *)
- ShmemInitStruct("Async Queue Control", size, &found);
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = asyncQueuePagePrecedes,
+ .errdetail_for_io_error = asyncQueueErrdetailForIoError,
- if (!found)
+ .buffer_tranche_id = LWTRANCHE_NOTIFY_BUFFER,
+ .bank_tranche_id = LWTRANCHE_NOTIFY_SLRU,
+ );
+}
+
+static void
+AsyncShmemInit(void *arg)
+{
+ SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
+ SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
+ QUEUE_STOP_PAGE = 0;
+ QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
+ asyncQueueControl->lastQueueFillWarn = 0;
+ asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
+ asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
+ for (int i = 0; i < MaxBackends; i++)
{
- /* First time through, so initialize it */
- SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
- SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
- QUEUE_STOP_PAGE = 0;
- QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
- asyncQueueControl->lastQueueFillWarn = 0;
- asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
- asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
- for (int i = 0; i < MaxBackends; i++)
- {
- QUEUE_BACKEND_PID(i) = InvalidPid;
- QUEUE_BACKEND_DBOID(i) = InvalidOid;
- QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
- SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
- QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
- QUEUE_BACKEND_IS_ADVANCING(i) = false;
- }
+ QUEUE_BACKEND_PID(i) = InvalidPid;
+ QUEUE_BACKEND_DBOID(i) = InvalidOid;
+ QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
+ SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
+ QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
+ QUEUE_BACKEND_IS_ADVANCING(i) = false;
}
/*
- * Set up SLRU management of the pg_notify data. Note that long segment
- * names are used in order to avoid wraparound.
+ * During start or reboot, clean out the pg_notify directory.
*/
- NotifyCtl->PagePrecedes = asyncQueuePagePrecedes;
- NotifyCtl->errdetail_for_io_error = asyncQueueErrdetailForIoError;
- SimpleLruInit(NotifyCtl, "notify", notify_buffers, 0,
- "pg_notify", LWTRANCHE_NOTIFY_BUFFER, LWTRANCHE_NOTIFY_SLRU,
- SYNC_HANDLER_NONE, true);
-
- if (!found)
- {
- /*
- * During start or reboot, clean out the pg_notify directory.
- */
- (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
- }
+ (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 4f707158303..7a8c69de802 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -101,16 +101,11 @@ CalculateShmemSize(void)
/* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
- size = add_size(size, PredicateLockShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, CLOGShmemSize());
- size = add_size(size, CommitTsShmemSize());
- size = add_size(size, SUBTRANSShmemSize());
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, MultiXactShmemSize());
size = add_size(size, BackendStatusShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
@@ -123,7 +118,6 @@ CalculateShmemSize(void)
size = add_size(size, ApplyLauncherShmemSize());
size = add_size(size, BTreeShmemSize());
size = add_size(size, SyncScanShmemSize());
- size = add_size(size, AsyncShmemSize());
size = add_size(size, StatsShmemSize());
size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, InjectionPointShmemSize());
@@ -270,10 +264,6 @@ CreateOrAttachShmemStructs(void)
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
- CLOGShmemInit();
- CommitTsShmemInit();
- SUBTRANSShmemInit();
- MultiXactShmemInit();
BufferManagerShmemInit();
/*
@@ -281,11 +271,6 @@ CreateOrAttachShmemStructs(void)
*/
LockManagerShmemInit();
- /*
- * Set up predicate lock manager
- */
- PredicateLockShmemInit();
-
/*
* Set up process table
*/
@@ -313,7 +298,6 @@ CreateOrAttachShmemStructs(void)
*/
BTreeShmemInit();
SyncScanShmemInit();
- AsyncShmemInit();
StatsShmemInit();
WaitEventCustomShmemInit();
InjectionPointShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 29ff6065dda..bc186d6ea17 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -134,6 +134,7 @@
#include <unistd.h>
+#include "access/slru.h"
#include "common/int.h"
#include "fmgr.h"
#include "funcapi.h"
@@ -549,6 +550,9 @@ InitShmemIndexEntry(ShmemRequest *request)
case SHMEM_KIND_HASH:
shmem_hash_init(structPtr, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_init(structPtr, request->options);
+ break;
}
}
@@ -602,6 +606,9 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
case SHMEM_KIND_HASH:
shmem_hash_attach(index_entry->location, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_attach(index_entry->location, request->options);
+ break;
}
return true;
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index af03071a71f..9c389b23506 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -152,10 +152,6 @@
/*
* INTERFACE ROUTINES
*
- * housekeeping for setting up shared memory predicate lock structures
- * PredicateLockShmemInit(void)
- * PredicateLockShmemSize(void)
- *
* predicate lock reporting
* GetPredicateLockStatusData(void)
* PageIsPredicateLocked(Relation relation, BlockNumber blkno)
@@ -211,6 +207,8 @@
#include "storage/predicate_internals.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -322,9 +320,12 @@
/*
* The SLRU buffer area through which we access the old xids.
*/
-static SlruCtlData SerialSlruCtlData;
+static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
+static int serial_errdetail_for_io_error(const void *opaque_data);
-#define SerialSlruCtl (&SerialSlruCtlData)
+static SlruDesc SerialSlruDesc;
+
+#define SerialSlruCtl (&SerialSlruDesc)
#define SERIAL_PAGESIZE BLCKSZ
#define SERIAL_ENTRYSIZE sizeof(SerCommitSeqNo)
@@ -384,6 +385,17 @@ int max_predicate_locks_per_page; /* in guc_tables.c */
*/
static PredXactList PredXact;
+static void PredicateLockShmemRequest(void *arg);
+static void PredicateLockShmemInit(void *arg);
+static void PredicateLockShmemAttach(void *arg);
+
+const ShmemCallbacks PredicateLockShmemCallbacks = {
+ .request_fn = PredicateLockShmemRequest,
+ .init_fn = PredicateLockShmemInit,
+ .attach_fn = PredicateLockShmemAttach,
+};
+
+
/*
* This provides a pool of RWConflict data elements to use in conflict lists
* between transactions.
@@ -431,6 +443,8 @@ static bool MyXactDidWrite = false;
*/
static SERIALIZABLEXACT *SavedSerializableXact = InvalidSerializableXact;
+static int64 max_serializable_xacts;
+
/* local functions */
static SERIALIZABLEXACT *CreatePredXact(void);
@@ -442,13 +456,12 @@ static void SetPossibleUnsafeConflict(SERIALIZABLEXACT *roXact, SERIALIZABLEXACT
static void ReleaseRWConflict(RWConflict conflict);
static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
-static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
-static int serial_errdetail_for_io_error(const void *opaque_data);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
static uint32 predicatelock_hash(const void *key, Size keysize);
+
static void SummarizeOldestCommittedSxact(void);
static Snapshot GetSafeSnapshot(Snapshot origSnapshot);
static Snapshot GetSerializableTransactionSnapshotInt(Snapshot snapshot,
@@ -1100,71 +1113,53 @@ CheckPointPredicate(void)
/*------------------------------------------------------------------------*/
/*
- * PredicateLockShmemInit -- Initialize the predicate locking data structures.
- *
- * This is called from CreateSharedMemoryAndSemaphores(), which see for
- * more comments. In the normal postmaster case, the shared hash tables
- * are created here. Backends inherit the pointers
- * to the shared tables via fork(). In the EXEC_BACKEND case, each
- * backend re-executes this code to obtain pointers to the already existing
- * shared hash tables.
+ * PredicateLockShmemRequest -- Register the predicate locking data structures.
*/
-void
-PredicateLockShmemInit(void)
+static void
+PredicateLockShmemRequest(void *arg)
{
- HASHCTL info;
int64 max_predicate_lock_targets;
int64 max_predicate_locks;
- int64 max_serializable_xacts;
int64 max_rw_conflicts;
- Size requestSize;
- bool found;
-
-#ifndef EXEC_BACKEND
- Assert(!IsUnderPostmaster);
-#endif
/*
- * Compute size of predicate lock target hashtable. Note these
- * calculations must agree with PredicateLockShmemSize!
+ * Hash tables and other structs are set up by ShmemInitRegistered() /
+ * ShmemAttachRegistered() via registered descriptors in
+ * PredicateLockShmemRegister(). Here we do the remaining initialization
+ * that can't be done in a callback.
*/
max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
/*
- * Allocate hash table for PREDICATELOCKTARGET structs. This stores
+ * Register hash table for PREDICATELOCKTARGET structs. This stores
* per-predicate-lock-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTARGETTAG);
- info.entrysize = sizeof(PREDICATELOCKTARGET);
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
-
- PredicateLockTargetHash = ShmemInitHash("PREDICATELOCKTARGET hash",
- max_predicate_lock_targets,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_PARTITION | HASH_FIXED_SIZE);
-
- /* Pre-calculate the hash and partition lock of the scratch entry */
- ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
- ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
+ ShmemRequestHash(.name = "PREDICATELOCKTARGET hash",
+ .nelems = max_predicate_lock_targets,
+ .ptr = &PredicateLockTargetHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTARGETTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCKTARGET),
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Allocate hash table for PREDICATELOCK structs. This stores per
* xact-lock-of-a-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTAG);
- info.entrysize = sizeof(PREDICATELOCK);
- info.hash = predicatelock_hash;
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
/* Assume an average of 2 xacts per target */
max_predicate_locks = max_predicate_lock_targets * 2;
- PredicateLockHash = ShmemInitHash("PREDICATELOCK hash",
- max_predicate_locks,
- &info,
- HASH_ELEM | HASH_FUNCTION |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ ShmemRequestHash(.name = "PREDICATELOCK hash",
+ .nelems = max_predicate_locks,
+ .ptr = &PredicateLockHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCK),
+ .hash_info.hash = predicatelock_hash,
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Compute size for serializable transaction hashtable. Note these
@@ -1177,29 +1172,27 @@ PredicateLockShmemInit(void)
max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
/*
- * Allocate a list to hold information on transactions participating in
+ * Register a list to hold information on transactions participating in
* predicate locking.
*/
- requestSize = add_size(PredXactListDataSize,
- (mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT))));
- PredXact = ShmemInitStruct("PredXactList",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(.name = "PredXactList",
+ .size = add_size(PredXactListDataSize,
+ (mul_size((Size) max_serializable_xacts,
+ sizeof(SERIALIZABLEXACT)))),
+ .ptr = (void **) &PredXact,
+ );
/*
- * Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
+ * Register hash table for SERIALIZABLEXID structs. This stores per-xid
* information for serializable transactions which have accessed data.
*/
- info.keysize = sizeof(SERIALIZABLEXIDTAG);
- info.entrysize = sizeof(SERIALIZABLEXID);
-
- SerializableXidHash = ShmemInitHash("SERIALIZABLEXID hash",
- max_serializable_xacts,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_FIXED_SIZE);
+ ShmemRequestHash(.name = "SERIALIZABLEXID hash",
+ .nelems = max_serializable_xacts,
+ .ptr = &SerializableXidHash,
+ .hash_info.keysize = sizeof(SERIALIZABLEXIDTAG),
+ .hash_info.entrysize = sizeof(SERIALIZABLEXID),
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_FIXED_SIZE,
+ );
/*
* Allocate space for tracking rw-conflicts in lists attached to the
@@ -1214,58 +1207,50 @@ PredicateLockShmemInit(void)
*/
max_rw_conflicts = max_serializable_xacts * 5;
- requestSize = RWConflictPoolHeaderDataSize +
- mul_size((Size) max_rw_conflicts,
- RWConflictDataSize);
+ ShmemRequestStruct(.name = "RWConflictPool",
+ .size = RWConflictPoolHeaderDataSize + mul_size((Size) max_rw_conflicts,
+ RWConflictDataSize),
+ .ptr = (void **) &RWConflictPool,
+ );
- RWConflictPool = ShmemInitStruct("RWConflictPool",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
-
- /*
- * Create or attach to the header for the list of finished serializable
- * transactions.
- */
- FinishedSerializableTransactions = (dlist_head *)
- ShmemInitStruct("FinishedSerializableTransactions",
- sizeof(dlist_head),
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(.name = "FinishedSerializableTransactions",
+ .size = sizeof(dlist_head),
+ .ptr = (void **) &FinishedSerializableTransactions,
+ );
/*
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(.desc = &SerialSlruDesc,
+ .name = "serializable",
+ .Dir = "pg_serial",
+ .long_segment_names = false,
+
+ .nslots = serializable_buffers,
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SerialPagePrecedesLogically,
+ .errdetail_for_io_error = serial_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SERIAL_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SERIAL_SLRU,
+ );
#ifdef USE_ASSERT_CHECKING
SerialPagePrecedesLogicallyUnitTests();
#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(.name = "SerialControlData",
+ .size = sizeof(SerialControlData),
+ .ptr = (void **) &serialControl,
+ );
+}
- /*
- * If we just attached to existing shared memory (EXEC_BACKEND), we're all
- * done. Otherwise, during postmaster startup proceed to initialize the
- * shared memory.
- */
- if (IsUnderPostmaster)
- {
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
- return;
- }
+static void
+PredicateLockShmemInit(void *arg)
+{
+ int max_rw_conflicts;
+ bool found;
/*
* Reserve a dummy entry in the hash table; we use it to make sure there's
@@ -1277,7 +1262,6 @@ PredicateLockShmemInit(void)
HASH_ENTER, &found);
Assert(!found);
- /* Initialize PredXact list */
dlist_init(&PredXact->availableList);
dlist_init(&PredXact->activeList);
PredXact->SxactGlobalXmin = InvalidTransactionId;
@@ -1319,6 +1303,9 @@ PredicateLockShmemInit(void)
dlist_init(&RWConflictPool->availableList);
RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
RWConflictPoolHeaderDataSize);
+
+ max_rw_conflicts = max_serializable_xacts * 5;
+
/* Add all elements to available list, clean. */
for (int i = 0; i < max_rw_conflicts; i++)
{
@@ -1335,57 +1322,28 @@ PredicateLockShmemInit(void)
serialControl->headXid = InvalidTransactionId;
serialControl->tailXid = InvalidTransactionId;
LWLockRelease(SerialControlLock);
-}
-
-/*
- * Estimate shared-memory space used for predicate lock table
- */
-Size
-PredicateLockShmemSize(void)
-{
- Size size = 0;
- int64 max_predicate_lock_targets;
- int64 max_predicate_locks;
- int64 max_serializable_xacts;
- int64 max_rw_conflicts;
-
- /* predicate lock target hash table */
- max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
- size = add_size(size, hash_estimate_size(max_predicate_lock_targets,
- sizeof(PREDICATELOCKTARGET)));
-
- /* predicate lock hash table */
- max_predicate_locks = max_predicate_lock_targets * 2;
- size = add_size(size, hash_estimate_size(max_predicate_locks,
- sizeof(PREDICATELOCK)));
-
- /* transaction list */
- max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
- size = add_size(size, PredXactListDataSize);
- size = add_size(size, mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT)));
- /* transaction xid table */
- size = add_size(size, hash_estimate_size(max_serializable_xacts,
- sizeof(SERIALIZABLEXID)));
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- /* rw-conflict pool */
- max_rw_conflicts = max_serializable_xacts * 5;
- size = add_size(size, RWConflictPoolHeaderDataSize);
- size = add_size(size, mul_size((Size) max_rw_conflicts,
- RWConflictDataSize));
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
- /* Head for list of finished serializable transactions. */
- size = add_size(size, sizeof(dlist_head));
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
+}
- /* Shared memory structures for SLRU tracking of old committed xids. */
- size = add_size(size, sizeof(SerialControlData));
- size = add_size(size, SimpleLruShmemSize(serializable_buffers, 0));
+static void
+PredicateLockShmemAttach(void *arg)
+{
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- return size;
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
}
-
/*
* Compute the hash code associated with a PREDICATELOCKTAG.
*
diff --git a/src/backend/utils/activity/pgstat_slru.c b/src/backend/utils/activity/pgstat_slru.c
index 2190f388eae..f4dfe8697d7 100644
--- a/src/backend/utils/activity/pgstat_slru.c
+++ b/src/backend/utils/activity/pgstat_slru.c
@@ -119,6 +119,7 @@ pgstat_get_slru_index(const char *name)
{
int i;
+ Assert(name);
for (i = 0; i < SLRU_NUM_ELEMENTS; i++)
{
if (strcmp(slru_names[i], name) == 0)
diff --git a/src/include/access/clog.h b/src/include/access/clog.h
index a1cfed5f43c..7894998c763 100644
--- a/src/include/access/clog.h
+++ b/src/include/access/clog.h
@@ -40,8 +40,6 @@ extern void TransactionIdSetTreeStatus(TransactionId xid, int nsubxids,
TransactionId *subxids, XidStatus status, XLogRecPtr lsn);
extern XidStatus TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn);
-extern Size CLOGShmemSize(void);
-extern void CLOGShmemInit(void);
extern void BootStrapCLOG(void);
extern void StartupCLOG(void);
extern void TrimCLOG(void);
diff --git a/src/include/access/commit_ts.h b/src/include/access/commit_ts.h
index 49ee21cd5d2..825ccda90ed 100644
--- a/src/include/access/commit_ts.h
+++ b/src/include/access/commit_ts.h
@@ -27,8 +27,6 @@ extern bool TransactionIdGetCommitTsData(TransactionId xid,
extern TransactionId GetLatestCommitTsData(TimestampTz *ts,
ReplOriginId *nodeid);
-extern Size CommitTsShmemSize(void);
-extern void CommitTsShmemInit(void);
extern void BootStrapCommitTs(void);
extern void StartupCommitTs(void);
extern void CommitTsParameterChange(bool newvalue, bool oldvalue);
diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h
index 2ae8b571dcc..6be5299ab68 100644
--- a/src/include/access/multixact.h
+++ b/src/include/access/multixact.h
@@ -121,8 +121,6 @@ extern void AtEOXact_MultiXact(void);
extern void AtPrepare_MultiXact(void);
extern void PostPrepare_MultiXact(FullTransactionId fxid);
-extern Size MultiXactShmemSize(void);
-extern void MultiXactShmemInit(void);
extern void BootStrapMultiXact(void);
extern void StartupMultiXact(void);
extern void TrimMultiXact(void);
diff --git a/src/include/access/slru.h b/src/include/access/slru.h
index f966d0d9fe7..36a7514d7a0 100644
--- a/src/include/access/slru.h
+++ b/src/include/access/slru.h
@@ -16,6 +16,7 @@
#include "access/transam.h"
#include "access/xlogdefs.h"
#include "storage/lwlock.h"
+#include "storage/shmem.h"
#include "storage/sync.h"
/*
@@ -106,23 +107,28 @@ typedef struct SlruSharedData
typedef SlruSharedData *SlruShared;
-/*
- * SlruCtlData is an unshared structure that points to the active information
- * in shared memory.
- */
-typedef struct SlruCtlData
+typedef struct SlruDesc SlruDesc;
+
+typedef struct SlruOpts
{
- SlruShared shared;
+ ShmemStructOpts base;
- /* Number of banks in this SLRU. */
- uint16 nbanks;
+ /*
+ * name of SLRU. (This is user-visible, pick with care!)
+ */
+ const char *name;
/*
- * If true, use long segment file names. Otherwise, use short file names.
- *
- * For details about the file name format, see SlruFileName().
+ * Pointer to a backend-private handle for the SLRU. It is filled in
+ * when the SLRU is initialized or attached to.
*/
- bool long_segment_names;
+ SlruDesc *desc;
+
+ /* number of page slots to use. */
+ int nslots;
+
+ /* number of LSN groups per page (set to zero if not relevant). */
+ int nlsns;
/*
* Which sync handler function to use when handing sync requests over to
@@ -130,6 +136,19 @@ typedef struct SlruCtlData
*/
SyncRequestHandler sync_handler;
+ /*
+ * PGDATA-relative subdirectory that will contain the files.
+ */
+ const char *Dir;
+
+ /*
+ * If true, use long segment file names. Otherwise, use short file names.
+ *
+ * For details about the file name format, see SlruFileName().
+ */
+ bool long_segment_names;
+
+
/*
* Decide whether a page is "older" for truncation and as a hint for
* evicting pages in LRU order. Return true if every entry of the first
@@ -153,13 +172,26 @@ typedef struct SlruCtlData
int (*errdetail_for_io_error) (const void *opaque_data);
/*
- * Dir is set during SimpleLruInit and does not change thereafter. Since
- * it's always the same, it doesn't need to be in shared memory.
+ * Tranche IDs to use for the SLRU's per-buffer and per-bank LWLocks. If
+ * these are left as zeros, new tranches will be assigned dynamically.
*/
- char Dir[64];
-} SlruCtlData;
+ int buffer_tranche_id;
+ int bank_tranche_id;
+} SlruOpts;
-typedef SlruCtlData *SlruCtl;
+/*
+ * SlruDesc is an unshared structure that points to the active information
+ * in shared memory.
+ */
+typedef struct SlruDesc
+{
+ SlruOpts options;
+
+ SlruShared shared;
+
+ /* Number of banks in this SLRU. */
+ uint16 nbanks;
+} SlruDesc;
/*
* Get the SLRU bank lock for given SlruCtl and the pageno.
@@ -168,48 +200,52 @@ typedef SlruCtlData *SlruCtl;
* respective bank.
*/
static inline LWLock *
-SimpleLruGetBankLock(SlruCtl ctl, int64 pageno)
+SimpleLruGetBankLock(SlruDesc *ctl, int64 pageno)
{
int bankno;
+ Assert(ctl->nbanks != 0);
bankno = pageno % ctl->nbanks;
return &(ctl->shared->bank_locks[bankno].lock);
}
-extern Size SimpleLruShmemSize(int nslots, int nlsns);
+extern void SimpleLruRequestWithOpts(const SlruOpts *options);
+
+#define SimpleLruRequest(...) \
+ SimpleLruRequestWithOpts(&(SlruOpts){__VA_ARGS__})
+
extern int SimpleLruAutotuneBuffers(int divisor, int max);
-extern void SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id,
- int bank_tranche_id, SyncRequestHandler sync_handler,
- bool long_segment_names);
-extern int SimpleLruZeroPage(SlruCtl ctl, int64 pageno);
-extern void SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno);
-extern int SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+extern int SimpleLruZeroPage(SlruDesc *ctl, int64 pageno);
+extern void SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno);
+extern int SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data);
-extern int SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno,
+extern int SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-extern void SimpleLruWritePage(SlruCtl ctl, int slotno);
-extern void SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied);
+extern void SimpleLruWritePage(SlruDesc *ctl, int slotno);
+extern void SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied);
#ifdef USE_ASSERT_CHECKING
-extern void SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page);
+extern void SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page);
#else
#define SlruPagePrecedesUnitTests(ctl, per_page) do {} while (0)
#endif
-extern void SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage);
-extern bool SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno);
+extern void SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage);
+extern bool SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno);
-typedef bool (*SlruScanCallback) (SlruCtl ctl, char *filename, int64 segpage,
+typedef bool (*SlruScanCallback) (SlruDesc *ctl, char *filename, int64 segpage,
void *data);
-extern bool SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data);
-extern void SlruDeleteSegment(SlruCtl ctl, int64 segno);
+extern bool SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data);
+extern void SlruDeleteSegment(SlruDesc *ctl, int64 segno);
-extern int SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path);
+extern int SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path);
/* SlruScanDirectory public callbacks */
-extern bool SlruScanDirCbReportPresence(SlruCtl ctl, char *filename,
+extern bool SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-extern bool SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage,
+extern bool SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage,
void *data);
extern bool check_slru_buffers(const char *name, int *newval);
+extern void shmem_slru_init(void *location, ShmemStructOpts *options);
+extern void shmem_slru_attach(void *location, ShmemStructOpts *options);
+
#endif /* SLRU_H */
diff --git a/src/include/access/subtrans.h b/src/include/access/subtrans.h
index 11b7355dbdf..d986cd9e802 100644
--- a/src/include/access/subtrans.h
+++ b/src/include/access/subtrans.h
@@ -15,8 +15,6 @@ extern void SubTransSetParent(TransactionId xid, TransactionId parent);
extern TransactionId SubTransGetParent(TransactionId xid);
extern TransactionId SubTransGetTopmostTransaction(TransactionId xid);
-extern Size SUBTRANSShmemSize(void);
-extern void SUBTRANSShmemInit(void);
extern void BootStrapSUBTRANS(void);
extern void StartupSUBTRANS(TransactionId oldestActiveXID);
extern void CheckPointSUBTRANS(void);
diff --git a/src/include/commands/async.h b/src/include/commands/async.h
index 3baae7cb8dc..202e4aa5e74 100644
--- a/src/include/commands/async.h
+++ b/src/include/commands/async.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT bool Trace_notify;
extern PGDLLIMPORT int max_notify_queue_pages;
extern PGDLLIMPORT volatile sig_atomic_t notifyInterruptPending;
-extern Size AsyncShmemSize(void);
-extern void AsyncShmemInit(void);
-
extern void NotifyMyFrontEnd(const char *channel,
const char *payload,
int32 srcPid);
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index a5ac55b8f7e..443bffb58fd 100644
--- a/src/include/storage/predicate.h
+++ b/src/include/storage/predicate.h
@@ -41,11 +41,6 @@ typedef void *SerializableXactHandle;
/*
* function prototypes
*/
-
-/* housekeeping for shared memory predicate lock structures */
-extern void PredicateLockShmemInit(void);
-extern Size PredicateLockShmemSize(void);
-
extern void CheckPointPredicate(void);
/* predicate lock reporting */
diff --git a/src/include/storage/shmem_internal.h b/src/include/storage/shmem_internal.h
index fe12bf33439..7b259d33ccf 100644
--- a/src/include/storage/shmem_internal.h
+++ b/src/include/storage/shmem_internal.h
@@ -21,6 +21,7 @@ typedef enum
{
SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
SHMEM_KIND_HASH, /* a hash table */
+ SHMEM_KIND_SLRU, /* SLRU buffers and control structures */
} ShmemRequestKind;
/* shmem.c */
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index d62c29f1361..c199f18a27a 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -32,6 +32,13 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+
+/* predicate lock manager */
+PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
@@ -43,3 +50,6 @@ PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
/* interprocess signaling mechanisms */
PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+
+/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
diff --git a/src/test/modules/test_slru/test_slru.c b/src/test/modules/test_slru/test_slru.c
index e4bd2af0bf5..40efffdbf62 100644
--- a/src/test/modules/test_slru/test_slru.c
+++ b/src/test/modules/test_slru/test_slru.c
@@ -40,14 +40,22 @@ PG_FUNCTION_INFO_V1(test_slru_delete_all);
/* Number of SLRU page slots */
#define NUM_TEST_BUFFERS 16
-static SlruCtlData TestSlruCtlData;
-#define TestSlruCtl (&TestSlruCtlData)
+static void test_slru_shmem_request(void *arg);
+static bool test_slru_page_precedes_logically(int64 page1, int64 page2);
+static int test_slru_errdetail_for_io_error(const void *opaque_data);
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static const char *TestSlruDir = "pg_test_slru";
+
+static SlruDesc TestSlruDesc;
+
+static const ShmemCallbacks test_slru_shmem_callbacks = {
+ .request_fn = test_slru_shmem_request
+};
+
+#define TestSlruCtl (&TestSlruDesc)
static bool
-test_slru_scan_cb(SlruCtl ctl, char *filename, int64 segpage, void *data)
+test_slru_scan_cb(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
elog(NOTICE, "Calling test_slru_scan_cb()");
return SlruScanDirCbDeleteAll(ctl, filename, segpage, data);
@@ -190,20 +198,6 @@ test_slru_delete_all(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
-/*
- * Module load callbacks and initialization.
- */
-
-static void
-test_slru_shmem_request(void)
-{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- /* reserve shared memory for the test SLRU */
- RequestAddinShmemSpace(SimpleLruShmemSize(NUM_TEST_BUFFERS, 0));
-}
-
static bool
test_slru_page_precedes_logically(int64 page1, int64 page2)
{
@@ -218,60 +212,46 @@ test_slru_errdetail_for_io_error(const void *opaque_data)
return errdetail("Could not access test_slru entry %u.", xid);
}
-static void
-test_slru_shmem_startup(void)
+void
+_PG_init(void)
{
- /*
- * Short segments names are well tested elsewhere so in this test we are
- * focusing on long names.
- */
- const bool long_segment_names = true;
- const char slru_dir_name[] = "pg_test_slru";
- int test_tranche_id = -1;
- int test_buffer_tranche_id = -1;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
+ if (!process_shared_preload_libraries_in_progress)
+ ereport(ERROR,
+ (errmsg("cannot load \"%s\" after startup", "test_slru"),
+ errdetail("\"%s\" must be loaded with \"shared_preload_libraries\".",
+ "test_slru")));
/*
* Create the SLRU directory if it does not exist yet, from the root of
* the data directory.
*/
- (void) MakePGDirectory(slru_dir_name);
+ (void) MakePGDirectory(TestSlruDir);
- /*
- * Initialize the SLRU facility. In EXEC_BACKEND builds, the
- * shmem_startup_hook is called in the postmaster and in each backend, but
- * we only need to generate the LWLock tranches once. Note that these
- * tranche ID variables are not used by SimpleLruInit() when
- * IsUnderPostmaster is true.
- */
- if (!IsUnderPostmaster)
- {
- test_tranche_id = LWLockNewTrancheId("test_slru_tranche");
- test_buffer_tranche_id = LWLockNewTrancheId("test_buffer_tranche");
- }
-
- TestSlruCtl->PagePrecedes = test_slru_page_precedes_logically;
- TestSlruCtl->errdetail_for_io_error = test_slru_errdetail_for_io_error;
- SimpleLruInit(TestSlruCtl, "TestSLRU",
- NUM_TEST_BUFFERS, 0, slru_dir_name,
- test_buffer_tranche_id, test_tranche_id, SYNC_HANDLER_NONE,
- long_segment_names);
+ RegisterShmemCallbacks(&test_slru_shmem_callbacks);
}
-void
-_PG_init(void)
+static void
+test_slru_shmem_request(void *arg)
{
- if (!process_shared_preload_libraries_in_progress)
- ereport(ERROR,
- (errmsg("cannot load \"%s\" after startup", "test_slru"),
- errdetail("\"%s\" must be loaded with \"shared_preload_libraries\".",
- "test_slru")));
+ SimpleLruRequest(.desc = &TestSlruDesc,
+ .name = "TestSLRU",
+ .Dir = TestSlruDir,
+
+ /*
+ * Short segment names are well tested elsewhere, so in this test we are
+ * focusing on long names.
+ */
+ .long_segment_names = true,
+
+ .nslots = NUM_TEST_BUFFERS,
+ .nlsns = 0,
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_slru_shmem_request;
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = test_slru_page_precedes_logically,
+ .errdetail_for_io_error = test_slru_errdetail_for_io_error,
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_slru_shmem_startup;
+ /* let slru.c assign these */
+ .buffer_tranche_id = 0,
+ .bank_tranche_id = 0,
+ );
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 63c0b3a9465..3c35097361d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2901,9 +2901,9 @@ SlotInvalidationCauseMap
SlotNumber
SlotSyncCtxStruct
SlotSyncSkipReason
-SlruCtl
-SlruCtlData
+SlruDesc
SlruErrorCause
+SlruOpts
SlruPageStatus
SlruScanCallback
SlruSegState
--
2.34.1
[text/x-patch] v20260405_2-0009-refactor-predicate.c-Move-all-the-initiali.patch (8.3K, 11-v20260405_2-0009-refactor-predicate.c-Move-all-the-initiali.patch)
From e9624d725947538f8e68497b58c62ca5c49cd6b2 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 20 Mar 2026 20:27:50 +0200
Subject: [PATCH v20260405 09/15] refactor predicate.c: Move all the
initialization together
PredicateLockShmemInit() is currently very complicated. These
refactorings move it in a direction that is more natural with the new
shmem callbacks.
---
src/backend/storage/lmgr/predicate.c | 164 +++++++++++++--------------
1 file changed, 79 insertions(+), 85 deletions(-)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 13a6a4b93a6..af03071a71f 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -1144,19 +1144,6 @@ PredicateLockShmemInit(void)
HASH_ELEM | HASH_BLOBS |
HASH_PARTITION | HASH_FIXED_SIZE);
- /*
- * Reserve a dummy entry in the hash table; we use it to make sure there's
- * always one entry available when we need to split or combine a page,
- * because running out of space there could mean aborting a
- * non-serializable transaction.
- */
- if (!IsUnderPostmaster)
- {
- (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
- HASH_ENTER, &found);
- Assert(!found);
- }
-
/* Pre-calculate the hash and partition lock of the scratch entry */
ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
@@ -1200,49 +1187,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, both the header and the element */
- memset(PredXact, 0, requestSize);
-
- dlist_init(&PredXact->availableList);
- dlist_init(&PredXact->activeList);
- PredXact->SxactGlobalXmin = InvalidTransactionId;
- PredXact->SxactGlobalXminCount = 0;
- PredXact->WritableSxactCount = 0;
- PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
- PredXact->CanPartialClearThrough = 0;
- PredXact->HavePartialClearedThrough = 0;
- PredXact->element
- = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_serializable_xacts; i++)
- {
- LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
- LWTRANCHE_PER_XACT_PREDICATE_LIST);
- dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
- }
- PredXact->OldCommittedSxact = CreatePredXact();
- SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
- PredXact->OldCommittedSxact->prepareSeqNo = 0;
- PredXact->OldCommittedSxact->commitSeqNo = 0;
- PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
- dlist_init(&PredXact->OldCommittedSxact->outConflicts);
- dlist_init(&PredXact->OldCommittedSxact->inConflicts);
- dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
- dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
- dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
- PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
- PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
- PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
- PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
- PredXact->OldCommittedSxact->pid = 0;
- PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
- }
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
/*
* Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
@@ -1278,23 +1222,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, including the elements */
- memset(RWConflictPool, 0, requestSize);
-
- dlist_init(&RWConflictPool->availableList);
- RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
- RWConflictPoolHeaderDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_rw_conflicts; i++)
- {
- dlist_push_tail(&RWConflictPool->availableList,
- &RWConflictPool->element[i].outLink);
- }
- }
/*
* Create or attach to the header for the list of finished serializable
@@ -1305,8 +1232,6 @@ PredicateLockShmemInit(void)
sizeof(dlist_head),
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- dlist_init(FinishedSerializableTransactions);
/*
* Initialize the SLRU storage for old committed serializable
@@ -1328,19 +1253,88 @@ PredicateLockShmemInit(void)
*/
serialControl = (SerialControl)
ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
-
Assert(found == IsUnderPostmaster);
- if (!found)
+
+ /*
+ * If we just attached to existing shared memory (EXEC_BACKEND), we're all
+ * done. Otherwise, during postmaster startup proceed to initialize the
+ * shared memory.
+ */
+ if (IsUnderPostmaster)
{
- /*
- * Set control information to reflect empty SLRU.
- */
- LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
- serialControl->headPage = -1;
- serialControl->headXid = InvalidTransactionId;
- serialControl->tailXid = InvalidTransactionId;
- LWLockRelease(SerialControlLock);
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+ return;
+ }
+
+ /*
+ * Reserve a dummy entry in the hash table; we use it to make sure there's
+ * always one entry available when we need to split or combine a page,
+ * because running out of space there could mean aborting a
+ * non-serializable transaction.
+ */
+ (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
+ HASH_ENTER, &found);
+ Assert(!found);
+
+ /* Initialize PredXact list */
+ dlist_init(&PredXact->availableList);
+ dlist_init(&PredXact->activeList);
+ PredXact->SxactGlobalXmin = InvalidTransactionId;
+ PredXact->SxactGlobalXminCount = 0;
+ PredXact->WritableSxactCount = 0;
+ PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
+ PredXact->CanPartialClearThrough = 0;
+ PredXact->HavePartialClearedThrough = 0;
+ PredXact->element
+ = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_serializable_xacts; i++)
+ {
+ LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
+ LWTRANCHE_PER_XACT_PREDICATE_LIST);
+ dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
}
+ PredXact->OldCommittedSxact = CreatePredXact();
+ SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
+ PredXact->OldCommittedSxact->prepareSeqNo = 0;
+ PredXact->OldCommittedSxact->commitSeqNo = 0;
+ PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
+ dlist_init(&PredXact->OldCommittedSxact->outConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->inConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
+ dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
+ dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
+ PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
+ PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
+ PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
+ PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
+ PredXact->OldCommittedSxact->pid = 0;
+ PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
+
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+
+ /* Initialize the rw-conflict pool */
+ dlist_init(&RWConflictPool->availableList);
+ RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
+ RWConflictPoolHeaderDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_rw_conflicts; i++)
+ {
+ dlist_push_tail(&RWConflictPool->availableList,
+ &RWConflictPool->element[i].outLink);
+ }
+
+ /* Initialize the list of finished serializable transactions */
+ dlist_init(FinishedSerializableTransactions);
+
+ /* Initialize SerialControl to reflect empty SLRU. */
+ LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
+ serialControl->headPage = -1;
+ serialControl->headXid = InvalidTransactionId;
+ serialControl->tailXid = InvalidTransactionId;
+ LWLockRelease(SerialControlLock);
}
/*
--
2.34.1
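Note on the shape of the refactoring above: all one-time initialization now sits behind a single early return for the attach case, instead of being spread across several "if (!found)" blocks. A minimal sketch of that shape follows. SketchPool, sketch_local_ptr and sketch_setup() are hypothetical names for illustration only, not PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared structure; stands in for PredXact and friends. */
typedef struct SketchPool
{
    int navailable;     /* elements on the free list */
    int init_count;     /* how many times init actually ran */
} SketchPool;

/* Per-process cached pointer, set on both the init and attach paths. */
static SketchPool *sketch_local_ptr;

static void
sketch_setup(SketchPool *shared, bool attaching, int nelems)
{
    assert(shared != NULL);

    /* Both paths need the process-local pointer. */
    sketch_local_ptr = shared;

    /* Attaching to memory someone else initialized: nothing more to do. */
    if (attaching)
        return;

    /* First-time initialization, all in one place. */
    shared->navailable = nelems;
    shared->init_count++;
}
```

The attach path only touches process-local state, so running it in every backend leaves the shared contents alone.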
[text/x-patch] v20260405_2-0011-Convert-AIO-to-the-new-interface.patch (14.6K, 12-v20260405_2-0011-Convert-AIO-to-the-new-interface.patch)
From d2826e5980e93c6ebd168f881385e47036e4f0b3 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 12:43:16 +0200
Subject: [PATCH v20260405 11/15] Convert AIO to the new interface
This replaces the "shmem_size" and "shmem_init" callbacks in the IO
methods table with the same ShmemCallback struct that we now use in
other subsystems.
---
src/backend/storage/aio/aio_init.c | 112 +++++++++++++---------
src/backend/storage/aio/method_io_uring.c | 39 ++++----
src/backend/storage/aio/method_worker.c | 84 +++++++++-------
src/backend/storage/ipc/ipci.c | 2 -
src/include/storage/aio_internal.h | 16 +---
src/include/storage/aio_subsys.h | 4 -
src/include/storage/subsystemlist.h | 3 +
7 files changed, 143 insertions(+), 117 deletions(-)
diff --git a/src/backend/storage/aio/aio_init.c b/src/backend/storage/aio/aio_init.c
index d3c68d8b04c..18bb4235044 100644
--- a/src/backend/storage/aio/aio_init.c
+++ b/src/backend/storage/aio/aio_init.c
@@ -23,16 +23,24 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
+static void AioShmemRequest(void *arg);
+static void AioShmemInit(void *arg);
+static void AioShmemAttach(void *arg);
-static Size
-AioCtlShmemSize(void)
-{
- /* pgaio_ctl itself */
- return sizeof(PgAioCtl);
-}
+const ShmemCallbacks AioShmemCallbacks = {
+ .request_fn = AioShmemRequest,
+ .init_fn = AioShmemInit,
+ .attach_fn = AioShmemAttach,
+};
+
+static PgAioBackend *AioBackendShmemPtr;
+static PgAioHandle *AioHandleShmemPtr;
+static struct iovec *AioHandleIOVShmemPtr;
+static uint64 *AioHandleDataShmemPtr;
static uint32
AioProcs(void)
@@ -109,12 +117,15 @@ AioChooseMaxConcurrency(void)
return Min(max_proportional_pins, 64);
}
-Size
-AioShmemSize(void)
+/*
+ * Register shared memory area for AIO subsystem.
+ */
+static void
+AioShmemRequest(void *arg)
{
- Size sz = 0;
-
/*
+ * Resolve io_max_concurrency if not already done
+ *
* We prefer to report this value's source as PGC_S_DYNAMIC_DEFAULT.
* However, if the DBA explicitly set io_max_concurrency = -1 in the
* config file, then PGC_S_DYNAMIC_DEFAULT will fail to override that and
@@ -132,48 +143,52 @@ AioShmemSize(void)
PGC_S_OVERRIDE);
}
- sz = add_size(sz, AioCtlShmemSize());
- sz = add_size(sz, AioBackendShmemSize());
- sz = add_size(sz, AioHandleShmemSize());
- sz = add_size(sz, AioHandleIOVShmemSize());
- sz = add_size(sz, AioHandleDataShmemSize());
-
- /* Reserve space for method specific resources. */
- if (pgaio_method_ops->shmem_size)
- sz = add_size(sz, pgaio_method_ops->shmem_size());
-
- return sz;
+ ShmemRequestStruct(.name = "AioCtl",
+ .size = sizeof(PgAioCtl),
+ .ptr = (void **) &pgaio_ctl,
+ );
+
+ ShmemRequestStruct(.name = "AioBackend",
+ .size = AioBackendShmemSize(),
+ .ptr = (void **) &AioBackendShmemPtr,
+ );
+
+ ShmemRequestStruct(.name = "AioHandle",
+ .size = AioHandleShmemSize(),
+ .ptr = (void **) &AioHandleShmemPtr,
+ );
+
+ ShmemRequestStruct(.name = "AioHandleIOV",
+ .size = AioHandleIOVShmemSize(),
+ .ptr = (void **) &AioHandleIOVShmemPtr,
+ );
+
+ ShmemRequestStruct(.name = "AioHandleData",
+ .size = AioHandleDataShmemSize(),
+ .ptr = (void **) &AioHandleDataShmemPtr,
+ );
+
+ if (pgaio_method_ops->shmem_callbacks.request_fn)
+ pgaio_method_ops->shmem_callbacks.request_fn(pgaio_method_ops->shmem_callbacks.request_fn_arg);
}
-void
-AioShmemInit(void)
+/*
+ * Initialize AIO shared memory during postmaster startup.
+ */
+static void
+AioShmemInit(void *arg)
{
- bool found;
uint32 io_handle_off = 0;
uint32 iovec_off = 0;
uint32 per_backend_iovecs = io_max_concurrency * io_max_combine_limit;
- pgaio_ctl = (PgAioCtl *)
- ShmemInitStruct("AioCtl", AioCtlShmemSize(), &found);
-
- if (found)
- goto out;
-
- memset(pgaio_ctl, 0, AioCtlShmemSize());
-
pgaio_ctl->io_handle_count = AioProcs() * io_max_concurrency;
pgaio_ctl->iovec_count = AioProcs() * per_backend_iovecs;
- pgaio_ctl->backend_state = (PgAioBackend *)
- ShmemInitStruct("AioBackend", AioBackendShmemSize(), &found);
-
- pgaio_ctl->io_handles = (PgAioHandle *)
- ShmemInitStruct("AioHandle", AioHandleShmemSize(), &found);
-
- pgaio_ctl->iovecs = (struct iovec *)
- ShmemInitStruct("AioHandleIOV", AioHandleIOVShmemSize(), &found);
- pgaio_ctl->handle_data = (uint64 *)
- ShmemInitStruct("AioHandleData", AioHandleDataShmemSize(), &found);
+ pgaio_ctl->backend_state = AioBackendShmemPtr;
+ pgaio_ctl->io_handles = AioHandleShmemPtr;
+ pgaio_ctl->iovecs = AioHandleIOVShmemPtr;
+ pgaio_ctl->handle_data = AioHandleDataShmemPtr;
for (int procno = 0; procno < AioProcs(); procno++)
{
@@ -208,10 +223,15 @@ AioShmemInit(void)
}
}
-out:
- /* Initialize IO method specific resources. */
- if (pgaio_method_ops->shmem_init)
- pgaio_method_ops->shmem_init(!found);
+ if (pgaio_method_ops->shmem_callbacks.init_fn)
+ pgaio_method_ops->shmem_callbacks.init_fn(pgaio_method_ops->shmem_callbacks.init_fn_arg);
+}
+
+static void
+AioShmemAttach(void *arg)
+{
+ if (pgaio_method_ops->shmem_callbacks.attach_fn)
+ pgaio_method_ops->shmem_callbacks.attach_fn(pgaio_method_ops->shmem_callbacks.attach_fn_arg);
}
void
diff --git a/src/backend/storage/aio/method_io_uring.c b/src/backend/storage/aio/method_io_uring.c
index 9f76d2683c0..3295c59ed75 100644
--- a/src/backend/storage/aio/method_io_uring.c
+++ b/src/backend/storage/aio/method_io_uring.c
@@ -49,8 +49,8 @@
/* Entry points for IoMethodOps. */
-static size_t pgaio_uring_shmem_size(void);
-static void pgaio_uring_shmem_init(bool first_time);
+static void pgaio_uring_shmem_request(void *arg);
+static void pgaio_uring_shmem_init(void *arg);
static void pgaio_uring_init_backend(void);
static int pgaio_uring_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
static void pgaio_uring_wait_one(PgAioHandle *ioh, uint64 ref_generation);
@@ -59,7 +59,6 @@ static void pgaio_uring_check_one(PgAioHandle *ioh, uint64 ref_generation);
/* helper functions */
static void pgaio_uring_sq_from_io(PgAioHandle *ioh, struct io_uring_sqe *sqe);
-
const IoMethodOps pgaio_uring_ops = {
/*
* While io_uring mostly is OK with FDs getting closed while the IO is in
@@ -70,8 +69,8 @@ const IoMethodOps pgaio_uring_ops = {
*/
.wait_on_fd_before_close = true,
- .shmem_size = pgaio_uring_shmem_size,
- .shmem_init = pgaio_uring_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_uring_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_uring_shmem_init,
.init_backend = pgaio_uring_init_backend,
.submit = pgaio_uring_submit,
@@ -267,23 +266,31 @@ pgaio_uring_shmem_size(void)
{
size_t sz;
+ sz = pgaio_uring_context_shmem_size();
+ sz = add_size(sz, pgaio_uring_ring_shmem_size());
+
+ return sz;
+}
+
+static void
+pgaio_uring_shmem_request(void *arg)
+{
/*
* Kernel and liburing support for various features influences how much
* shmem we need, perform the necessary checks.
*/
pgaio_uring_check_capabilities();
- sz = pgaio_uring_context_shmem_size();
- sz = add_size(sz, pgaio_uring_ring_shmem_size());
-
- return sz;
+ ShmemRequestStruct(.name = "AioUringContext",
+ .size = pgaio_uring_shmem_size(),
+ .ptr = (void **) &pgaio_uring_contexts,
+ );
}
static void
-pgaio_uring_shmem_init(bool first_time)
+pgaio_uring_shmem_init(void *arg)
{
int TotalProcs = pgaio_uring_procs();
- bool found;
char *shmem;
size_t ring_mem_remain = 0;
char *ring_mem_next = 0;
@@ -291,13 +298,11 @@ pgaio_uring_shmem_init(bool first_time)
/*
* We allocate memory for all PgAioUringContext instances and, if
* supported, the memory required for each of the io_uring instances, in
- * one ShmemInitStruct().
+ * one combined allocation.
+ *
+ * pgaio_uring_contexts is already set to the base of the allocation.
*/
- shmem = ShmemInitStruct("AioUringContext", pgaio_uring_shmem_size(), &found);
- if (found)
- return;
-
- pgaio_uring_contexts = (PgAioUringContext *) shmem;
+ shmem = (char *) pgaio_uring_contexts;
shmem += pgaio_uring_context_shmem_size();
/* if supported, handle memory alignment / sizing for io_uring memory */
diff --git a/src/backend/storage/aio/method_worker.c b/src/backend/storage/aio/method_worker.c
index e24357a7a0a..eb636bf5ad9 100644
--- a/src/backend/storage/aio/method_worker.c
+++ b/src/backend/storage/aio/method_worker.c
@@ -41,6 +41,7 @@
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
#include "tcop/tcopprot.h"
#include "utils/injection_point.h"
#include "utils/memdebug.h"
@@ -73,16 +74,20 @@ typedef struct PgAioWorkerControl
} PgAioWorkerControl;
-static size_t pgaio_worker_shmem_size(void);
-static void pgaio_worker_shmem_init(bool first_time);
+static void pgaio_worker_shmem_request(void *arg);
+static void pgaio_worker_shmem_init(void *arg);
+static void pgaio_worker_shmem_attach(void *arg);
+
+static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static bool pgaio_worker_needs_synchronous_execution(PgAioHandle *ioh);
static int pgaio_worker_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
const IoMethodOps pgaio_worker_ops = {
- .shmem_size = pgaio_worker_shmem_size,
- .shmem_init = pgaio_worker_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_worker_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_worker_shmem_init,
+ .shmem_callbacks.attach_fn = pgaio_worker_shmem_attach,
.needs_synchronous_execution = pgaio_worker_needs_synchronous_execution,
.submit = pgaio_worker_submit,
@@ -95,7 +100,6 @@ int io_workers = 3;
static int io_worker_queue_size = 64;
static int MyIoWorkerId;
-static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static PgAioWorkerControl *io_worker_control;
@@ -116,50 +120,60 @@ pgaio_worker_control_shmem_size(void)
sizeof(PgAioWorkerSlot) * MAX_IO_WORKERS;
}
-static size_t
-pgaio_worker_shmem_size(void)
+/*
+ * Set secondary AIO worker pointer from the combined allocation.
+ */
+static void
+pgaio_worker_set_secondary_ptr(void)
{
- size_t sz;
int queue_size;
+ Size queue_sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = add_size(sz, pgaio_worker_control_shmem_size());
-
- return sz;
+ io_worker_control = (PgAioWorkerControl *)
+ ((char *) io_worker_submission_queue + MAXALIGN(queue_sz));
}
static void
-pgaio_worker_shmem_init(bool first_time)
+pgaio_worker_shmem_init(void *arg)
{
- bool found;
int queue_size;
- io_worker_submission_queue =
- ShmemInitStruct("AioWorkerSubmissionQueue",
- pgaio_worker_queue_shmem_size(&queue_size),
- &found);
- if (!found)
- {
- io_worker_submission_queue->size = queue_size;
- io_worker_submission_queue->head = 0;
- io_worker_submission_queue->tail = 0;
- }
+ pgaio_worker_queue_shmem_size(&queue_size);
+ io_worker_submission_queue->size = queue_size;
+ io_worker_submission_queue->head = 0;
+ io_worker_submission_queue->tail = 0;
- io_worker_control =
- ShmemInitStruct("AioWorkerControl",
- pgaio_worker_control_shmem_size(),
- &found);
- if (!found)
+ pgaio_worker_set_secondary_ptr();
+
+ io_worker_control->idle_worker_mask = 0;
+ for (int i = 0; i < MAX_IO_WORKERS; ++i)
{
- io_worker_control->idle_worker_mask = 0;
- for (int i = 0; i < MAX_IO_WORKERS; ++i)
- {
- io_worker_control->workers[i].latch = NULL;
- io_worker_control->workers[i].in_use = false;
- }
+ io_worker_control->workers[i].latch = NULL;
+ io_worker_control->workers[i].in_use = false;
}
}
+static void
+pgaio_worker_shmem_attach(void *arg)
+{
+ pgaio_worker_set_secondary_ptr();
+}
+
+static void
+pgaio_worker_shmem_request(void *arg)
+{
+ size_t size;
+ int queue_size;
+
+ size = MAXALIGN(pgaio_worker_queue_shmem_size(&queue_size)) +
+ pgaio_worker_control_shmem_size();
+
+ ShmemRequestStruct(.name = "AioWorkerSubmissionQueue",
+ .size = size,
+ .ptr = (void **) &io_worker_submission_queue,
+ );
+}
+
static int
pgaio_worker_choose_idle(void)
{
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 7a8c69de802..a510c928daa 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -122,7 +122,6 @@ CalculateShmemSize(void)
size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, InjectionPointShmemSize());
size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, AioShmemSize());
size = add_size(size, WaitLSNShmemSize());
size = add_size(size, LogicalDecodingCtlShmemSize());
size = add_size(size, DataChecksumsShmemSize());
@@ -301,7 +300,6 @@ CreateOrAttachShmemStructs(void)
StatsShmemInit();
WaitEventCustomShmemInit();
InjectionPointShmemInit();
- AioShmemInit();
WaitLSNShmemInit();
LogicalDecodingCtlShmemInit();
}
diff --git a/src/include/storage/aio_internal.h b/src/include/storage/aio_internal.h
index 33e1e2dc048..9ca4087aa7f 100644
--- a/src/include/storage/aio_internal.h
+++ b/src/include/storage/aio_internal.h
@@ -20,6 +20,8 @@
#include "port/pg_iovec.h"
#include "storage/aio.h"
#include "storage/condition_variable.h"
+#include "storage/ipc.h"
+#include "storage/shmem.h"
/*
@@ -267,20 +269,8 @@ typedef struct IoMethodOps
*/
bool wait_on_fd_before_close;
-
/* global initialization */
-
- /*
- * Amount of additional shared memory to reserve for the io_method. Called
- * just like a normal ipci.c style *Size() function. Optional.
- */
- size_t (*shmem_size) (void);
-
- /*
- * Initialize shared memory. First time is true if AIO's shared memory was
- * just initialized, false otherwise. Optional.
- */
- void (*shmem_init) (bool first_time);
+ ShmemCallbacks shmem_callbacks;
/*
* Per-backend initialization. Optional.
diff --git a/src/include/storage/aio_subsys.h b/src/include/storage/aio_subsys.h
index 276cb3e31c4..dd54869351f 100644
--- a/src/include/storage/aio_subsys.h
+++ b/src/include/storage/aio_subsys.h
@@ -20,12 +20,8 @@
/* aio_init.c */
-extern Size AioShmemSize(void);
-extern void AioShmemInit(void);
-
extern void pgaio_init_backend(void);
-
/* aio.c */
extern void pgaio_error_cleanup(void);
extern void AtEOXact_Aio(bool is_commit);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index c199f18a27a..b438794d46d 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -53,3 +53,6 @@ PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
/* other modules that need some shared memory space */
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
+
+/* AIO subsystem. This delegates to the method-specific callbacks */
+PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
--
2.34.1
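Note on method_worker.c above: the submission queue and the worker control struct now share one allocation, and the second pointer is derived from the first at a MAXALIGN'd offset in both the init and attach callbacks. A rough sketch of that layout arithmetic follows. SketchQueue, SketchControl, SKETCH_MAXALIGN and the 8-byte maximum alignment are assumptions made for illustration, not the real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 8-byte maximum alignment; PostgreSQL's MAXALIGN is per-platform. */
#define SKETCH_MAXALIGN(len) (((size_t) (len) + 7) & ~(size_t) 7)

typedef struct SketchQueue
{
    uint32_t size;
    uint32_t head;
    uint32_t tail;
    int      slots[];   /* flexible array member */
} SketchQueue;

typedef struct SketchControl
{
    uint64_t idle_mask;
} SketchControl;

/* Size of the queue part alone. */
static size_t
sketch_queue_size(int nslots)
{
    return offsetof(SketchQueue, slots) + sizeof(int) * (size_t) nslots;
}

/* Size to request for the combined allocation. */
static size_t
sketch_combined_size(int nslots)
{
    return SKETCH_MAXALIGN(sketch_queue_size(nslots)) + sizeof(SketchControl);
}

/* Derive the secondary pointer; must use the same arithmetic as the request. */
static SketchControl *
sketch_control_ptr(SketchQueue *queue, int nslots)
{
    char *p = (char *) queue + SKETCH_MAXALIGN(sketch_queue_size(nslots));

    assert(((uintptr_t) p % 8) == 0);
    return (SketchControl *) p;
}
```

The request size and the pointer derivation have to agree on the padding, which is presumably why the patch applies MAXALIGN to the queue size in both pgaio_worker_shmem_request() and pgaio_worker_set_secondary_ptr().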
[text/x-patch] v20260405_2-0012-Add-option-for-aligning-shmem-allocations.patch (4.0K, 13-v20260405_2-0012-Add-option-for-aligning-shmem-allocations.patch)
From 3b815d71e898dff859acf148a7f57c4200ff66d3 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 23:44:15 +0200
Subject: [PATCH v20260405 12/15] Add option for aligning shmem allocations
The buffer blocks (in the next commit) are IO-aligned. This might come
in handy in other places too, so make it an explicit feature of
ShmemRequestStruct.
---
src/backend/storage/ipc/shmem.c | 26 ++++++++++++++++----------
src/include/storage/shmem.h | 6 ++++++
2 files changed, 22 insertions(+), 10 deletions(-)
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index bc186d6ea17..973811e545e 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -239,7 +239,7 @@ typedef struct ShmemAllocatorData
#define ShmemIndexLock (&ShmemAllocator->index_lock)
-static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void *ShmemAllocRaw(Size size, Size alignment, Size *allocated_size);
/* shared memory global variables */
@@ -400,7 +400,8 @@ ShmemGetRequestedSize(void)
{
size = add_size(size, request->options->size);
/* calculate alignment padding like ShmemAllocRaw() does */
- size = CACHELINEALIGN(size);
+ size = TYPEALIGN(Max(request->options->alignment, PG_CACHE_LINE_SIZE),
+ size);
}
return size;
@@ -525,7 +526,9 @@ InitShmemIndexEntry(ShmemRequest *request)
* We inserted the entry to the shared memory index. Allocate requested
* amount of shared memory for it, and initialize the index entry.
*/
- structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ structPtr = ShmemAllocRaw(request->options->size,
+ request->options->alignment,
+ &allocated_size);
if (structPtr == NULL)
{
/* out of memory; remove the failed ShmemIndex entry */
@@ -754,7 +757,7 @@ ShmemAlloc(Size size)
void *newSpace;
Size allocated_size;
- newSpace = ShmemAllocRaw(size, &allocated_size);
+ newSpace = ShmemAllocRaw(size, 0, &allocated_size);
if (!newSpace)
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
@@ -773,7 +776,7 @@ ShmemAllocNoError(Size size)
{
Size allocated_size;
- return ShmemAllocRaw(size, &allocated_size);
+ return ShmemAllocRaw(size, 0, &allocated_size);
}
/*
@@ -783,8 +786,9 @@ ShmemAllocNoError(Size size)
* be equal to the number requested plus any padding we choose to add.
*/
static void *
-ShmemAllocRaw(Size size, Size *allocated_size)
+ShmemAllocRaw(Size size, Size alignment, Size *allocated_size)
{
+ Size rawStart;
Size newStart;
Size newFree;
void *newSpace;
@@ -800,14 +804,15 @@ ShmemAllocRaw(Size size, Size *allocated_size)
* structures out to a power-of-two size - but without this, even that
* won't be sufficient.
*/
- size = CACHELINEALIGN(size);
- *allocated_size = size;
+ if (alignment < PG_CACHE_LINE_SIZE)
+ alignment = PG_CACHE_LINE_SIZE;
Assert(ShmemSegHdr != NULL);
SpinLockAcquire(&ShmemAllocator->shmem_lock);
- newStart = ShmemAllocator->free_offset;
+ rawStart = ShmemAllocator->free_offset;
+ newStart = TYPEALIGN(alignment, rawStart);
newFree = newStart + size;
if (newFree <= ShmemSegHdr->totalsize)
@@ -821,8 +826,9 @@ ShmemAllocRaw(Size size, Size *allocated_size)
SpinLockRelease(&ShmemAllocator->shmem_lock);
/* note this assert is okay with newSpace == NULL */
- Assert(newSpace == (void *) CACHELINEALIGN(newSpace));
+ Assert(newSpace == (void *) TYPEALIGN(alignment, newSpace));
+ *allocated_size = newFree - rawStart;
return newSpace;
}
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 147a6915f7e..91218db6d6e 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -51,6 +51,12 @@ typedef struct ShmemStructOpts
*/
ssize_t size;
+ /*
+ * Alignment of the starting address. If not set, defaults to cacheline
+ * boundary. Must be a power of two.
+ */
+ size_t alignment;
+
/*
* When the shmem area is initialized or attached to, pointer to it is
* stored in *ptr. It usually points to a global variable, used to access
--
2.34.1
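Note on the ShmemAllocRaw() change above: the start offset is rounded up to the requested alignment (at least a cache line), and the reported allocated size includes the padding that was skipped. The same arithmetic as a standalone sketch follows. SketchArena, sketch_alloc(), SKETCH_ALIGN_UP and the 128-byte cache line are hypothetical stand-ins. Unlike the patch, the sketch reports 0 bytes on failure and works on offsets only, so it assumes the segment base itself is suitably aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration; PG_CACHE_LINE_SIZE is set by the build. */
#define SKETCH_CACHE_LINE_SIZE 128

/* Round val up to the next multiple of align (align must be a power of two). */
#define SKETCH_ALIGN_UP(align, val) \
    (((size_t) (val) + ((size_t) (align) - 1)) & ~((size_t) (align) - 1))

typedef struct SketchArena
{
    size_t free_offset;     /* first unused byte */
    size_t total_size;      /* size of the whole segment */
} SketchArena;

/*
 * Reserve size bytes at an offset aligned to alignment.  Returns the aligned
 * offset, or (size_t) -1 if the arena is full.  *allocated_size counts the
 * padding skipped before the aligned start plus the requested size.
 */
static size_t
sketch_alloc(SketchArena *arena, size_t size, size_t alignment,
             size_t *allocated_size)
{
    size_t raw_start;
    size_t new_start;
    size_t new_free;

    if (alignment < SKETCH_CACHE_LINE_SIZE)
        alignment = SKETCH_CACHE_LINE_SIZE;
    assert((alignment & (alignment - 1)) == 0);

    raw_start = arena->free_offset;
    new_start = SKETCH_ALIGN_UP(alignment, raw_start);
    new_free = new_start + size;

    if (new_free > arena->total_size)
    {
        *allocated_size = 0;
        return (size_t) -1;
    }

    arena->free_offset = new_free;
    *allocated_size = new_free - raw_start;
    return new_start;
}
```

Only the start is aligned here; the size itself is no longer rounded up, which matches the patch dropping the "size = CACHELINEALIGN(size)" step.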
[text/x-patch] v20260405_2-0013-Convert-buffer-manager-to-the-new-API.patch (15.6K, 14-v20260405_2-0013-Convert-buffer-manager-to-the-new-API.patch)
From 0d52ad50c8c5d96d87c7e9ee37e9e4122b6196eb Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Thu, 2 Apr 2026 00:44:02 +0300
Subject: [PATCH v20260405 13/15] Convert buffer manager to the new API
---
src/backend/storage/buffer/buf_init.c | 149 ++++++++++---------------
src/backend/storage/buffer/buf_table.c | 54 +++++----
src/backend/storage/buffer/freelist.c | 93 +++++----------
src/backend/storage/ipc/ipci.c | 3 -
src/include/storage/buf_internals.h | 5 -
src/include/storage/bufmgr.h | 4 -
src/include/storage/subsystemlist.h | 3 +
7 files changed, 124 insertions(+), 187 deletions(-)
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index c0c223b2e32..1407c930c56 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -18,6 +18,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proclist.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
BufferDescPadded *BufferDescriptors;
char *BufferBlocks;
@@ -25,6 +27,15 @@ ConditionVariableMinimallyPadded *BufferIOCVArray;
WritebackContext BackendWritebackContext;
CkptSortItem *CkptBufferIds;
+static void BufferManagerShmemRequest(void *arg);
+static void BufferManagerShmemInit(void *arg);
+static void BufferManagerShmemAttach(void *arg);
+
+const ShmemCallbacks BufferManagerShmemCallbacks = {
+ .request_fn = BufferManagerShmemRequest,
+ .init_fn = BufferManagerShmemInit,
+ .attach_fn = BufferManagerShmemAttach,
+};
/*
* Data Structures:
@@ -60,37 +71,31 @@ CkptSortItem *CkptBufferIds;
/*
- * Initialize shared buffer pool
- *
- * This is called once during shared-memory initialization (either in the
- * postmaster, or in a standalone backend).
+ * Register shared memory area for the buffer pool.
*/
-void
-BufferManagerShmemInit(void)
+static void
+BufferManagerShmemRequest(void *arg)
{
- bool foundBufs,
- foundDescs,
- foundIOCV,
- foundBufCkpt;
-
+ ShmemRequestStruct(.name = "Buffer Descriptors",
+ .size = NBuffers * sizeof(BufferDescPadded),
/* Align descriptors to a cacheline boundary. */
- BufferDescriptors = (BufferDescPadded *)
- ShmemInitStruct("Buffer Descriptors",
- NBuffers * sizeof(BufferDescPadded),
- &foundDescs);
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferDescriptors,
+ );
+ ShmemRequestStruct(.name = "Buffer Blocks",
+ .size = NBuffers * (Size) BLCKSZ,
/* Align buffer pool on IO page size boundary. */
- BufferBlocks = (char *)
- TYPEALIGN(PG_IO_ALIGN_SIZE,
- ShmemInitStruct("Buffer Blocks",
- NBuffers * (Size) BLCKSZ + PG_IO_ALIGN_SIZE,
- &foundBufs));
-
- /* Align condition variables to cacheline boundary. */
- BufferIOCVArray = (ConditionVariableMinimallyPadded *)
- ShmemInitStruct("Buffer IO Condition Variables",
- NBuffers * sizeof(ConditionVariableMinimallyPadded),
- &foundIOCV);
+ .alignment = PG_IO_ALIGN_SIZE,
+ .ptr = (void **) &BufferBlocks,
+ );
+
+ ShmemRequestStruct(.name = "Buffer IO Condition Variables",
+ .size = NBuffers * sizeof(ConditionVariableMinimallyPadded),
+ /* Align condition variables to a cacheline boundary. */
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferIOCVArray,
+ );
/*
* The array used to sort to-be-checkpointed buffer ids is located in
@@ -99,80 +104,50 @@ BufferManagerShmemInit(void)
* the checkpointer is restarted, memory allocation failures would be
* painful.
*/
- CkptBufferIds = (CkptSortItem *)
- ShmemInitStruct("Checkpoint BufferIds",
- NBuffers * sizeof(CkptSortItem), &foundBufCkpt);
+ ShmemRequestStruct(.name = "Checkpoint BufferIds",
+ .size = NBuffers * sizeof(CkptSortItem),
+ .ptr = (void **) &CkptBufferIds,
+ );
+}
- if (foundDescs || foundBufs || foundIOCV || foundBufCkpt)
- {
- /* should find all of these, or none of them */
- Assert(foundDescs && foundBufs && foundIOCV && foundBufCkpt);
- /* note: this path is only taken in EXEC_BACKEND case */
- }
- else
+/*
+ * Initialize shared buffer pool
+ *
+ * This is called once during shared-memory initialization (either in the
+ * postmaster, or in a standalone backend).
+ */
+static void
+BufferManagerShmemInit(void *arg)
+{
+ /*
+ * Initialize all the buffer headers.
+ */
+ for (int i = 0; i < NBuffers; i++)
{
- int i;
+ BufferDesc *buf = GetBufferDescriptor(i);
- /*
- * Initialize all the buffer headers.
- */
- for (i = 0; i < NBuffers; i++)
- {
- BufferDesc *buf = GetBufferDescriptor(i);
+ ClearBufferTag(&buf->tag);
- ClearBufferTag(&buf->tag);
+ pg_atomic_init_u64(&buf->state, 0);
+ buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
- pg_atomic_init_u64(&buf->state, 0);
- buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
+ buf->buf_id = i;
- buf->buf_id = i;
+ pgaio_wref_clear(&buf->io_wref);
- pgaio_wref_clear(&buf->io_wref);
-
- proclist_init(&buf->lock_waiters);
- ConditionVariableInit(BufferDescriptorGetIOCV(buf));
- }
+ proclist_init(&buf->lock_waiters);
+ ConditionVariableInit(BufferDescriptorGetIOCV(buf));
}
- /* Init other shared buffer-management stuff */
- StrategyInitialize(!foundDescs);
-
/* Initialize per-backend file flush context */
WritebackContextInit(&BackendWritebackContext,
&backend_flush_after);
}
-/*
- * BufferManagerShmemSize
- *
- * compute the size of shared memory for the buffer pool including
- * data pages, buffer descriptors, hash tables, etc.
- */
-Size
-BufferManagerShmemSize(void)
+static void
+BufferManagerShmemAttach(void *arg)
{
- Size size = 0;
-
- /* size of buffer descriptors */
- size = add_size(size, mul_size(NBuffers, sizeof(BufferDescPadded)));
- /* to allow aligning buffer descriptors */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of data pages, plus alignment padding */
- size = add_size(size, PG_IO_ALIGN_SIZE);
- size = add_size(size, mul_size(NBuffers, BLCKSZ));
-
- /* size of stuff controlled by freelist.c */
- size = add_size(size, StrategyShmemSize());
-
- /* size of I/O condition variables */
- size = add_size(size, mul_size(NBuffers,
- sizeof(ConditionVariableMinimallyPadded)));
- /* to allow aligning the above */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of checkpoint sort array in bufmgr.c */
- size = add_size(size, mul_size(NBuffers, sizeof(CkptSortItem)));
-
- return size;
+ /* Initialize per-backend file flush context */
+ WritebackContextInit(&BackendWritebackContext,
+ &backend_flush_after);
}
diff --git a/src/backend/storage/buffer/buf_table.c b/src/backend/storage/buffer/buf_table.c
index d04ef74b850..347bf267d73 100644
--- a/src/backend/storage/buffer/buf_table.c
+++ b/src/backend/storage/buffer/buf_table.c
@@ -22,6 +22,7 @@
#include "postgres.h"
#include "storage/buf_internals.h"
+#include "storage/subsystems.h"
/* entry for buffer lookup hashtable */
typedef struct
@@ -32,37 +33,42 @@ typedef struct
static HTAB *SharedBufHash;
+static void BufTableShmemRequest(void *arg);
-/*
- * Estimate space needed for mapping hashtable
- * size is the desired hash table size (possibly more than NBuffers)
- */
-Size
-BufTableShmemSize(int size)
-{
- return hash_estimate_size(size, sizeof(BufferLookupEnt));
-}
+const ShmemCallbacks BufTableShmemCallbacks = {
+ .request_fn = BufTableShmemRequest,
+ /* no special initialization needed, the hash table will start empty */
+};
/*
- * Initialize shmem hash table for mapping buffers
+ * Register shmem hash table for mapping buffers.
* size is the desired hash table size (possibly more than NBuffers)
*/
void
-InitBufTable(int size)
+BufTableShmemRequest(void *arg)
{
- HASHCTL info;
-
- /* assume no locking is needed yet */
-
- /* BufferTag maps to Buffer */
- info.keysize = sizeof(BufferTag);
- info.entrysize = sizeof(BufferLookupEnt);
- info.num_partitions = NUM_BUFFER_PARTITIONS;
-
- SharedBufHash = ShmemInitHash("Shared Buffer Lookup Table",
- size,
- &info,
- HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE);
+ int size;
+
+ /*
+ * Request the shared buffer lookup hashtable.
+ *
+ * Since we can't tolerate running out of lookup table entries, we must be
+ * sure to specify an adequate table size here. The maximum steady-state
+ * usage is of course NBuffers entries, but BufferAlloc() tries to insert
+ * a new entry before deleting the old. In principle this could be
+ * happening in each partition concurrently, so we could need as many as
+ * NBuffers + NUM_BUFFER_PARTITIONS entries.
+ */
+ size = NBuffers + NUM_BUFFER_PARTITIONS;
+
+ ShmemRequestHash(.name = "Shared Buffer Lookup Table",
+ .nelems = size,
+ .ptr = &SharedBufHash,
+ .hash_info.keysize = sizeof(BufferTag),
+ .hash_info.entrysize = sizeof(BufferLookupEnt),
+ .hash_info.num_partitions = NUM_BUFFER_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
}
/*
diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index b7687836188..fdb5bad7910 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -20,6 +20,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#define INT_ACCESS_ONCE(var) ((int)(*((volatile int *)&(var))))
@@ -56,6 +58,14 @@ typedef struct
/* Pointers to shared state */
static BufferStrategyControl *StrategyControl = NULL;
+static void StrategyCtlShmemRequest(void *arg);
+static void StrategyCtlShmemInit(void *arg);
+
+const ShmemCallbacks StrategyCtlShmemCallbacks = {
+ .request_fn = StrategyCtlShmemRequest,
+ .init_fn = StrategyCtlShmemInit,
+};
+
/*
* Private (non-shared) state for managing a ring of shared buffers to re-use.
* This is currently the only kind of BufferAccessStrategy object, but someday
@@ -369,80 +379,35 @@ StrategyNotifyBgWriter(int bgwprocno)
/*
- * StrategyShmemSize
- *
- * estimate the size of shared memory used by the freelist-related structures.
- *
- * Note: for somewhat historical reasons, the buffer lookup hashtable size
- * is also determined here.
+ * StrategyCtlShmemRequest -- request shared memory for the buffer
+ * cache replacement strategy.
*/
-Size
-StrategyShmemSize(void)
+static void
+StrategyCtlShmemRequest(void *arg)
{
- Size size = 0;
-
- /* size of lookup hash table ... see comment in StrategyInitialize */
- size = add_size(size, BufTableShmemSize(NBuffers + NUM_BUFFER_PARTITIONS));
-
- /* size of the shared replacement strategy control block */
- size = add_size(size, MAXALIGN(sizeof(BufferStrategyControl)));
-
- return size;
+ ShmemRequestStruct(.name = "Buffer Strategy Status",
+ .size = sizeof(BufferStrategyControl),
+ .ptr = (void **) &StrategyControl
+ );
}
/*
- * StrategyInitialize -- initialize the buffer cache replacement
- * strategy.
- *
- * Assumes: All of the buffers are already built into a linked list.
- * Only called by postmaster and only during initialization.
+ * StrategyCtlShmemInit -- initialize the buffer cache replacement strategy.
*/
-void
-StrategyInitialize(bool init)
+static void
+StrategyCtlShmemInit(void *arg)
{
- bool found;
+ SpinLockInit(&StrategyControl->buffer_strategy_lock);
- /*
- * Initialize the shared buffer lookup hashtable.
- *
- * Since we can't tolerate running out of lookup table entries, we must be
- * sure to specify an adequate table size here. The maximum steady-state
- * usage is of course NBuffers entries, but BufferAlloc() tries to insert
- * a new entry before deleting the old. In principle this could be
- * happening in each partition concurrently, so we could need as many as
- * NBuffers + NUM_BUFFER_PARTITIONS entries.
- */
- InitBufTable(NBuffers + NUM_BUFFER_PARTITIONS);
-
- /*
- * Get or create the shared strategy control block
- */
- StrategyControl = (BufferStrategyControl *)
- ShmemInitStruct("Buffer Strategy Status",
- sizeof(BufferStrategyControl),
- &found);
-
- if (!found)
- {
- /*
- * Only done once, usually in postmaster
- */
- Assert(init);
-
- SpinLockInit(&StrategyControl->buffer_strategy_lock);
+ /* Initialize the clock-sweep pointer */
+ pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
- /* Initialize the clock-sweep pointer */
- pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
+ /* Clear statistics */
+ StrategyControl->completePasses = 0;
+ pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
- /* Clear statistics */
- StrategyControl->completePasses = 0;
- pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
-
- /* No pending notification */
- StrategyControl->bgwprocno = -1;
- }
- else
- Assert(!init);
+ /* No pending notification */
+ StrategyControl->bgwprocno = -1;
}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index a510c928daa..f64c1d59fa3 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -39,7 +39,6 @@
#include "replication/walreceiver.h"
#include "replication/walsender.h"
#include "storage/aio_subsys.h"
-#include "storage/bufmgr.h"
#include "storage/dsm.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
@@ -99,7 +98,6 @@ CalculateShmemSize(void)
size = add_size(size, ShmemGetRequestedSize());
/* legacy subsystems */
- size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
size = add_size(size, XLOGShmemSize());
@@ -263,7 +261,6 @@ CreateOrAttachShmemStructs(void)
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
- BufferManagerShmemInit();
/*
* Set up lock manager
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index ad1b7b2216a..89615a254a3 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -587,12 +587,7 @@ extern bool StrategyRejectBuffer(BufferAccessStrategy strategy,
extern int StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc);
extern void StrategyNotifyBgWriter(int bgwprocno);
-extern Size StrategyShmemSize(void);
-extern void StrategyInitialize(bool init);
-
/* buf_table.c */
-extern Size BufTableShmemSize(int size);
-extern void InitBufTable(int size);
extern uint32 BufTableHashCode(BufferTag *tagPtr);
extern int BufTableLookup(BufferTag *tagPtr, uint32 hashcode);
extern int BufTableInsert(BufferTag *tagPtr, uint32 hashcode, int buf_id);
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index aa61a39d9e6..6837b35fc6d 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -371,10 +371,6 @@ extern void MarkDirtyAllUnpinnedBuffers(int32 *buffers_dirtied,
int32 *buffers_already_dirty,
int32 *buffers_skipped);
-/* in buf_init.c */
-extern void BufferManagerShmemInit(void);
-extern Size BufferManagerShmemSize(void);
-
/* in localbuf.c */
extern void AtProcExit_LocalBuffers(void);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index b438794d46d..d8e11756a61 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -36,6 +36,9 @@ PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BufferManagerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StrategyCtlShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BufTableShmemCallbacks)
/* predicate lock manager */
PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
--
2.34.1
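
Note on the call syntax used in the patch above: calls such as ShmemRequestStruct(.name = ..., .size = ..., .ptr = ...) look like named arguments, and the request callback only records what is wanted while the pointer is filled in later. The definition of that macro is not shown in this part of the thread, so the following is only a minimal sketch of how such an interface could be built in C99, assuming a variadic macro that wraps a compound literal. Every Demo* name below is hypothetical and is not PostgreSQL's API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct; field names mirror the patch's call sites. */
typedef struct DemoStructOpts
{
	const char *name;
	size_t		size;
	size_t		alignment;	/* 0 means no special alignment */
	void	   *ptr;		/* address of the pointer variable to fill in */
} DemoStructOpts;

#define DEMO_MAX_REQUESTS 16

static DemoStructOpts demo_requests[DEMO_MAX_REQUESTS];
static int	demo_nrequests = 0;

/* Record a request; nothing is allocated at this point. */
static void
DemoRequestStructWithOpts(const DemoStructOpts *opts)
{
	assert(demo_nrequests < DEMO_MAX_REQUESTS);
	demo_requests[demo_nrequests++] = *opts;
}

/*
 * Named-argument call syntax: the arguments become designated initializers
 * of a compound literal.  Unnamed fields are zeroed, and a trailing comma
 * in the call is harmless.
 */
#define DemoRequestStruct(...) \
	DemoRequestStructWithOpts(&(DemoStructOpts){__VA_ARGS__})

/* Round an offset up to a power-of-two alignment (0 leaves it unchanged). */
static size_t
demo_align(size_t off, size_t alignment)
{
	if (alignment == 0)
		return off;
	assert((alignment & (alignment - 1)) == 0);
	return (off + alignment - 1) & ~(alignment - 1);
}

/* Total space needed for everything requested so far. */
static size_t
DemoRequestedSize(void)
{
	size_t		total = 0;

	for (int i = 0; i < demo_nrequests; i++)
		total = demo_align(total, demo_requests[i].alignment) +
			demo_requests[i].size;
	return total;
}

/*
 * Carve each requested area out of one block and publish its address.
 * Offsets are aligned relative to 'base', so 'base' itself must be
 * suitably aligned for the pointers to be aligned in absolute terms.
 */
static void
DemoAssignPointers(char *base)
{
	size_t		off = 0;

	for (int i = 0; i < demo_nrequests; i++)
	{
		void	   *addr;

		off = demo_align(off, demo_requests[i].alignment);
		addr = base + off;
		/* copy the pointer value into the caller's typed pointer variable */
		memcpy(demo_requests[i].ptr, &addr, sizeof(addr));
		off += demo_requests[i].size;
	}
}

/* A made-up subsystem with two areas, written like the patch's callbacks. */
static int *demo_counters;
static char *demo_blocks;

static void
DemoSubsystemRequest(void)
{
	DemoRequestStruct(.name = "Demo Counters",
					  .size = 4 * sizeof(int),
					  .ptr = &demo_counters,
		);
	DemoRequestStruct(.name = "Demo Blocks",
					  .size = 128,
					  .alignment = 64,
					  .ptr = &demo_blocks,
		);
}
```

In this sketch a caller would run DemoSubsystemRequest() first, allocate DemoRequestedSize() bytes in one go, and then call DemoAssignPointers() on that block. Because alignment is a field of the request, the caller no longer needs to add padding by hand, which is the same simplification the patch makes when it drops the "+ PG_IO_ALIGN_SIZE" and TYPEALIGN arithmetic.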
[text/x-patch] v20260405_2-0014-Convert-all-remaining-subsystems-to-use-th.patch (110.5K, 15-v20260405_2-0014-Convert-all-remaining-subsystems-to-use-th.patch)
From 9c4e7d4d0db21f7a9a1db7dbf8fe255562398841 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 21 Mar 2026 19:05:26 +0200
Subject: [PATCH v20260405 14/15] Convert all remaining subsystems to use the
new API
---
src/backend/access/common/syncscan.c | 76 ++++----
src/backend/access/nbtree/nbtutils.c | 54 +++---
src/backend/access/transam/twophase.c | 75 ++++----
src/backend/access/transam/xlog.c | 82 +++++----
src/backend/access/transam/xlogprefetcher.c | 51 +++---
src/backend/access/transam/xlogrecovery.c | 35 ++--
src/backend/access/transam/xlogwait.c | 50 ++---
src/backend/postmaster/autovacuum.c | 79 ++++----
src/backend/postmaster/bgworker.c | 105 +++++------
src/backend/postmaster/checkpointer.c | 56 +++---
src/backend/postmaster/datachecksum_state.c | 41 ++---
src/backend/postmaster/pgarch.c | 43 +++--
src/backend/postmaster/walsummarizer.c | 60 +++---
src/backend/replication/logical/launcher.c | 56 +++---
src/backend/replication/logical/logicalctl.c | 29 ++-
src/backend/replication/logical/origin.c | 59 +++---
src/backend/replication/logical/slotsync.c | 41 +++--
src/backend/replication/slot.c | 64 +++----
src/backend/replication/walreceiverfuncs.c | 51 +++---
src/backend/replication/walsender.c | 59 +++---
src/backend/storage/ipc/ipci.c | 124 +------------
src/backend/storage/lmgr/lock.c | 113 +++++-------
src/backend/utils/activity/backend_status.c | 173 +++++++-----------
src/backend/utils/activity/pgstat_shmem.c | 158 ++++++++--------
src/backend/utils/activity/wait_event.c | 83 ++++-----
src/backend/utils/misc/injection_point.c | 57 +++---
src/include/access/nbtree.h | 2 -
src/include/access/syncscan.h | 2 -
src/include/access/twophase.h | 3 -
src/include/access/xlog.h | 2 -
src/include/access/xlogprefetcher.h | 3 -
src/include/access/xlogrecovery.h | 3 -
src/include/access/xlogwait.h | 2 -
src/include/pgstat.h | 4 -
src/include/postmaster/autovacuum.h | 4 -
src/include/postmaster/bgworker_internals.h | 2 -
src/include/postmaster/bgwriter.h | 3 -
src/include/postmaster/datachecksum_state.h | 4 -
src/include/postmaster/pgarch.h | 2 -
src/include/postmaster/walsummarizer.h | 2 -
src/include/replication/logicalctl.h | 2 -
src/include/replication/logicallauncher.h | 3 -
src/include/replication/origin.h | 4 -
src/include/replication/slot.h | 4 -
src/include/replication/slotsync.h | 2 -
src/include/replication/walreceiver.h | 2 -
src/include/replication/walsender.h | 2 -
src/include/storage/lock.h | 2 -
src/include/storage/subsystemlist.h | 27 +++
src/include/utils/backend_status.h | 8 -
src/include/utils/injection_point.h | 3 -
src/include/utils/wait_event.h | 2 -
.../injection_points/injection_points.c | 59 ++----
src/test/modules/test_aio/test_aio.c | 107 +++++------
54 files changed, 933 insertions(+), 1206 deletions(-)
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index 6fcfcb0e560..0f9eb167bed 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -50,6 +50,7 @@
#include "miscadmin.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/rel.h"
@@ -111,6 +112,14 @@ typedef struct ss_scan_locations_t
#define SizeOfScanLocations(N) \
(offsetof(ss_scan_locations_t, items) + (N) * sizeof(ss_lru_item_t))
+static void SyncScanShmemRequest(void *arg);
+static void SyncScanShmemInit(void *arg);
+
+const ShmemCallbacks SyncScanShmemCallbacks = {
+ .request_fn = SyncScanShmemRequest,
+ .init_fn = SyncScanShmemInit,
+};
+
/* Pointer to struct in shared memory */
static ss_scan_locations_t *scan_locations;
@@ -120,58 +129,47 @@ static BlockNumber ss_search(RelFileLocator relfilelocator,
/*
- * SyncScanShmemSize --- report amount of shared memory space needed
+ * SyncScanShmemRequest --- register this module's shared memory
*/
-Size
-SyncScanShmemSize(void)
+static void
+SyncScanShmemRequest(void *arg)
{
- return SizeOfScanLocations(SYNC_SCAN_NELEM);
+ ShmemRequestStruct(.name = "Sync Scan Locations List",
+ .size = SizeOfScanLocations(SYNC_SCAN_NELEM),
+ .ptr = (void **) &scan_locations,
+ );
}
/*
* SyncScanShmemInit --- initialize this module's shared memory
*/
-void
-SyncScanShmemInit(void)
+static void
+SyncScanShmemInit(void *arg)
{
int i;
- bool found;
- scan_locations = (ss_scan_locations_t *)
- ShmemInitStruct("Sync Scan Locations List",
- SizeOfScanLocations(SYNC_SCAN_NELEM),
- &found);
+ scan_locations->head = &scan_locations->items[0];
+ scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
- if (!IsUnderPostmaster)
+ for (i = 0; i < SYNC_SCAN_NELEM; i++)
{
- /* Initialize shared memory area */
- Assert(!found);
-
- scan_locations->head = &scan_locations->items[0];
- scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
-
- for (i = 0; i < SYNC_SCAN_NELEM; i++)
- {
- ss_lru_item_t *item = &scan_locations->items[i];
-
- /*
- * Initialize all slots with invalid values. As scans are started,
- * these invalid entries will fall off the LRU list and get
- * replaced with real entries.
- */
- item->location.relfilelocator.spcOid = InvalidOid;
- item->location.relfilelocator.dbOid = InvalidOid;
- item->location.relfilelocator.relNumber = InvalidRelFileNumber;
- item->location.location = InvalidBlockNumber;
-
- item->prev = (i > 0) ?
- (&scan_locations->items[i - 1]) : NULL;
- item->next = (i < SYNC_SCAN_NELEM - 1) ?
- (&scan_locations->items[i + 1]) : NULL;
- }
+ ss_lru_item_t *item = &scan_locations->items[i];
+
+ /*
+ * Initialize all slots with invalid values. As scans are started,
+ * these invalid entries will fall off the LRU list and get replaced
+ * with real entries.
+ */
+ item->location.relfilelocator.spcOid = InvalidOid;
+ item->location.relfilelocator.dbOid = InvalidOid;
+ item->location.relfilelocator.relNumber = InvalidRelFileNumber;
+ item->location.location = InvalidBlockNumber;
+
+ item->prev = (i > 0) ?
+ (&scan_locations->items[i - 1]) : NULL;
+ item->next = (i < SYNC_SCAN_NELEM - 1) ?
+ (&scan_locations->items[i + 1]) : NULL;
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c
index 732bc750c9e..014faa1622f 100644
--- a/src/backend/access/nbtree/nbtutils.c
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -25,6 +25,7 @@
#include "lib/qunique.h"
#include "miscadmin.h"
#include "storage/lwlock.h"
+#include "storage/subsystems.h"
#include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -417,6 +418,13 @@ typedef struct BTVacInfo
static BTVacInfo *btvacinfo;
+static void BTreeShmemRequest(void *arg);
+static void BTreeShmemInit(void *arg);
+
+const ShmemCallbacks BTreeShmemCallbacks = {
+ .request_fn = BTreeShmemRequest,
+ .init_fn = BTreeShmemInit,
+};
/*
* _bt_vacuum_cycleid --- get the active vacuum cycle ID for an index,
@@ -553,47 +561,37 @@ _bt_end_vacuum_callback(int code, Datum arg)
}
/*
- * BTreeShmemSize --- report amount of shared memory space needed
+ * BTreeShmemRequest --- register this module's shared memory
*/
-Size
-BTreeShmemSize(void)
+static void
+BTreeShmemRequest(void *arg)
{
Size size;
size = offsetof(BTVacInfo, vacuums);
size = add_size(size, mul_size(MaxBackends, sizeof(BTOneVacInfo)));
- return size;
+
+ ShmemRequestStruct(.name = "BTree Vacuum State",
+ .size = size,
+ .ptr = (void **) &btvacinfo,
+ );
}
/*
* BTreeShmemInit --- initialize this module's shared memory
*/
-void
-BTreeShmemInit(void)
+static void
+BTreeShmemInit(void *arg)
{
- bool found;
-
- btvacinfo = (BTVacInfo *) ShmemInitStruct("BTree Vacuum State",
- BTreeShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- /* Initialize shared memory area */
- Assert(!found);
-
- /*
- * It doesn't really matter what the cycle counter starts at, but
- * having it always start the same doesn't seem good. Seed with
- * low-order bits of time() instead.
- */
- btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
+ /*
+ * It doesn't really matter what the cycle counter starts at, but having
+ * it always start the same doesn't seem good. Seed with low-order bits
+ * of time() instead.
+ */
+ btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
- btvacinfo->num_vacuums = 0;
- btvacinfo->max_vacuums = MaxBackends;
- }
- else
- Assert(found);
+ btvacinfo->num_vacuums = 0;
+ btvacinfo->max_vacuums = MaxBackends;
}
bytea *
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index ab1cbd67bac..836928180a9 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -102,6 +102,7 @@
#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
#include "utils/memutils.h"
@@ -187,8 +188,16 @@ typedef struct TwoPhaseStateData
GlobalTransaction prepXacts[FLEXIBLE_ARRAY_MEMBER];
} TwoPhaseStateData;
+static void TwoPhaseShmemRequest(void *arg);
+static void TwoPhaseShmemInit(void *arg);
+
static TwoPhaseStateData *TwoPhaseState;
+const ShmemCallbacks TwoPhaseShmemCallbacks = {
+ .request_fn = TwoPhaseShmemRequest,
+ .init_fn = TwoPhaseShmemInit,
+};
+
/*
* Global transaction entry currently locked by us, if any. Note that any
* access to the entry pointed to by this variable must be protected by
@@ -234,10 +243,10 @@ static void RemoveTwoPhaseFile(FullTransactionId fxid, bool giveWarning);
static void RecreateTwoPhaseFile(FullTransactionId fxid, void *content, int len);
/*
- * Initialization of shared memory
+ * Register shared memory for two-phase state.
*/
-Size
-TwoPhaseShmemSize(void)
+static void
+TwoPhaseShmemRequest(void *arg)
{
Size size;
@@ -248,46 +257,40 @@ TwoPhaseShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_prepared_xacts,
sizeof(GlobalTransactionData)));
-
- return size;
+ ShmemRequestStruct(.name = "Prepared Transaction Table",
+ .size = size,
+ .ptr = (void **) &TwoPhaseState,
+ );
}
-void
-TwoPhaseShmemInit(void)
+/*
+ * Initialize shared memory for two-phase state.
+ */
+static void
+TwoPhaseShmemInit(void *arg)
{
- bool found;
-
- TwoPhaseState = ShmemInitStruct("Prepared Transaction Table",
- TwoPhaseShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- GlobalTransaction gxacts;
- int i;
+ GlobalTransaction gxacts;
+ int i;
- Assert(!found);
- TwoPhaseState->freeGXacts = NULL;
- TwoPhaseState->numPrepXacts = 0;
+ TwoPhaseState->freeGXacts = NULL;
+ TwoPhaseState->numPrepXacts = 0;
- /*
- * Initialize the linked list of free GlobalTransactionData structs
- */
- gxacts = (GlobalTransaction)
- ((char *) TwoPhaseState +
- MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
- sizeof(GlobalTransaction) * max_prepared_xacts));
- for (i = 0; i < max_prepared_xacts; i++)
- {
- /* insert into linked list */
- gxacts[i].next = TwoPhaseState->freeGXacts;
- TwoPhaseState->freeGXacts = &gxacts[i];
+ /*
+ * Initialize the linked list of free GlobalTransactionData structs
+ */
+ gxacts = (GlobalTransaction)
+ ((char *) TwoPhaseState +
+ MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
+ sizeof(GlobalTransaction) * max_prepared_xacts));
+ for (i = 0; i < max_prepared_xacts; i++)
+ {
+ /* insert into linked list */
+ gxacts[i].next = TwoPhaseState->freeGXacts;
+ TwoPhaseState->freeGXacts = &gxacts[i];
- /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
- gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
- }
+ /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
+ gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9e8999bbb61..bbc565509b0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -96,6 +96,7 @@
#include "storage/procsignal.h"
#include "storage/reinit.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/guc_tables.h"
@@ -571,6 +572,16 @@ typedef enum
WALINSERT_SPECIAL_CHECKPOINT
} WalInsertClass;
+static void XLOGShmemRequest(void *arg);
+static void XLOGShmemInit(void *arg);
+static void XLOGShmemAttach(void *arg);
+
+const ShmemCallbacks XLOGShmemCallbacks = {
+ .request_fn = XLOGShmemRequest,
+ .init_fn = XLOGShmemInit,
+ .attach_fn = XLOGShmemAttach,
+};
+
static XLogCtlData *XLogCtl = NULL;
/* a private copy of XLogCtl->Insert.WALInsertLocks, for convenience */
@@ -579,6 +590,7 @@ static WALInsertLockPadded *WALInsertLocks = NULL;
/*
* We maintain an image of pg_control in shared memory.
*/
+static ControlFileData *LocalControlFile = NULL;
static ControlFileData *ControlFile = NULL;
/*
@@ -5257,7 +5269,8 @@ void
LocalProcessControlFile(bool reset)
{
Assert(reset || ControlFile == NULL);
- ControlFile = palloc_object(ControlFileData);
+ LocalControlFile = palloc_object(ControlFileData);
+ ControlFile = LocalControlFile;
ReadControlFile();
SetLocalDataChecksumState(ControlFile->data_checksum_version);
}
@@ -5274,10 +5287,10 @@ GetActiveWalLevelOnStandby(void)
}
/*
- * Initialization of shared memory for XLOG
+ * Register shared memory for XLOG.
*/
-Size
-XLOGShmemSize(void)
+static void
+XLOGShmemRequest(void *arg)
{
Size size;
@@ -5317,23 +5330,24 @@ XLOGShmemSize(void)
/* and the buffers themselves */
size = add_size(size, mul_size(XLOG_BLCKSZ, XLOGbuffers));
- /*
- * Note: we don't count ControlFileData, it comes out of the "slop factor"
- * added by CreateSharedMemoryAndSemaphores. This lets us use this
- * routine again below to compute the actual allocation size.
- */
-
- return size;
+ ShmemRequestStruct(.name = "XLOG Ctl",
+ .size = size,
+ .ptr = (void **) &XLogCtl,
+ );
+ ShmemRequestStruct(.name = "Control File",
+ .size = sizeof(ControlFileData),
+ .ptr = (void **) &ControlFile,
+ );
}
-void
-XLOGShmemInit(void)
+/*
+ * XLOGShmemInit - initialize the XLogCtl shared memory area.
+ */
+static void
+XLOGShmemInit(void *arg)
{
- bool foundCFile,
- foundXLog;
char *allocptr;
int i;
- ControlFileData *localControlFile;
#ifdef WAL_DEBUG
@@ -5351,36 +5365,17 @@ XLOGShmemInit(void)
}
#endif
-
- XLogCtl = (XLogCtlData *)
- ShmemInitStruct("XLOG Ctl", XLOGShmemSize(), &foundXLog);
-
- localControlFile = ControlFile;
- ControlFile = (ControlFileData *)
- ShmemInitStruct("Control File", sizeof(ControlFileData), &foundCFile);
-
- if (foundCFile || foundXLog)
- {
- /* both should be present or neither */
- Assert(foundCFile && foundXLog);
-
- /* Initialize local copy of WALInsertLocks */
- WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
-
- if (localControlFile)
- pfree(localControlFile);
- return;
- }
memset(XLogCtl, 0, sizeof(XLogCtlData));
/*
* Already have read control file locally, unless in bootstrap mode. Move
* contents into shared memory.
*/
- if (localControlFile)
+ if (LocalControlFile)
{
- memcpy(ControlFile, localControlFile, sizeof(ControlFileData));
- pfree(localControlFile);
+ memcpy(ControlFile, LocalControlFile, sizeof(ControlFileData));
+ pfree(LocalControlFile);
+ LocalControlFile = NULL;
}
/*
@@ -5442,6 +5437,15 @@ XLOGShmemInit(void)
pg_atomic_init_u64(&XLogCtl->unloggedLSN, InvalidXLogRecPtr);
}
+/*
+ * XLOGShmemAttach - set up WALInsertLocks pointer after attaching.
+ */
+static void
+XLOGShmemAttach(void *arg)
+{
+ WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
+}
+
/*
* This func must be called ONCE on system install. It creates pg_control
* and the initial XLOG segment.
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index c235eca7c51..83a3f97a57c 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -39,6 +39,7 @@
#include "storage/fd.h"
#include "storage/shmem.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/hsearch.h"
@@ -200,6 +201,14 @@ static LsnReadQueueNextStatus XLogPrefetcherNextBlock(uintptr_t pgsr_private,
static XLogPrefetchStats *SharedStats;
+static void XLogPrefetchShmemRequest(void *arg);
+static void XLogPrefetchShmemInit(void *arg);
+
+const ShmemCallbacks XLogPrefetchShmemCallbacks = {
+ .request_fn = XLogPrefetchShmemRequest,
+ .init_fn = XLogPrefetchShmemInit,
+};
+
static inline LsnReadQueue *
lrq_alloc(uint32 max_distance,
uint32 max_inflight,
@@ -292,10 +301,25 @@ lrq_complete_lsn(LsnReadQueue *lrq, XLogRecPtr lsn)
lrq_prefetch(lrq);
}
-size_t
-XLogPrefetchShmemSize(void)
+static void
+XLogPrefetchShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "XLogPrefetchStats",
+ .size = sizeof(XLogPrefetchStats),
+ .ptr = (void **) &SharedStats,
+ );
+}
+
+static void
+XLogPrefetchShmemInit(void *arg)
{
- return sizeof(XLogPrefetchStats);
+ pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
+ pg_atomic_init_u64(&SharedStats->prefetch, 0);
+ pg_atomic_init_u64(&SharedStats->hit, 0);
+ pg_atomic_init_u64(&SharedStats->skip_init, 0);
+ pg_atomic_init_u64(&SharedStats->skip_new, 0);
+ pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
+ pg_atomic_init_u64(&SharedStats->skip_rep, 0);
}
/*
@@ -313,27 +337,6 @@ XLogPrefetchResetStats(void)
pg_atomic_write_u64(&SharedStats->skip_rep, 0);
}
-void
-XLogPrefetchShmemInit(void)
-{
- bool found;
-
- SharedStats = (XLogPrefetchStats *)
- ShmemInitStruct("XLogPrefetchStats",
- sizeof(XLogPrefetchStats),
- &found);
-
- if (!found)
- {
- pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
- pg_atomic_init_u64(&SharedStats->prefetch, 0);
- pg_atomic_init_u64(&SharedStats->hit, 0);
- pg_atomic_init_u64(&SharedStats->skip_init, 0);
- pg_atomic_init_u64(&SharedStats->skip_new, 0);
- pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
- pg_atomic_init_u64(&SharedStats->skip_rep, 0);
- }
-}
/*
* Called when any GUC is changed that affects prefetching.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index fd1c36d061d..c236e2b7969 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -58,6 +58,7 @@
#include "storage/pmsignal.h"
#include "storage/procarray.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/datetime.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
@@ -307,6 +308,14 @@ static char *primary_image_masked = NULL;
XLogRecoveryCtlData *XLogRecoveryCtl = NULL;
+static void XLogRecoveryShmemRequest(void *arg);
+static void XLogRecoveryShmemInit(void *arg);
+
+const ShmemCallbacks XLogRecoveryShmemCallbacks = {
+ .request_fn = XLogRecoveryShmemRequest,
+ .init_fn = XLogRecoveryShmemInit,
+};
+
/*
* abortedRecPtr is the start pointer of a broken record at end of WAL when
* recovery completes; missingContrecPtr is the location of the first
@@ -385,28 +394,20 @@ static void SetCurrentChunkStartTime(TimestampTz xtime);
static void SetLatestXTime(TimestampTz xtime);
/*
- * Initialization of shared memory for WAL recovery
+ * Register shared memory for WAL recovery
*/
-Size
-XLogRecoveryShmemSize(void)
+static void
+XLogRecoveryShmemRequest(void *arg)
{
- Size size;
-
- /* XLogRecoveryCtl */
- size = sizeof(XLogRecoveryCtlData);
-
- return size;
+ ShmemRequestStruct(.name = "XLOG Recovery Ctl",
+ .size = sizeof(XLogRecoveryCtlData),
+ .ptr = (void **) &XLogRecoveryCtl,
+ );
}
-void
-XLogRecoveryShmemInit(void)
+static void
+XLogRecoveryShmemInit(void *arg)
{
- bool found;
-
- XLogRecoveryCtl = (XLogRecoveryCtlData *)
- ShmemInitStruct("XLOG Recovery Ctl", XLogRecoveryShmemSize(), &found);
- if (found)
- return;
memset(XLogRecoveryCtl, 0, sizeof(XLogRecoveryCtlData));
SpinLockInit(&XLogRecoveryCtl->info_lck);
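Note on the pattern repeated throughout this patch: each subsystem's old Size/Init pair is replaced by a request callback that only declares a name, a size, and the address of the pointer to fill in, plus an optional init callback that runs once the memory exists. The following is a minimal, self-contained sketch of such a two-phase registry. Every name in it (DemoCallbacks, demo_request_struct, demo_startup, CounterState) is a hypothetical stand-in and not the ShmemCallbacks/ShmemRequestStruct implementation from the patch; that the allocator hands out zeroed memory is likewise an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the API added by the patch. */
typedef struct DemoCallbacks
{
    void (*request_fn) (void *arg);   /* declares what the subsystem needs */
    void (*init_fn) (void *arg);      /* optional: fills in the memory */
} DemoCallbacks;

typedef struct DemoRequest
{
    const char *name;
    size_t      size;
    void      **ptr;                  /* where the final address is stored */
} DemoRequest;

#define DEMO_MAX_REQUESTS 8
#define DEMO_ALIGN(sz) (((sz) + 15) & ~(size_t) 15)

static DemoRequest demo_requests[DEMO_MAX_REQUESTS];
static int  demo_nrequests = 0;

/* Phase 1 helper: only records the request, allocates nothing. */
static void
demo_request_struct(const char *name, size_t size, void **ptr)
{
    assert(demo_nrequests < DEMO_MAX_REQUESTS);
    demo_requests[demo_nrequests].name = name;
    demo_requests[demo_nrequests].size = size;
    demo_requests[demo_nrequests].ptr = ptr;
    demo_nrequests++;
}

/* An example subsystem written against the sketch. */
typedef struct CounterState
{
    int         generation;
    long        counters[4];
} CounterState;

static CounterState *Counter = NULL;

static void
counter_request(void *arg)
{
    (void) arg;
    /* The cast mirrors the ".ptr = (void **) &..." idiom in the patch. */
    demo_request_struct("Counter", sizeof(CounterState), (void **) &Counter);
}

static void
counter_init(void *arg)
{
    (void) arg;
    /* Relies on the segment being zeroed; only non-zero fields are set. */
    Counter->generation = 1;
}

static const DemoCallbacks counter_callbacks = {
    .request_fn = counter_request,
    .init_fn = counter_init,
};

/*
 * Driver: collect all requests, allocate one zeroed segment, hand out
 * addresses, then run the init callbacks.  Returns the segment size.
 * The segment is intentionally never freed in this sketch.
 */
static size_t
demo_startup(const DemoCallbacks *const *subsystems, int nsubsystems)
{
    size_t      total = 0;
    size_t      offset = 0;
    char       *segment;

    for (int i = 0; i < nsubsystems; i++)
        subsystems[i]->request_fn(NULL);

    for (int i = 0; i < demo_nrequests; i++)
        total += DEMO_ALIGN(demo_requests[i].size);

    segment = calloc(1, total > 0 ? total : 1);
    assert(segment != NULL);

    for (int i = 0; i < demo_nrequests; i++)
    {
        *demo_requests[i].ptr = segment + offset;
        offset += DEMO_ALIGN(demo_requests[i].size);
    }

    for (int i = 0; i < nsubsystems; i++)
        if (subsystems[i]->init_fn != NULL)
            subsystems[i]->init_fn(NULL);

    return total;
}
```

Calling demo_startup() with an array containing &counter_callbacks leaves Counter pointing into the segment with generation set to 1. Separating the request from the init step is what lets the driver know the total size before anything is allocated, which the old ShmemSize/ShmemInit pairs achieved only by calling each size function twice.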
diff --git a/src/backend/access/transam/xlogwait.c b/src/backend/access/transam/xlogwait.c
index bf4630677b4..2e31c0d67d7 100644
--- a/src/backend/access/transam/xlogwait.c
+++ b/src/backend/access/transam/xlogwait.c
@@ -57,6 +57,7 @@
#include "storage/latch.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/pg_lsn.h"
#include "utils/snapmgr.h"
@@ -68,6 +69,14 @@ static int waitlsn_cmp(const pairingheap_node *a, const pairingheap_node *b,
struct WaitLSNState *waitLSNState = NULL;
+static void WaitLSNShmemRequest(void *arg);
+static void WaitLSNShmemInit(void *arg);
+
+const ShmemCallbacks WaitLSNShmemCallbacks = {
+ .request_fn = WaitLSNShmemRequest,
+ .init_fn = WaitLSNShmemInit,
+};
+
/*
* Wait event for each WaitLSNType, used with WaitLatch() to report
* the wait in pg_stat_activity.
@@ -109,41 +118,34 @@ GetCurrentLSNForWaitType(WaitLSNType lsnType)
pg_unreachable();
}
-/* Report the amount of shared memory space needed for WaitLSNState. */
-Size
-WaitLSNShmemSize(void)
+/* Register the shared memory space needed for WaitLSNState. */
+static void
+WaitLSNShmemRequest(void *arg)
{
Size size;
size = offsetof(WaitLSNState, procInfos);
size = add_size(size, mul_size(MaxBackends + NUM_AUXILIARY_PROCS, sizeof(WaitLSNProcInfo)));
- return size;
+ ShmemRequestStruct(.name = "WaitLSNState",
+ .size = size,
+ .ptr = (void **) &waitLSNState,
+ );
}
/* Initialize the WaitLSNState in the shared memory. */
-void
-WaitLSNShmemInit(void)
+static void
+WaitLSNShmemInit(void *arg)
{
- bool found;
-
- waitLSNState = (WaitLSNState *) ShmemInitStruct("WaitLSNState",
- WaitLSNShmemSize(),
- &found);
- if (!found)
+ /* Initialize heaps and tracking */
+ for (int i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
{
- int i;
-
- /* Initialize heaps and tracking */
- for (i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
- {
- pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
- pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
- }
-
- /* Initialize process info array */
- memset(&waitLSNState->procInfos, 0,
- (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
+ pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
+ pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
}
+
+ /* Initialize process info array */
+ memset(&waitLSNState->procInfos, 0,
+ (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
}
/*
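Note on the size computation kept in WaitLSNShmemRequest(): the struct ends in a variable-length array, so the size is the offset of that array plus the element count times the element size, with both steps going through add_size()/mul_size() so that an overflow raises an error instead of wrapping. A minimal sketch of the same arithmetic follows. The demo_* helpers and struct layouts are hypothetical; unlike PostgreSQL's helpers they return false on overflow rather than reporting an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical overflow-checked helpers (return false instead of erroring). */
static bool
demo_add_size(size_t a, size_t b, size_t *result)
{
    if (a > SIZE_MAX - b)
        return false;
    *result = a + b;
    return true;
}

static bool
demo_mul_size(size_t a, size_t b, size_t *result)
{
    if (a != 0 && b > SIZE_MAX / a)
        return false;
    *result = a * b;
    return true;
}

/* Illustrative layout: fixed header followed by a flexible array member. */
typedef struct DemoProcInfo
{
    uint64_t    wait_lsn;
    int         procno;
} DemoProcInfo;

typedef struct DemoState
{
    uint64_t    min_waited;
    DemoProcInfo procInfos[];
} DemoState;

/* Bytes needed for a DemoState with nprocs entries; false on overflow. */
static bool
demo_state_size(size_t nprocs, size_t *result)
{
    size_t      array_bytes;

    if (!demo_mul_size(nprocs, sizeof(DemoProcInfo), &array_bytes))
        return false;
    return demo_add_size(offsetof(DemoState, procInfos), array_bytes, result);
}
```

Using offsetof() on the array member rather than sizeof() on the struct avoids counting any trailing padding twice, which is the same choice the patch keeps for WaitLSNState.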
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 8400e6722cc..250c43b85e5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -98,6 +98,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/fmgroids.h"
#include "utils/fmgrprotos.h"
@@ -309,6 +310,14 @@ typedef struct
static AutoVacuumShmemStruct *AutoVacuumShmem;
+static void AutoVacuumShmemRequest(void *arg);
+static void AutoVacuumShmemInit(void *arg);
+
+const ShmemCallbacks AutoVacuumShmemCallbacks = {
+ .request_fn = AutoVacuumShmemRequest,
+ .init_fn = AutoVacuumShmemInit,
+};
+
/*
* the database list (of avl_dbase elements) in the launcher, and the context
* that contains it
@@ -3545,11 +3554,11 @@ autovac_init(void)
}
/*
- * AutoVacuumShmemSize
- * Compute space needed for autovacuum-related shared memory
+ * AutoVacuumShmemRequest
+ * Register shared memory space needed for autovacuum
*/
-Size
-AutoVacuumShmemSize(void)
+static void
+AutoVacuumShmemRequest(void *arg)
{
Size size;
@@ -3560,53 +3569,41 @@ AutoVacuumShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
- return size;
+
+ ShmemRequestStruct(.name = "AutoVacuum Data",
+ .size = size,
+ .ptr = (void **) &AutoVacuumShmem,
+ );
}
/*
* AutoVacuumShmemInit
- * Allocate and initialize autovacuum-related shared memory
+ * Initialize autovacuum-related shared memory
*/
-void
-AutoVacuumShmemInit(void)
+static void
+AutoVacuumShmemInit(void *arg)
{
- bool found;
-
- AutoVacuumShmem = (AutoVacuumShmemStruct *)
- ShmemInitStruct("AutoVacuum Data",
- AutoVacuumShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- WorkerInfo worker;
- int i;
+ WorkerInfo worker;
- Assert(!found);
-
- AutoVacuumShmem->av_launcherpid = 0;
- dclist_init(&AutoVacuumShmem->av_freeWorkers);
- dlist_init(&AutoVacuumShmem->av_runningWorkers);
- AutoVacuumShmem->av_startingWorker = NULL;
- memset(AutoVacuumShmem->av_workItems, 0,
- sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
-
- worker = (WorkerInfo) ((char *) AutoVacuumShmem +
- MAXALIGN(sizeof(AutoVacuumShmemStruct)));
-
- /* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_worker_slots; i++)
- {
- dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
- pg_atomic_init_flag(&worker[i].wi_dobalance);
- }
+ AutoVacuumShmem->av_launcherpid = 0;
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
+ dlist_init(&AutoVacuumShmem->av_runningWorkers);
+ AutoVacuumShmem->av_startingWorker = NULL;
+ memset(AutoVacuumShmem->av_workItems, 0,
+ sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
- pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+ worker = (WorkerInfo) ((char *) AutoVacuumShmem +
+ MAXALIGN(sizeof(AutoVacuumShmemStruct)));
+ /* initialize the WorkerInfo free list */
+ for (int i = 0; i < autovacuum_worker_slots; i++)
+ {
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
+ pg_atomic_init_flag(&worker[i].wi_dobalance);
}
- else
- Assert(found);
+
+ pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
}
/*
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 536aff7ca05..0992b9b6353 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -30,6 +30,7 @@
#include "storage/procarray.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/ascii.h"
#include "utils/memutils.h"
@@ -110,6 +111,14 @@ struct BackgroundWorkerHandle
static BackgroundWorkerArray *BackgroundWorkerData;
+static void BackgroundWorkerShmemRequest(void *arg);
+static void BackgroundWorkerShmemInit(void *arg);
+
+const ShmemCallbacks BackgroundWorkerShmemCallbacks = {
+ .request_fn = BackgroundWorkerShmemRequest,
+ .init_fn = BackgroundWorkerShmemInit,
+};
+
/*
* List of internal background worker entry points. We need this for
* reasons explained in LookupBackgroundWorkerFunction(), below.
@@ -160,10 +169,10 @@ static bgworker_main_type LookupBackgroundWorkerFunction(const char *libraryname
/*
- * Calculate shared memory needed.
+ * Register shared memory needed for background workers.
*/
-Size
-BackgroundWorkerShmemSize(void)
+static void
+BackgroundWorkerShmemRequest(void *arg)
{
Size size;
@@ -171,66 +180,58 @@ BackgroundWorkerShmemSize(void)
size = offsetof(BackgroundWorkerArray, slot);
size = add_size(size, mul_size(max_worker_processes,
sizeof(BackgroundWorkerSlot)));
-
- return size;
+ ShmemRequestStruct(.name = "Background Worker Data",
+ .size = size,
+ .ptr = (void **) &BackgroundWorkerData,
+ );
}
/*
- * Initialize shared memory.
+ * Initialize shared memory for background workers.
*/
-void
-BackgroundWorkerShmemInit(void)
+static void
+BackgroundWorkerShmemInit(void *arg)
{
- bool found;
-
- BackgroundWorkerData = ShmemInitStruct("Background Worker Data",
- BackgroundWorkerShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- dlist_iter iter;
- int slotno = 0;
+ dlist_iter iter;
+ int slotno = 0;
- BackgroundWorkerData->total_slots = max_worker_processes;
- BackgroundWorkerData->parallel_register_count = 0;
- BackgroundWorkerData->parallel_terminate_count = 0;
+ BackgroundWorkerData->total_slots = max_worker_processes;
+ BackgroundWorkerData->parallel_register_count = 0;
+ BackgroundWorkerData->parallel_terminate_count = 0;
- /*
- * Copy contents of worker list into shared memory. Record the shared
- * memory slot assigned to each worker. This ensures a 1-to-1
- * correspondence between the postmaster's private list and the array
- * in shared memory.
- */
- dlist_foreach(iter, &BackgroundWorkerList)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- RegisteredBgWorker *rw;
+ /*
+ * Copy contents of worker list into shared memory. Record the shared
+ * memory slot assigned to each worker. This ensures a 1-to-1
+ * correspondence between the postmaster's private list and the array in
+ * shared memory.
+ */
+ dlist_foreach(iter, &BackgroundWorkerList)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ RegisteredBgWorker *rw;
- rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
- Assert(slotno < max_worker_processes);
- slot->in_use = true;
- slot->terminate = false;
- slot->pid = InvalidPid;
- slot->generation = 0;
- rw->rw_shmem_slot = slotno;
- rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
- memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
- ++slotno;
- }
+ rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
+ Assert(slotno < max_worker_processes);
+ slot->in_use = true;
+ slot->terminate = false;
+ slot->pid = InvalidPid;
+ slot->generation = 0;
+ rw->rw_shmem_slot = slotno;
+ rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
+ memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
+ ++slotno;
+ }
- /*
- * Mark any remaining slots as not in use.
- */
- while (slotno < max_worker_processes)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ /*
+ * Mark any remaining slots as not in use.
+ */
+ while (slotno < max_worker_processes)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- slot->in_use = false;
- ++slotno;
- }
+ slot->in_use = false;
+ ++slotno;
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 3c982c6ffac..6b424ee610f 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -63,6 +63,7 @@
#include "storage/shmem.h"
#include "storage/smgr.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/guc.h"
#include "utils/memutils.h"
@@ -143,6 +144,14 @@ typedef struct
static CheckpointerShmemStruct *CheckpointerShmem;
+static void CheckpointerShmemRequest(void *arg);
+static void CheckpointerShmemInit(void *arg);
+
+const ShmemCallbacks CheckpointerShmemCallbacks = {
+ .request_fn = CheckpointerShmemRequest,
+ .init_fn = CheckpointerShmemInit,
+};
+
/* interval for calling AbsorbSyncRequests in CheckpointWriteDelay */
#define WRITES_PER_ABSORB 1000
@@ -950,11 +959,11 @@ ReqShutdownXLOG(SIGNAL_ARGS)
*/
/*
- * CheckpointerShmemSize
- * Compute space needed for checkpointer-related shared memory
+ * CheckpointerShmemRequest
+ * Register shared memory space needed for checkpointer
*/
-Size
-CheckpointerShmemSize(void)
+static void
+CheckpointerShmemRequest(void *arg)
{
Size size;
@@ -967,39 +976,24 @@ CheckpointerShmemSize(void)
size = add_size(size, mul_size(Min(NBuffers,
MAX_CHECKPOINT_REQUESTS),
sizeof(CheckpointerRequest)));
-
- return size;
+ ShmemRequestStruct(.name = "Checkpointer Data",
+ .size = size,
+ .ptr = (void **) &CheckpointerShmem,
+ );
}
/*
* CheckpointerShmemInit
- * Allocate and initialize checkpointer-related shared memory
+ * Initialize checkpointer-related shared memory
*/
-void
-CheckpointerShmemInit(void)
+static void
+CheckpointerShmemInit(void *arg)
{
- Size size = CheckpointerShmemSize();
- bool found;
-
- CheckpointerShmem = (CheckpointerShmemStruct *)
- ShmemInitStruct("Checkpointer Data",
- size,
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. Note that we zero the whole
- * requests array; this is so that CompactCheckpointerRequestQueue can
- * assume that any pad bytes in the request structs are zeroes.
- */
- MemSet(CheckpointerShmem, 0, size);
- SpinLockInit(&CheckpointerShmem->ckpt_lck);
- CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
- CheckpointerShmem->head = CheckpointerShmem->tail = 0;
- ConditionVariableInit(&CheckpointerShmem->start_cv);
- ConditionVariableInit(&CheckpointerShmem->done_cv);
- }
+ SpinLockInit(&CheckpointerShmem->ckpt_lck);
+ CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
+ CheckpointerShmem->head = CheckpointerShmem->tail = 0;
+ ConditionVariableInit(&CheckpointerShmem->start_cv);
+ ConditionVariableInit(&CheckpointerShmem->done_cv);
}
/*
diff --git a/src/backend/postmaster/datachecksum_state.c b/src/backend/postmaster/datachecksum_state.c
index 76004bcedc6..eb7b01d0993 100644
--- a/src/backend/postmaster/datachecksum_state.c
+++ b/src/backend/postmaster/datachecksum_state.c
@@ -211,6 +211,7 @@
#include "storage/lwlock.h"
#include "storage/procarray.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -346,6 +347,7 @@ static volatile sig_atomic_t launcher_running = false;
static DataChecksumsWorkerOperation operation;
/* Prototypes */
+static void DataChecksumsShmemRequest(void *arg);
static bool DatabaseExists(Oid dboid);
static List *BuildDatabaseList(void);
static List *BuildRelationList(bool temp_relations, bool include_shared);
@@ -356,6 +358,10 @@ static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferA
static void launcher_cancel_handler(SIGNAL_ARGS);
static void WaitForAllTransactionsToFinish(void);
+const ShmemCallbacks DataChecksumsShmemCallbacks = {
+ .request_fn = DataChecksumsShmemRequest,
+};
+
/*****************************************************************************
* Functionality for manipulating the data checksum state in the cluster
*/
@@ -1236,35 +1242,16 @@ ProcessAllDatabases(void)
}
/*
- * DataChecksumStateSize
- * Compute required space for datachecksumsworker-related shared memory
- */
-Size
-DataChecksumsShmemSize(void)
-{
- Size size;
-
- size = sizeof(DataChecksumsStateStruct);
- size = MAXALIGN(size);
-
- return size;
-}
-
-/*
- * DataChecksumStateInit
- * Allocate and initialize datachecksumsworker-related shared memory
+ * DataChecksumsShmemRequest
+ * Request datachecksumsworker-related shared memory
*/
-void
-DataChecksumsShmemInit(void)
+static void
+DataChecksumsShmemRequest(void *arg)
{
- bool found;
-
- DataChecksumState = (DataChecksumsStateStruct *)
- ShmemInitStruct("DataChecksumsWorker Data",
- DataChecksumsShmemSize(),
- &found);
- if (!found)
- MemSet(DataChecksumState, 0, DataChecksumsShmemSize());
+ ShmemRequestStruct(.name = "DataChecksumsWorker Data",
+ .size = sizeof(DataChecksumsStateStruct),
+ .ptr = (void **) &DataChecksumState,
+ );
}
/*
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index fa4bdfe9ab9..0a1a1149d78 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -48,6 +48,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
@@ -154,33 +155,31 @@ static int ready_file_comparator(Datum a, Datum b, void *arg);
static void LoadArchiveLibrary(void);
static void pgarch_call_module_shutdown_cb(int code, Datum arg);
-/* Report shared memory space needed by PgArchShmemInit */
-Size
-PgArchShmemSize(void)
-{
- Size size = 0;
+static void PgArchShmemRequest(void *arg);
+static void PgArchShmemInit(void *arg);
- size = add_size(size, sizeof(PgArchData));
+const ShmemCallbacks PgArchShmemCallbacks = {
+ .request_fn = PgArchShmemRequest,
+ .init_fn = PgArchShmemInit,
+};
- return size;
+/* Register shared memory space needed by the archiver */
+static void
+PgArchShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "Archiver Data",
+ .size = sizeof(PgArchData),
+ .ptr = (void **) &PgArch,
+ );
}
-/* Allocate and initialize archiver-related shared memory */
-void
-PgArchShmemInit(void)
+/* Initialize archiver-related shared memory */
+static void
+PgArchShmemInit(void *arg)
{
- bool found;
-
- PgArch = (PgArchData *)
- ShmemInitStruct("Archiver Data", PgArchShmemSize(), &found);
-
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(PgArch, 0, PgArchShmemSize());
- PgArch->pgprocno = INVALID_PROC_NUMBER;
- pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
- }
+ MemSet(PgArch, 0, sizeof(PgArchData));
+ PgArch->pgprocno = INVALID_PROC_NUMBER;
+ pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
}
/*
diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c
index a37b3018abf..20960f5b633 100644
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -47,6 +47,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -109,6 +110,14 @@ typedef struct
/* Pointer to shared memory state. */
static WalSummarizerData *WalSummarizerCtl;
+static void WalSummarizerShmemRequest(void *arg);
+static void WalSummarizerShmemInit(void *arg);
+
+const ShmemCallbacks WalSummarizerShmemCallbacks = {
+ .request_fn = WalSummarizerShmemRequest,
+ .init_fn = WalSummarizerShmemInit,
+};
+
/*
* When we reach end of WAL and need to read more, we sleep for a number of
* milliseconds that is an integer multiple of MS_PER_SLEEP_QUANTUM. This is
@@ -168,43 +177,34 @@ static void summarizer_wait_for_wal(void);
static void MaybeRemoveOldWalSummaries(void);
/*
- * Amount of shared memory required for this module.
+ * Register shared memory space needed by this module.
*/
-Size
-WalSummarizerShmemSize(void)
+static void
+WalSummarizerShmemRequest(void *arg)
{
- return sizeof(WalSummarizerData);
+ ShmemRequestStruct(.name = "Wal Summarizer Ctl",
+ .size = sizeof(WalSummarizerData),
+ .ptr = (void **) &WalSummarizerCtl,
+ );
}
/*
- * Create or attach to shared memory segment for this module.
+ * Initialize shared memory for this module.
*/
-void
-WalSummarizerShmemInit(void)
+static void
+WalSummarizerShmemInit(void *arg)
{
- bool found;
-
- WalSummarizerCtl = (WalSummarizerData *)
- ShmemInitStruct("Wal Summarizer Ctl", WalSummarizerShmemSize(),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize.
- *
- * We're just filling in dummy values here -- the real initialization
- * will happen when GetOldestUnsummarizedLSN() is called for the first
- * time.
- */
- WalSummarizerCtl->initialized = false;
- WalSummarizerCtl->summarized_tli = 0;
- WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
- WalSummarizerCtl->lsn_is_exact = false;
- WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
- WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
- ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
- }
+ /*
+ * We're just filling in dummy values here -- the real initialization will
+ * happen when GetOldestUnsummarizedLSN() is called for the first time.
+ */
+ WalSummarizerCtl->initialized = false;
+ WalSummarizerCtl->summarized_tli = 0;
+ WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
+ WalSummarizerCtl->lsn_is_exact = false;
+ WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
+ WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
+ ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
}
/*
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 09964198550..9e75a3e04ee 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -38,6 +38,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -71,6 +72,14 @@ typedef struct LogicalRepCtxStruct
static LogicalRepCtxStruct *LogicalRepCtx;
+static void ApplyLauncherShmemRequest(void *arg);
+static void ApplyLauncherShmemInit(void *arg);
+
+const ShmemCallbacks ApplyLauncherShmemCallbacks = {
+ .request_fn = ApplyLauncherShmemRequest,
+ .init_fn = ApplyLauncherShmemInit,
+};
+
/* an entry in the last-start-times shared hash table */
typedef struct LauncherLastStartTimesEntry
{
@@ -972,11 +981,11 @@ logicalrep_pa_worker_count(Oid subid)
}
/*
- * ApplyLauncherShmemSize
- * Compute space needed for replication launcher shared memory
+ * ApplyLauncherShmemRequest
+ * Register shared memory space needed for replication launcher
*/
-Size
-ApplyLauncherShmemSize(void)
+static void
+ApplyLauncherShmemRequest(void *arg)
{
Size size;
@@ -987,7 +996,10 @@ ApplyLauncherShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_logical_replication_workers,
sizeof(LogicalRepWorker)));
- return size;
+ ShmemRequestStruct(.name = "Logical Replication Launcher Data",
+ .size = size,
+ .ptr = (void **) &LogicalRepCtx,
+ );
}
/*
@@ -1028,35 +1040,23 @@ ApplyLauncherRegister(void)
/*
* ApplyLauncherShmemInit
- * Allocate and initialize replication launcher shared memory
+ * Initialize replication launcher shared memory
*/
-void
-ApplyLauncherShmemInit(void)
+static void
+ApplyLauncherShmemInit(void *arg)
{
- bool found;
+ int slot;
- LogicalRepCtx = (LogicalRepCtxStruct *)
- ShmemInitStruct("Logical Replication Launcher Data",
- ApplyLauncherShmemSize(),
- &found);
+ LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
+ LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
- if (!found)
+ /* Initialize memory and spin locks for each worker slot. */
+ for (slot = 0; slot < max_logical_replication_workers; slot++)
{
- int slot;
-
- memset(LogicalRepCtx, 0, ApplyLauncherShmemSize());
-
- LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
- LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
+ LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
- /* Initialize memory and spin locks for each worker slot. */
- for (slot = 0; slot < max_logical_replication_workers; slot++)
- {
- LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
-
- memset(worker, 0, sizeof(LogicalRepWorker));
- SpinLockInit(&worker->relmutex);
- }
+ memset(worker, 0, sizeof(LogicalRepWorker));
+ SpinLockInit(&worker->relmutex);
}
}
diff --git a/src/backend/replication/logical/logicalctl.c b/src/backend/replication/logical/logicalctl.c
index 4e292951201..72f68ec58ef 100644
--- a/src/backend/replication/logical/logicalctl.c
+++ b/src/backend/replication/logical/logicalctl.c
@@ -72,6 +72,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
/*
@@ -98,6 +99,12 @@ typedef struct LogicalDecodingCtlData
static LogicalDecodingCtlData *LogicalDecodingCtl = NULL;
+static void LogicalDecodingCtlShmemRequest(void *arg);
+
+const ShmemCallbacks LogicalDecodingCtlShmemCallbacks = {
+ .request_fn = LogicalDecodingCtlShmemRequest,
+};
+
/*
* A process-local cache of LogicalDecodingCtl->xlog_logical_info. This is
* initialized at process startup, and updated when processing the process
@@ -120,23 +127,13 @@ static void update_xlog_logical_info(void);
static void abort_logical_decoding_activation(int code, Datum arg);
static void write_logical_decoding_status_update_record(bool status);
-Size
-LogicalDecodingCtlShmemSize(void)
-{
- return sizeof(LogicalDecodingCtlData);
-}
-
-void
-LogicalDecodingCtlShmemInit(void)
+static void
+LogicalDecodingCtlShmemRequest(void *arg)
{
- bool found;
-
- LogicalDecodingCtl = ShmemInitStruct("Logical decoding control",
- LogicalDecodingCtlShmemSize(),
- &found);
-
- if (!found)
- MemSet(LogicalDecodingCtl, 0, LogicalDecodingCtlShmemSize());
+ ShmemRequestStruct(.name = "Logical decoding control",
+ .size = sizeof(LogicalDecodingCtlData),
+ .ptr = (void **) &LogicalDecodingCtl,
+ );
}
/*
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 661d68ad653..372d77c475e 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -88,6 +88,7 @@
#include "storage/fd.h"
#include "storage/ipc.h"
#include "storage/lmgr.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/guc.h"
@@ -176,6 +177,16 @@ ReplOriginXactState replorigin_xact_state = {
*/
static ReplicationState *replication_states;
+static void ReplicationOriginShmemRequest(void *arg);
+static void ReplicationOriginShmemInit(void *arg);
+static void ReplicationOriginShmemAttach(void *arg);
+
+const ShmemCallbacks ReplicationOriginShmemCallbacks = {
+ .request_fn = ReplicationOriginShmemRequest,
+ .init_fn = ReplicationOriginShmemInit,
+ .attach_fn = ReplicationOriginShmemAttach,
+};
+
/*
* Actual shared memory block (replication_states[] is now part of this).
*/
@@ -539,50 +550,48 @@ replorigin_by_oid(ReplOriginId roident, bool missing_ok, char **roname)
* ---------------------------------------------------------------------------
*/
-Size
-ReplicationOriginShmemSize(void)
+static void
+ReplicationOriginShmemRequest(void *arg)
{
Size size = 0;
if (max_active_replication_origins == 0)
- return size;
+ return;
size = add_size(size, offsetof(ReplicationStateCtl, states));
-
size = add_size(size,
mul_size(max_active_replication_origins, sizeof(ReplicationState)));
- return size;
+ ShmemRequestStruct(.name = "ReplicationOriginState",
+ .size = size,
+ .ptr = (void **) &replication_states_ctl,
+ );
}
-void
-ReplicationOriginShmemInit(void)
+static void
+ReplicationOriginShmemInit(void *arg)
{
- bool found;
-
if (max_active_replication_origins == 0)
return;
- replication_states_ctl = (ReplicationStateCtl *)
- ShmemInitStruct("ReplicationOriginState",
- ReplicationOriginShmemSize(),
- &found);
replication_states = replication_states_ctl->states;
- if (!found)
- {
- int i;
+ replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
- MemSet(replication_states_ctl, 0, ReplicationOriginShmemSize());
+ for (int i = 0; i < max_active_replication_origins; i++)
+ {
+ LWLockInitialize(&replication_states[i].lock,
+ replication_states_ctl->tranche_id);
+ ConditionVariableInit(&replication_states[i].origin_cv);
+ }
+}
- replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
+static void
+ReplicationOriginShmemAttach(void *arg)
+{
+ if (max_active_replication_origins == 0)
+ return;
- for (i = 0; i < max_active_replication_origins; i++)
- {
- LWLockInitialize(&replication_states[i].lock,
- replication_states_ctl->tranche_id);
- ConditionVariableInit(&replication_states[i].origin_cv);
- }
- }
+ replication_states = replication_states_ctl->states;
}
/* ---------------------------------------------------------------------------
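Note on the attach_fn used for replication origins: this is the only subsystem in this part of the patch with a third callback, and the reason appears to be the process-local replication_states pointer, which is derived from replication_states_ctl rather than filled in by the framework. The sketch below illustrates that split under stated assumptions: the patch text here does not show when attach_fn is invoked, so the sketch assumes it runs in a process that attaches to an already-initialized segment without inheriting the initializing process's variables. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_NORIGINS 4

typedef struct DemoOrigin
{
    int         lock_initialized;
    int         origin_id;
} DemoOrigin;

typedef struct DemoOriginCtl
{
    int         tranche_id;
    DemoOrigin  states[];
} DemoOriginCtl;

/* Assumed to be set by the framework in every process. */
static DemoOriginCtl *demo_ctl = NULL;

/* Derived, process-local convenience pointer; nobody else sets it. */
static DemoOrigin *demo_states = NULL;

/* Stand-in for the shared segment; zeroed like fresh shared memory. */
static DemoOriginCtl *
demo_create_segment(void)
{
    return calloc(1, offsetof(DemoOriginCtl, states) +
                  DEMO_NORIGINS * sizeof(DemoOrigin));
}

/* Runs once: writes shared contents and sets the local pointer. */
static void
demo_origin_init(void)
{
    demo_states = demo_ctl->states;
    demo_ctl->tranche_id = 42;
    for (int i = 0; i < DEMO_NORIGINS; i++)
        demo_states[i].lock_initialized = 1;
}

/* Runs in later processes: must not touch shared contents again. */
static void
demo_origin_attach(void)
{
    demo_states = demo_ctl->states;
}
```

The point of the split is that re-running the init body in an attaching process would reinitialize locks that other processes may already hold, while skipping it entirely would leave the derived pointer unset.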
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index e75db69e3f6..d615ff8a81c 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -73,6 +73,7 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -118,6 +119,14 @@ typedef struct SlotSyncCtxStruct
static SlotSyncCtxStruct *SlotSyncCtx = NULL;
+static void SlotSyncShmemRequest(void *arg);
+static void SlotSyncShmemInit(void *arg);
+
+const ShmemCallbacks SlotSyncShmemCallbacks = {
+ .request_fn = SlotSyncShmemRequest,
+ .init_fn = SlotSyncShmemInit,
+};
+
/* GUC variable */
bool sync_replication_slots = false;
@@ -1828,32 +1837,26 @@ IsSyncingReplicationSlots(void)
}
/*
- * Amount of shared memory required for slot synchronization.
+ * Register shared memory space needed for slot synchronization.
*/
-Size
-SlotSyncShmemSize(void)
+static void
+SlotSyncShmemRequest(void *arg)
{
- return sizeof(SlotSyncCtxStruct);
+ ShmemRequestStruct(.name = "Slot Sync Data",
+ .size = sizeof(SlotSyncCtxStruct),
+ .ptr = (void **) &SlotSyncCtx,
+ );
}
/*
- * Allocate and initialize the shared memory of slot synchronization.
+ * Initialize shared memory for slot synchronization.
*/
-void
-SlotSyncShmemInit(void)
+static void
+SlotSyncShmemInit(void *arg)
{
- Size size = SlotSyncShmemSize();
- bool found;
-
- SlotSyncCtx = (SlotSyncCtxStruct *)
- ShmemInitStruct("Slot Sync Data", size, &found);
-
- if (!found)
- {
- memset(SlotSyncCtx, 0, size);
- SlotSyncCtx->pid = InvalidPid;
- SpinLockInit(&SlotSyncCtx->mutex);
- }
+ memset(SlotSyncCtx, 0, sizeof(SlotSyncCtxStruct));
+ SlotSyncCtx->pid = InvalidPid;
+ SpinLockInit(&SlotSyncCtx->mutex);
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index a9092fc2382..21a213a0ebf 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
@@ -145,6 +146,14 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
+static void ReplicationSlotsShmemRequest(void *arg);
+static void ReplicationSlotsShmemInit(void *arg);
+
+const ShmemCallbacks ReplicationSlotsShmemCallbacks = {
+ .request_fn = ReplicationSlotsShmemRequest,
+ .init_fn = ReplicationSlotsShmemInit,
+};
+
/* My backend's replication slot in the shared memory array */
ReplicationSlot *MyReplicationSlot = NULL;
@@ -183,56 +192,41 @@ static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
/*
- * Report shared-memory space needed by ReplicationSlotsShmemInit.
+ * Register shared memory space needed for replication slots.
*/
-Size
-ReplicationSlotsShmemSize(void)
+static void
+ReplicationSlotsShmemRequest(void *arg)
{
- Size size = 0;
+ Size size;
if (max_replication_slots == 0)
- return size;
+ return;
size = offsetof(ReplicationSlotCtlData, replication_slots);
size = add_size(size,
mul_size(max_replication_slots, sizeof(ReplicationSlot)));
-
- return size;
+ ShmemRequestStruct(.name = "ReplicationSlot Ctl",
+ .size = size,
+ .ptr = (void **) &ReplicationSlotCtl,
+ );
}
/*
- * Allocate and initialize shared memory for replication slots.
+ * Initialize shared memory for replication slots.
*/
-void
-ReplicationSlotsShmemInit(void)
+static void
+ReplicationSlotsShmemInit(void *arg)
{
- bool found;
-
- if (max_replication_slots == 0)
- return;
-
- ReplicationSlotCtl = (ReplicationSlotCtlData *)
- ShmemInitStruct("ReplicationSlot Ctl", ReplicationSlotsShmemSize(),
- &found);
-
- if (!found)
+ for (int i = 0; i < max_replication_slots; i++)
{
- int i;
+ ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
- /* First time through, so initialize */
- MemSet(ReplicationSlotCtl, 0, ReplicationSlotsShmemSize());
-
- for (i = 0; i < max_replication_slots; i++)
- {
- ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
-
- /* everything else is zeroed by the memset above */
- slot->active_proc = INVALID_PROC_NUMBER;
- SpinLockInit(&slot->mutex);
- LWLockInitialize(&slot->io_in_progress_lock,
- LWTRANCHE_REPLICATION_SLOT_IO);
- ConditionVariableInit(&slot->active_cv);
- }
+ /* everything else is expected to be zeroed already */
+ slot->active_proc = INVALID_PROC_NUMBER;
+ SpinLockInit(&slot->mutex);
+ LWLockInitialize(&slot->io_in_progress_lock,
+ LWTRANCHE_REPLICATION_SLOT_IO);
+ ConditionVariableInit(&slot->active_cv);
}
}
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 45b9d4f09f2..4e03e721872 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -29,47 +29,46 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
WalRcvData *WalRcv = NULL;
+static void WalRcvShmemRequest(void *arg);
+static void WalRcvShmemInit(void *arg);
+
+const ShmemCallbacks WalRcvShmemCallbacks = {
+ .request_fn = WalRcvShmemRequest,
+ .init_fn = WalRcvShmemInit,
+};
+
/*
* How long to wait for walreceiver to start up after requesting
* postmaster to launch it. In seconds.
*/
#define WALRCV_STARTUP_TIMEOUT 10
-/* Report shared memory space needed by WalRcvShmemInit */
-Size
-WalRcvShmemSize(void)
+/* Register shared memory space needed by walreceiver */
+static void
+WalRcvShmemRequest(void *arg)
{
- Size size = 0;
-
- size = add_size(size, sizeof(WalRcvData));
-
- return size;
+ ShmemRequestStruct(.name = "Wal Receiver Ctl",
+ .size = sizeof(WalRcvData),
+ .ptr = (void **) &WalRcv,
+ );
}
-/* Allocate and initialize walreceiver-related shared memory */
-void
-WalRcvShmemInit(void)
+/* Initialize walreceiver-related shared memory */
+static void
+WalRcvShmemInit(void *arg)
{
- bool found;
-
- WalRcv = (WalRcvData *)
- ShmemInitStruct("Wal Receiver Ctl", WalRcvShmemSize(), &found);
-
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(WalRcv, 0, WalRcvShmemSize());
- WalRcv->walRcvState = WALRCV_STOPPED;
- ConditionVariableInit(&WalRcv->walRcvStoppedCV);
- SpinLockInit(&WalRcv->mutex);
- pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
- WalRcv->procno = INVALID_PROC_NUMBER;
- }
+ MemSet(WalRcv, 0, sizeof(WalRcvData));
+ WalRcv->walRcvState = WALRCV_STOPPED;
+ ConditionVariableInit(&WalRcv->walRcvStoppedCV);
+ SpinLockInit(&WalRcv->mutex);
+ pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
+ WalRcv->procno = INVALID_PROC_NUMBER;
}
/* Is walreceiver running (or starting up)? */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2bb3f34dc6d..ec39942bfc1 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -86,6 +86,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/dest.h"
#include "tcop/tcopprot.h"
#include "utils/acl.h"
@@ -117,6 +118,14 @@
/* Array of WalSnds in shared memory */
WalSndCtlData *WalSndCtl = NULL;
+static void WalSndShmemRequest(void *arg);
+static void WalSndShmemInit(void *arg);
+
+const ShmemCallbacks WalSndShmemCallbacks = {
+ .request_fn = WalSndShmemRequest,
+ .init_fn = WalSndShmemInit,
+};
+
/* My slot in the shared memory array */
WalSnd *MyWalSnd = NULL;
@@ -3765,47 +3774,37 @@ WalSndSignals(void)
pqsignal(SIGCHLD, SIG_DFL);
}
-/* Report shared-memory space needed by WalSndShmemInit */
-Size
-WalSndShmemSize(void)
+/* Register shared-memory space needed by walsender */
+static void
+WalSndShmemRequest(void *arg)
{
- Size size = 0;
+ Size size;
size = offsetof(WalSndCtlData, walsnds);
size = add_size(size, mul_size(max_wal_senders, sizeof(WalSnd)));
-
- return size;
+ ShmemRequestStruct(.name = "Wal Sender Ctl",
+ .size = size,
+ .ptr = (void **) &WalSndCtl,
+ );
}
-/* Allocate and initialize walsender-related shared memory */
-void
-WalSndShmemInit(void)
+/* Initialize walsender-related shared memory */
+static void
+WalSndShmemInit(void *arg)
{
- bool found;
- int i;
+ for (int i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
+ dlist_init(&(WalSndCtl->SyncRepQueue[i]));
- WalSndCtl = (WalSndCtlData *)
- ShmemInitStruct("Wal Sender Ctl", WalSndShmemSize(), &found);
-
- if (!found)
+ for (int i = 0; i < max_wal_senders; i++)
{
- /* First time through, so initialize */
- MemSet(WalSndCtl, 0, WalSndShmemSize());
-
- for (i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
- dlist_init(&(WalSndCtl->SyncRepQueue[i]));
-
- for (i = 0; i < max_wal_senders; i++)
- {
- WalSnd *walsnd = &WalSndCtl->walsnds[i];
-
- SpinLockInit(&walsnd->mutex);
- }
+ WalSnd *walsnd = &WalSndCtl->walsnds[i];
- ConditionVariableInit(&WalSndCtl->wal_flush_cv);
- ConditionVariableInit(&WalSndCtl->wal_replay_cv);
- ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
+ SpinLockInit(&walsnd->mutex);
}
+
+ ConditionVariableInit(&WalSndCtl->wal_flush_cv);
+ ConditionVariableInit(&WalSndCtl->wal_replay_cv);
+ ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index f64c1d59fa3..bf6b81e621b 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -14,41 +14,16 @@
*/
#include "postgres.h"
-#include "access/clog.h"
-#include "access/commit_ts.h"
-#include "access/multixact.h"
-#include "access/nbtree.h"
-#include "access/subtrans.h"
-#include "access/syncscan.h"
-#include "access/twophase.h"
-#include "access/xlogprefetcher.h"
-#include "access/xlogrecovery.h"
-#include "access/xlogwait.h"
-#include "commands/async.h"
#include "miscadmin.h"
#include "pgstat.h"
-#include "postmaster/autovacuum.h"
-#include "postmaster/bgworker_internals.h"
-#include "postmaster/bgwriter.h"
-#include "postmaster/datachecksum_state.h"
-#include "postmaster/walsummarizer.h"
-#include "replication/logicallauncher.h"
-#include "replication/origin.h"
-#include "replication/slot.h"
-#include "replication/slotsync.h"
-#include "replication/walreceiver.h"
-#include "replication/walsender.h"
-#include "storage/aio_subsys.h"
#include "storage/dsm.h"
#include "storage/ipc.h"
+#include "storage/lock.h"
#include "storage/pg_shmem.h"
-#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/shmem_internal.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
-#include "utils/injection_point.h"
-#include "utils/wait_event.h"
/* GUCs */
int shared_memory_type = DEFAULT_SHARED_MEMORY_TYPE;
@@ -57,8 +32,6 @@ shmem_startup_hook_type shmem_startup_hook = NULL;
static Size total_addin_request = 0;
-static void CreateOrAttachShmemStructs(void);
-
/*
* RequestAddinShmemSpace
* Request that extra shmem space be allocated for use by
@@ -97,33 +70,6 @@ CalculateShmemSize(void)
size = 100000;
size = add_size(size, ShmemGetRequestedSize());
- /* legacy subsystems */
- size = add_size(size, LockManagerShmemSize());
- size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, XLOGShmemSize());
- size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, TwoPhaseShmemSize());
- size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, CheckpointerShmemSize());
- size = add_size(size, AutoVacuumShmemSize());
- size = add_size(size, ReplicationSlotsShmemSize());
- size = add_size(size, ReplicationOriginShmemSize());
- size = add_size(size, WalSndShmemSize());
- size = add_size(size, WalRcvShmemSize());
- size = add_size(size, WalSummarizerShmemSize());
- size = add_size(size, PgArchShmemSize());
- size = add_size(size, ApplyLauncherShmemSize());
- size = add_size(size, BTreeShmemSize());
- size = add_size(size, SyncScanShmemSize());
- size = add_size(size, StatsShmemSize());
- size = add_size(size, WaitEventCustomShmemSize());
- size = add_size(size, InjectionPointShmemSize());
- size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, WaitLSNShmemSize());
- size = add_size(size, LogicalDecodingCtlShmemSize());
- size = add_size(size, DataChecksumsShmemSize());
-
/* include additional requested shmem from preload libraries */
size = add_size(size, total_addin_request);
@@ -157,7 +103,6 @@ AttachSharedMemoryStructs(void)
/* Establish pointers to all shared memory areas in this backend */
ShmemAttachRequested();
- CreateOrAttachShmemStructs();
/*
* Now give loadable modules a chance to set up their shmem allocations
@@ -204,9 +149,6 @@ CreateSharedMemoryAndSemaphores(void)
/* Initialize all shmem areas */
ShmemInitRequested();
- /* Initialize legacy subsystems */
- CreateOrAttachShmemStructs();
-
/* Initialize dynamic shared memory facilities. */
dsm_postmaster_startup(shim);
@@ -237,70 +179,6 @@ RegisterBuiltinShmemCallbacks(void)
#undef PG_SHMEM_SUBSYSTEM
}
-/*
- * Initialize various subsystems, setting up their data structures in
- * shared memory.
- *
- * This is called by the postmaster or by a standalone backend.
- * It is also called by a backend forked from the postmaster in the
- * EXEC_BACKEND case. In the latter case, the shared memory segment
- * already exists and has been physically attached to, but we have to
- * initialize pointers in local memory that reference the shared structures,
- * because we didn't inherit the correct pointer values from the postmaster
- * as we do in the fork() scenario. The easiest way to do that is to run
- * through the same code as before. (Note that the called routines mostly
- * check IsUnderPostmaster, rather than EXEC_BACKEND, to detect this case.
- * This is a bit code-wasteful and could be cleaned up.)
- */
-static void
-CreateOrAttachShmemStructs(void)
-{
- /*
- * Set up xlog, clog, and buffers
- */
- XLOGShmemInit();
- XLogPrefetchShmemInit();
- XLogRecoveryShmemInit();
-
- /*
- * Set up lock manager
- */
- LockManagerShmemInit();
-
- /*
- * Set up process table
- */
- BackendStatusShmemInit();
- TwoPhaseShmemInit();
- BackgroundWorkerShmemInit();
-
- /*
- * Set up interprocess signaling mechanisms
- */
- CheckpointerShmemInit();
- AutoVacuumShmemInit();
- ReplicationSlotsShmemInit();
- ReplicationOriginShmemInit();
- WalSndShmemInit();
- WalRcvShmemInit();
- WalSummarizerShmemInit();
- PgArchShmemInit();
- ApplyLauncherShmemInit();
- SlotSyncShmemInit();
- DataChecksumsShmemInit();
-
- /*
- * Set up other modules that need some shared memory space
- */
- BTreeShmemInit();
- SyncScanShmemInit();
- StatsShmemInit();
- WaitEventCustomShmemInit();
- InjectionPointShmemInit();
- WaitLSNShmemInit();
- LogicalDecodingCtlShmemInit();
-}
-
/*
* InitializeShmemGUCs
*
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 798c453ab38..68d5a0389df 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -43,8 +43,10 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/resowner.h"
@@ -312,6 +314,14 @@ typedef struct
static volatile FastPathStrongRelationLockData *FastPathStrongRelationLocks;
+static void LockManagerShmemRequest(void *arg);
+static void LockManagerShmemInit(void *arg);
+
+const ShmemCallbacks LockManagerShmemCallbacks = {
+ .request_fn = LockManagerShmemRequest,
+ .init_fn = LockManagerShmemInit,
+};
+
/*
* Pointers to hash tables containing lock state
@@ -409,6 +419,7 @@ PROCLOCK_PRINT(const char *where, const PROCLOCK *proclockP)
static uint32 proclock_hash(const void *key, Size keysize);
+
static void RemoveLocalLock(LOCALLOCK *locallock);
static PROCLOCK *SetupLockInTable(LockMethod lockMethodTable, PGPROC *proc,
const LOCKTAG *locktag, uint32 hashcode, LOCKMODE lockmode);
@@ -432,21 +443,15 @@ static void GetSingleProcBlockerStatusData(PGPROC *blocked_proc,
/*
- * Initialize the lock manager's shmem data structures.
+ * Register the lock manager's shmem data structures.
*
- * This is called from CreateSharedMemoryAndSemaphores(), which see for more
- * comments. In the normal postmaster case, the shared hash tables are
- * created here, and backends inherit pointers to them via fork(). In the
- * EXEC_BACKEND case, each backend re-executes this code to obtain pointers to
- * the already existing shared hash tables. In either case, each backend must
- * also call InitLockManagerAccess() to create the locallock hash table.
+ * In addition to this, each backend must also call InitLockManagerAccess() to
+ * create the locallock hash table.
*/
-void
-LockManagerShmemInit(void)
+static void
+LockManagerShmemRequest(void *arg)
{
- HASHCTL info;
int64 max_table_size;
- bool found;
/*
* Compute sizes for lock hashtables. Note that these calculations must
@@ -455,45 +460,48 @@ LockManagerShmemInit(void)
max_table_size = NLOCKENTS();
/*
- * Allocate hash table for LOCK structs. This stores per-locked-object
+ * Hash table for LOCK structs. This stores per-locked-object
* information.
*/
- info.keysize = sizeof(LOCKTAG);
- info.entrysize = sizeof(LOCK);
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodLockHash = ShmemInitHash("LOCK hash",
- max_table_size,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ ShmemRequestHash(.name = "LOCK hash",
+ .nelems = max_table_size,
+ .ptr = &LockMethodLockHash,
+ .hash_info.keysize = sizeof(LOCKTAG),
+ .hash_info.entrysize = sizeof(LOCK),
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION,
+ );
/* Assume an average of 2 holders per lock */
max_table_size *= 2;
- /*
- * Allocate hash table for PROCLOCK structs. This stores
- * per-lock-per-holder information.
- */
- info.keysize = sizeof(PROCLOCKTAG);
- info.entrysize = sizeof(PROCLOCK);
- info.hash = proclock_hash;
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodProcLockHash = ShmemInitHash("PROCLOCK hash",
- max_table_size,
- &info,
- HASH_ELEM | HASH_FUNCTION |
- HASH_FIXED_SIZE | HASH_PARTITION);
+ ShmemRequestHash(.name = "PROCLOCK hash",
+ .nelems = max_table_size,
+ .ptr = &LockMethodProcLockHash,
+ .hash_info.keysize = sizeof(PROCLOCKTAG),
+ .hash_info.entrysize = sizeof(PROCLOCK),
+ .hash_info.hash = proclock_hash,
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION,
+ );
+
+ ShmemRequestStruct(.name = "Fast Path Strong Relation Lock Data",
+ .size = sizeof(FastPathStrongRelationLockData),
+ .ptr = (void **) (void *) &FastPathStrongRelationLocks,
+ );
/*
- * Allocate fast-path structures.
+ * FIXME: we used to do this in the size calculation:
+ *
+ * // Since NLOCKENTS is only an estimate, add 10% safety margin.
+ * size = add_size(size, size / 10);
*/
- FastPathStrongRelationLocks =
- ShmemInitStruct("Fast Path Strong Relation Lock Data",
- sizeof(FastPathStrongRelationLockData), &found);
- if (!found)
- SpinLockInit(&FastPathStrongRelationLocks->mutex);
+}
+
+static void
+LockManagerShmemInit(void *arg)
+{
+ SpinLockInit(&FastPathStrongRelationLocks->mutex);
}
/*
@@ -3758,29 +3766,6 @@ PostPrepare_Locks(FullTransactionId fxid)
}
-/*
- * Estimate shared-memory space used for lock tables
- */
-Size
-LockManagerShmemSize(void)
-{
- Size size = 0;
- long max_table_size;
-
- /* lock hash table */
- max_table_size = NLOCKENTS();
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(LOCK)));
-
- /* proclock hash table */
- max_table_size *= 2;
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(PROCLOCK)));
-
- /* fast-path structures */
- size = add_size(size, sizeof(FastPathStrongRelationLockData));
-
- return size;
-}
-
/*
* GetLockStatusData - Return a summary of the lock manager's internal
* status, for use in a user-level reporting function.
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index cd087129469..4cb9c80a2c5 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -18,7 +18,9 @@
#include "pgstat.h"
#include "storage/ipc.h"
#include "storage/proc.h" /* for MyProc */
#include "storage/procarray.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/ascii.h"
#include "utils/guc.h" /* for application_name */
#include "utils/memutils.h"
@@ -73,133 +75,97 @@ static void pgstat_beshutdown_hook(int code, Datum arg);
static void pgstat_read_current_status(void);
static void pgstat_setup_backend_status_context(void);
+static void BackendStatusShmemRequest(void *arg);
+static void BackendStatusShmemInit(void *arg);
+static void BackendStatusShmemAttach(void *arg);
+
+const ShmemCallbacks BackendStatusShmemCallbacks = {
+ .request_fn = BackendStatusShmemRequest,
+ .init_fn = BackendStatusShmemInit,
+ .attach_fn = BackendStatusShmemAttach,
+};
/*
- * Report shared-memory space needed by BackendStatusShmemInit.
+ * Register shared memory needs for backend status reporting.
*/
-Size
-BackendStatusShmemSize(void)
+static void
+BackendStatusShmemRequest(void *arg)
{
- Size size;
-
- /* BackendStatusArray: */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- /* BackendAppnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendClientHostnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendActivityBuffer: */
- size = add_size(size,
- mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
+ ShmemRequestStruct(.name = "Backend Status Array",
+ .size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendStatusArray,
+ );
+
+ ShmemRequestStruct(.name = "Backend Application Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendAppnameBuffer,
+ );
+
+ ShmemRequestStruct(.name = "Backend Client Host Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendClientHostnameBuffer,
+ );
+
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+ ShmemRequestStruct(.name = "Backend Activity Buffer",
+ .size = BackendActivityBufferSize,
+ .ptr = (void **) &BackendActivityBuffer,
+ );
+
#ifdef USE_SSL
- /* BackendSslStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots));
+ ShmemRequestStruct(.name = "Backend SSL Status Buffer",
+ .size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendSslStatusBuffer,
+ );
#endif
+
#ifdef ENABLE_GSS
- /* BackendGssStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots));
+ ShmemRequestStruct(.name = "Backend GSS Status Buffer",
+ .size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendGssStatusBuffer,
+ );
#endif
- return size;
}
/*
* Initialize the shared status array and several string buffers
* during postmaster startup.
*/
-void
-BackendStatusShmemInit(void)
+static void
+BackendStatusShmemInit(void *arg)
{
- Size size;
- bool found;
int i;
char *buffer;
- /* Create or attach to the shared array */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- BackendStatusArray = (PgBackendStatus *)
- ShmemInitStruct("Backend Status Array", size, &found);
-
- if (!found)
+ /* Initialize st_appname pointers. */
+ buffer = BackendAppnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- /*
- * We're the first - initialize.
- */
- MemSet(BackendStatusArray, 0, size);
- }
-
- /* Create or attach to the shared appname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendAppnameBuffer = (char *)
- ShmemInitStruct("Backend Application Name Buffer", size, &found);
-
- if (!found)
- {
- MemSet(BackendAppnameBuffer, 0, size);
-
- /* Initialize st_appname pointers. */
- buffer = BackendAppnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_appname = buffer;
- buffer += NAMEDATALEN;
- }
+ BackendStatusArray[i].st_appname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared client hostname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendClientHostnameBuffer = (char *)
- ShmemInitStruct("Backend Client Host Name Buffer", size, &found);
-
- if (!found)
+ /* Initialize st_clienthostname pointers. */
+ buffer = BackendClientHostnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendClientHostnameBuffer, 0, size);
-
- /* Initialize st_clienthostname pointers. */
- buffer = BackendClientHostnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_clienthostname = buffer;
- buffer += NAMEDATALEN;
- }
+ BackendStatusArray[i].st_clienthostname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared activity buffer */
- BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
- NumBackendStatSlots);
- BackendActivityBuffer = (char *)
- ShmemInitStruct("Backend Activity Buffer",
- BackendActivityBufferSize,
- &found);
-
- if (!found)
+ /* Initialize st_activity pointers. */
+ buffer = BackendActivityBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendActivityBuffer, 0, BackendActivityBufferSize);
-
- /* Initialize st_activity pointers. */
- buffer = BackendActivityBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_activity_raw = buffer;
- buffer += pgstat_track_activity_query_size;
- }
+ BackendStatusArray[i].st_activity_raw = buffer;
+ buffer += pgstat_track_activity_query_size;
}
#ifdef USE_SSL
- /* Create or attach to the shared SSL status buffer */
- size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots);
- BackendSslStatusBuffer = (PgBackendSSLStatus *)
- ShmemInitStruct("Backend SSL Status Buffer", size, &found);
-
- if (!found)
{
PgBackendSSLStatus *ptr;
- MemSet(BackendSslStatusBuffer, 0, size);
-
/* Initialize st_sslstatus pointers. */
ptr = BackendSslStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -211,17 +177,9 @@ BackendStatusShmemInit(void)
#endif
#ifdef ENABLE_GSS
- /* Create or attach to the shared GSSAPI status buffer */
- size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots);
- BackendGssStatusBuffer = (PgBackendGSSStatus *)
- ShmemInitStruct("Backend GSS Status Buffer", size, &found);
-
- if (!found)
{
PgBackendGSSStatus *ptr;
- MemSet(BackendGssStatusBuffer, 0, size);
-
/* Initialize st_gssstatus pointers. */
ptr = BackendGssStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -233,6 +191,13 @@ BackendStatusShmemInit(void)
#endif
}
+static void
+BackendStatusShmemAttach(void *arg)
+{
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+}
+
/*
* Initialize pgstats backend activity state, and set up our on-proc-exit
* hook. Called from InitPostgres and AuxiliaryProcessMain. MyProcNumber must
diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c
index 33fbdca9609..955faf5ebc7 100644
--- a/src/backend/utils/activity/pgstat_shmem.c
+++ b/src/backend/utils/activity/pgstat_shmem.c
@@ -14,6 +14,7 @@
#include "pgstat.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
@@ -57,6 +58,13 @@ static void pgstat_release_matching_entry_refs(bool discard_pending, ReleaseMatc
static void pgstat_setup_memcxt(void);
+static void StatsShmemRequest(void *arg);
+static void StatsShmemInit(void *arg);
+
+const ShmemCallbacks StatsShmemCallbacks = {
+ .request_fn = StatsShmemRequest,
+ .init_fn = StatsShmemInit,
+};
/* parameter for the shared hash */
static const dshash_parameters dsh_params = {
@@ -123,7 +131,7 @@ pgstat_dsa_init_size(void)
/*
* Compute shared memory space needed for cumulative statistics
*/
-Size
+static Size
StatsShmemSize(void)
{
Size sz;
@@ -149,102 +157,98 @@ StatsShmemSize(void)
return sz;
}
+/*
+ * Register shared memory area for cumulative statistics
+ */
+static void
+StatsShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "Shared Memory Stats",
+ .size = StatsShmemSize(),
+ .ptr = (void **) &pgStatLocal.shmem,
+ );
+}
+
/*
* Initialize cumulative statistics system during startup
*/
-void
-StatsShmemInit(void)
+static void
+StatsShmemInit(void *arg)
{
- bool found;
- Size sz;
+ dsa_area *dsa;
+ dshash_table *dsh;
+ PgStat_ShmemControl *ctl = pgStatLocal.shmem;
+ char *p = (char *) ctl;
- sz = StatsShmemSize();
- pgStatLocal.shmem = (PgStat_ShmemControl *)
- ShmemInitStruct("Shared Memory Stats", sz, &found);
+ /* the allocation of pgStatLocal.shmem itself */
+ p += MAXALIGN(sizeof(PgStat_ShmemControl));
- if (!IsUnderPostmaster)
- {
- dsa_area *dsa;
- dshash_table *dsh;
- PgStat_ShmemControl *ctl = pgStatLocal.shmem;
- char *p = (char *) ctl;
+ /*
+ * Create a small dsa allocation in plain shared memory. This is required
+ * because postmaster cannot use dsm segments. It also provides a small
+ * efficiency win.
+ */
+ ctl->raw_dsa_area = p;
+ dsa = dsa_create_in_place(ctl->raw_dsa_area,
+ pgstat_dsa_init_size(),
+ LWTRANCHE_PGSTATS_DSA, NULL);
+ dsa_pin(dsa);
- Assert(!found);
+ /*
+ * To ensure dshash is created in "plain" shared memory, temporarily limit
+ * size of dsa to the initial size of the dsa.
+ */
+ dsa_set_size_limit(dsa, pgstat_dsa_init_size());
- /* the allocation of pgStatLocal.shmem itself */
- p += MAXALIGN(sizeof(PgStat_ShmemControl));
+ /*
+ * With the limit in place, create the dshash table. XXX: It'd be nice if
+ * there were dshash_create_in_place().
+ */
+ dsh = dshash_create(dsa, &dsh_params, NULL);
+ ctl->hash_handle = dshash_get_hash_table_handle(dsh);
- /*
- * Create a small dsa allocation in plain shared memory. This is
- * required because postmaster cannot use dsm segments. It also
- * provides a small efficiency win.
- */
- ctl->raw_dsa_area = p;
- dsa = dsa_create_in_place(ctl->raw_dsa_area,
- pgstat_dsa_init_size(),
- LWTRANCHE_PGSTATS_DSA, NULL);
- dsa_pin(dsa);
+ /* lift limit set above */
+ dsa_set_size_limit(dsa, -1);
- /*
- * To ensure dshash is created in "plain" shared memory, temporarily
- * limit size of dsa to the initial size of the dsa.
- */
- dsa_set_size_limit(dsa, pgstat_dsa_init_size());
+ /*
+ * Postmaster will never access these again, thus free the local
+ * dsa/dshash references.
+ */
+ dshash_detach(dsh);
+ dsa_detach(dsa);
- /*
- * With the limit in place, create the dshash table. XXX: It'd be nice
- * if there were dshash_create_in_place().
- */
- dsh = dshash_create(dsa, &dsh_params, NULL);
- ctl->hash_handle = dshash_get_hash_table_handle(dsh);
+ pg_atomic_init_u64(&ctl->gc_request_count, 1);
- /* lift limit set above */
- dsa_set_size_limit(dsa, -1);
+ /* Do the per-kind initialization */
+ for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ {
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+ char *ptr;
- /*
- * Postmaster will never access these again, thus free the local
- * dsa/dshash references.
- */
- dshash_detach(dsh);
- dsa_detach(dsa);
+ if (!kind_info)
+ continue;
- pg_atomic_init_u64(&ctl->gc_request_count, 1);
+ /* initialize entry count tracking */
+ if (kind_info->track_entry_count)
+ pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
- /* Do the per-kind initialization */
- for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ /* initialize fixed-numbered stats */
+ if (kind_info->fixed_amount)
{
- const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
- char *ptr;
-
- if (!kind_info)
- continue;
-
- /* initialize entry count tracking */
- if (kind_info->track_entry_count)
- pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
-
- /* initialize fixed-numbered stats */
- if (kind_info->fixed_amount)
+ if (pgstat_is_kind_builtin(kind))
+ ptr = ((char *) ctl) + kind_info->shared_ctl_off;
+ else
{
- if (pgstat_is_kind_builtin(kind))
- ptr = ((char *) ctl) + kind_info->shared_ctl_off;
- else
- {
- int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
-
- Assert(kind_info->shared_size != 0);
- ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
- ptr = ctl->custom_data[idx];
- }
-
- kind_info->init_shmem_cb(ptr);
+ int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
+
+ Assert(kind_info->shared_size != 0);
+ ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
+ ptr = ctl->custom_data[idx];
}
+
+ kind_info->init_shmem_cb(ptr);
}
}
- else
- {
- Assert(found);
- }
}
void
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index 2b76967776c..95635c7f56c 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -25,6 +25,7 @@
#include "storage/lmgr.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/wait_event.h"
@@ -95,59 +96,47 @@ static WaitEventCustomCounterData *WaitEventCustomCounter;
static uint32 WaitEventCustomNew(uint32 classId, const char *wait_event_name);
static const char *GetWaitEventCustomIdentifier(uint32 wait_event_info);
+static void WaitEventCustomShmemRequest(void *arg);
+static void WaitEventCustomShmemInit(void *arg);
+
+const ShmemCallbacks WaitEventCustomShmemCallbacks = {
+ .request_fn = WaitEventCustomShmemRequest,
+ .init_fn = WaitEventCustomShmemInit,
+};
+
/*
- * Return the space for dynamic shared hash tables and dynamic allocation counter.
+ * Register shmem space for dynamic shared hash and dynamic allocation counter.
*/
-Size
-WaitEventCustomShmemSize(void)
+static void
+WaitEventCustomShmemRequest(void *arg)
{
- Size sz;
-
- sz = MAXALIGN(sizeof(WaitEventCustomCounterData));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_SIZE,
- sizeof(WaitEventCustomEntryByInfo)));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_SIZE,
- sizeof(WaitEventCustomEntryByName)));
- return sz;
+ ShmemRequestStruct(.name = "WaitEventCustomCounterData",
+ .size = sizeof(WaitEventCustomCounterData),
+ .ptr = (void **) &WaitEventCustomCounter,
+ );
+ ShmemRequestHash(.name = "WaitEventCustom hash by wait event information",
+ .ptr = &WaitEventCustomHashByInfo,
+ .nelems = WAIT_EVENT_CUSTOM_HASH_SIZE,
+ .hash_info.keysize = sizeof(uint32),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByInfo),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ );
+ ShmemRequestHash(.name = "WaitEventCustom hash by name",
+ .ptr = &WaitEventCustomHashByName,
+ .nelems = WAIT_EVENT_CUSTOM_HASH_SIZE,
+ /* key is a NUL-terminated string */
+ .hash_info.keysize = sizeof(char[NAMEDATALEN]),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByName),
+ .hash_flags = HASH_ELEM | HASH_STRINGS,
+ );
}
-/*
- * Allocate shmem space for dynamic shared hash and dynamic allocation counter.
- */
-void
-WaitEventCustomShmemInit(void)
+static void
+WaitEventCustomShmemInit(void *arg)
{
- bool found;
- HASHCTL info;
-
- WaitEventCustomCounter = (WaitEventCustomCounterData *)
- ShmemInitStruct("WaitEventCustomCounterData",
- sizeof(WaitEventCustomCounterData), &found);
-
- if (!found)
- {
- /* initialize the allocation counter and its spinlock. */
- WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
- SpinLockInit(&WaitEventCustomCounter->mutex);
- }
-
- /* initialize or attach the hash tables to store custom wait events */
- info.keysize = sizeof(uint32);
- info.entrysize = sizeof(WaitEventCustomEntryByInfo);
- WaitEventCustomHashByInfo =
- ShmemInitHash("WaitEventCustom hash by wait event information",
- WAIT_EVENT_CUSTOM_HASH_SIZE,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- /* key is a NULL-terminated string */
- info.keysize = sizeof(char[NAMEDATALEN]);
- info.entrysize = sizeof(WaitEventCustomEntryByName);
- WaitEventCustomHashByName =
- ShmemInitHash("WaitEventCustom hash by name",
- WAIT_EVENT_CUSTOM_HASH_SIZE,
- &info,
- HASH_ELEM | HASH_STRINGS);
+ /* initialize the allocation counter and its spinlock. */
+ WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
+ SpinLockInit(&WaitEventCustomCounter->mutex);
}
/*
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index c06b0e9b800..a7c99e097ea 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -17,6 +17,7 @@
*/
#include "postgres.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
#ifdef USE_INJECTION_POINTS
@@ -109,6 +110,11 @@ typedef struct InjectionPointCacheEntry
static HTAB *InjectionPointCache = NULL;
+#ifdef USE_INJECTION_POINTS
+static void InjectionPointShmemRequest(void *arg);
+static void InjectionPointShmemInit(void *arg);
+#endif
+
/*
* injection_point_cache_add
*
@@ -226,45 +232,34 @@ injection_point_cache_get(const char *name)
}
#endif /* USE_INJECTION_POINTS */
-/*
- * Return the space for dynamic shared hash table.
- */
-Size
-InjectionPointShmemSize(void)
-{
+const ShmemCallbacks InjectionPointShmemCallbacks = {
#ifdef USE_INJECTION_POINTS
- Size sz = 0;
-
- sz = add_size(sz, sizeof(InjectionPointsCtl));
- return sz;
-#else
- return 0;
+ .request_fn = InjectionPointShmemRequest,
+ .init_fn = InjectionPointShmemInit,
#endif
-}
+};
/*
- * Allocate shmem space for dynamic shared hash.
+ * Reserve space for the dynamic shared hash table
*/
-void
-InjectionPointShmemInit(void)
-{
#ifdef USE_INJECTION_POINTS
- bool found;
+static void
+InjectionPointShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "InjectionPoint hash",
+ .size = sizeof(InjectionPointsCtl),
+ .ptr = (void **) &ActiveInjectionPoints,
+ );
+}
- ActiveInjectionPoints = ShmemInitStruct("InjectionPoint hash",
- sizeof(InjectionPointsCtl),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
- for (int i = 0; i < MAX_INJECTION_POINTS; i++)
- pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
- }
- else
- Assert(found);
-#endif
+static void
+InjectionPointShmemInit(void *arg)
+{
+ pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
+ for (int i = 0; i < MAX_INJECTION_POINTS; i++)
+ pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
}
+#endif
/*
* Attach a new injection point.
diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index da7503c57b6..3097e9bb1af 100644
--- a/src/include/access/nbtree.h
+++ b/src/include/access/nbtree.h
@@ -1300,8 +1300,6 @@ extern BTCycleId _bt_vacuum_cycleid(Relation rel);
extern BTCycleId _bt_start_vacuum(Relation rel);
extern void _bt_end_vacuum(Relation rel);
extern void _bt_end_vacuum_callback(int code, Datum arg);
-extern Size BTreeShmemSize(void);
-extern void BTreeShmemInit(void);
extern bytea *btoptions(Datum reloptions, bool validate);
extern bool btproperty(Oid index_oid, int attno,
IndexAMProperty prop, const char *propname,
diff --git a/src/include/access/syncscan.h b/src/include/access/syncscan.h
index 24cf33294e5..32f8332aaee 100644
--- a/src/include/access/syncscan.h
+++ b/src/include/access/syncscan.h
@@ -24,7 +24,5 @@ extern PGDLLIMPORT bool trace_syncscan;
extern void ss_report_location(Relation rel, BlockNumber location);
extern BlockNumber ss_get_location(Relation rel, BlockNumber relnblocks);
-extern void SyncScanShmemInit(void);
-extern Size SyncScanShmemSize(void);
#endif
diff --git a/src/include/access/twophase.h b/src/include/access/twophase.h
index 761d56a5f3d..1d2ff42c9b7 100644
--- a/src/include/access/twophase.h
+++ b/src/include/access/twophase.h
@@ -33,9 +33,6 @@ typedef struct GlobalTransactionData *GlobalTransaction;
/* GUC variable */
extern PGDLLIMPORT int max_prepared_xacts;
-extern Size TwoPhaseShmemSize(void);
-extern void TwoPhaseShmemInit(void);
-
extern void AtAbort_Twophase(void);
extern void PostPrepare_Twophase(void);
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 4af38e74ce4..437b4f32349 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -259,8 +259,6 @@ extern void InitLocalDataChecksumState(void);
extern void SetLocalDataChecksumState(uint32 data_checksum_version);
extern bool GetDefaultCharSignedness(void);
extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
-extern Size XLOGShmemSize(void);
-extern void XLOGShmemInit(void);
extern void BootStrapXLOG(uint32 data_checksum_version);
extern void InitializeWalConsistencyChecking(void);
extern void LocalProcessControlFile(bool reset);
diff --git a/src/include/access/xlogprefetcher.h b/src/include/access/xlogprefetcher.h
index 7ec40c4b78b..56a81676d92 100644
--- a/src/include/access/xlogprefetcher.h
+++ b/src/include/access/xlogprefetcher.h
@@ -34,9 +34,6 @@ typedef struct XLogPrefetcher XLogPrefetcher;
extern void XLogPrefetchReconfigure(void);
-extern size_t XLogPrefetchShmemSize(void);
-extern void XLogPrefetchShmemInit(void);
-
extern void XLogPrefetchResetStats(void);
extern XLogPrefetcher *XLogPrefetcherAllocate(XLogReaderState *reader);
diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h
index 2842106b285..ba7750dca0b 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -153,9 +153,6 @@ extern PGDLLIMPORT bool reachedConsistency;
/* Are we currently in standby mode? */
extern PGDLLIMPORT bool StandbyMode;
-extern Size XLogRecoveryShmemSize(void);
-extern void XLogRecoveryShmemInit(void);
-
extern void InitWalRecovery(ControlFileData *ControlFile,
bool *wasShutdown_ptr, bool *haveBackupLabel_ptr,
bool *haveTblspcMap_ptr);
diff --git a/src/include/access/xlogwait.h b/src/include/access/xlogwait.h
index d12531d32b8..07157f220ea 100644
--- a/src/include/access/xlogwait.h
+++ b/src/include/access/xlogwait.h
@@ -100,8 +100,6 @@ typedef struct WaitLSNState
extern PGDLLIMPORT WaitLSNState *waitLSNState;
-extern Size WaitLSNShmemSize(void);
-extern void WaitLSNShmemInit(void);
extern XLogRecPtr GetCurrentLSNForWaitType(WaitLSNType lsnType);
extern void WaitLSNWakeup(WaitLSNType lsnType, XLogRecPtr currentLSN);
extern void WaitLSNCleanup(void);
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 8e3549c3752..2786a7c5ffb 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -541,10 +541,6 @@ typedef struct PgStat_BackendPending
* Functions in pgstat.c
*/
-/* functions called from postmaster */
-extern Size StatsShmemSize(void);
-extern void StatsShmemInit(void);
-
/* Functions called during server startup / shutdown */
extern void pgstat_restore_stats(void);
extern void pgstat_discard_stats(void);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index b21d111d4d5..8954f6b28ee 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,8 +66,4 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
-/* shared memory stuff */
-extern Size AutoVacuumShmemSize(void);
-extern void AutoVacuumShmemInit(void);
-
#endif /* AUTOVACUUM_H */
diff --git a/src/include/postmaster/bgworker_internals.h b/src/include/postmaster/bgworker_internals.h
index b789caf4034..b6261bc01df 100644
--- a/src/include/postmaster/bgworker_internals.h
+++ b/src/include/postmaster/bgworker_internals.h
@@ -41,8 +41,6 @@ typedef struct RegisteredBgWorker
extern PGDLLIMPORT dlist_head BackgroundWorkerList;
-extern Size BackgroundWorkerShmemSize(void);
-extern void BackgroundWorkerShmemInit(void);
extern void BackgroundWorkerStateChange(bool allow_new_workers);
extern void ForgetBackgroundWorker(RegisteredBgWorker *rw);
extern void ReportBackgroundWorkerPID(RegisteredBgWorker *rw);
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 47470cba893..36eea0b1ab0 100644
--- a/src/include/postmaster/bgwriter.h
+++ b/src/include/postmaster/bgwriter.h
@@ -39,9 +39,6 @@ extern bool ForwardSyncRequest(const FileTag *ftag, SyncRequestType type);
extern void AbsorbSyncRequests(void);
-extern Size CheckpointerShmemSize(void);
-extern void CheckpointerShmemInit(void);
-
extern bool FirstCallSinceLastCheckpoint(void);
#endif /* _BGWRITER_H */
diff --git a/src/include/postmaster/datachecksum_state.h b/src/include/postmaster/datachecksum_state.h
index 343494edcc8..05625539604 100644
--- a/src/include/postmaster/datachecksum_state.h
+++ b/src/include/postmaster/datachecksum_state.h
@@ -17,10 +17,6 @@
#include "storage/procsignal.h"
-/* Shared memory */
-extern Size DataChecksumsShmemSize(void);
-extern void DataChecksumsShmemInit(void);
-
/* Possible operations the Datachecksumsworker can perform */
typedef enum DataChecksumsWorkerOperation
{
diff --git a/src/include/postmaster/pgarch.h b/src/include/postmaster/pgarch.h
index faa7609cd81..9772bb573a1 100644
--- a/src/include/postmaster/pgarch.h
+++ b/src/include/postmaster/pgarch.h
@@ -26,8 +26,6 @@
#define MAX_XFN_CHARS 40
#define VALID_XFN_CHARS "0123456789ABCDEF.history.backup.partial"
-extern Size PgArchShmemSize(void);
-extern void PgArchShmemInit(void);
extern bool PgArchCanRestart(void);
pg_noreturn extern void PgArchiverMain(const void *startup_data, size_t startup_data_len);
extern void PgArchWakeup(void);
diff --git a/src/include/postmaster/walsummarizer.h b/src/include/postmaster/walsummarizer.h
index a4c055066b4..b9a755fadbc 100644
--- a/src/include/postmaster/walsummarizer.h
+++ b/src/include/postmaster/walsummarizer.h
@@ -19,8 +19,6 @@
extern PGDLLIMPORT bool summarize_wal;
extern PGDLLIMPORT int wal_summary_keep_time;
-extern Size WalSummarizerShmemSize(void);
-extern void WalSummarizerShmemInit(void);
pg_noreturn extern void WalSummarizerMain(const void *startup_data, size_t startup_data_len);
extern void GetWalSummarizerState(TimeLineID *summarized_tli,
diff --git a/src/include/replication/logicalctl.h b/src/include/replication/logicalctl.h
index 495554c532c..0bc1302f130 100644
--- a/src/include/replication/logicalctl.h
+++ b/src/include/replication/logicalctl.h
@@ -14,8 +14,6 @@
#ifndef LOGICALCTL_H
#define LOGICALCTL_H
-extern Size LogicalDecodingCtlShmemSize(void);
-extern void LogicalDecodingCtlShmemInit(void);
extern void StartupLogicalDecodingStatus(bool last_status);
extern void InitializeProcessXLogLogicalInfo(void);
extern bool ProcessBarrierUpdateXLogLogicalInfo(void);
diff --git a/src/include/replication/logicallauncher.h b/src/include/replication/logicallauncher.h
index 504b710536a..5f0c1b9c682 100644
--- a/src/include/replication/logicallauncher.h
+++ b/src/include/replication/logicallauncher.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT int max_parallel_apply_workers_per_subscription;
extern void ApplyLauncherRegister(void);
extern void ApplyLauncherMain(Datum main_arg);
-extern Size ApplyLauncherShmemSize(void);
-extern void ApplyLauncherShmemInit(void);
-
extern void ApplyLauncherForgetWorkerStartTime(Oid subid);
extern void ApplyLauncherWakeupAtCommit(void);
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index eb46b41b4b7..a69faf6eaaf 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -84,8 +84,4 @@ extern void replorigin_redo(XLogReaderState *record);
extern void replorigin_desc(StringInfo buf, XLogReaderState *record);
extern const char *replorigin_identify(uint8 info);
-/* shared memory allocation */
-extern Size ReplicationOriginShmemSize(void);
-extern void ReplicationOriginShmemInit(void);
-
#endif /* PG_ORIGIN_H */
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 4b4709f6e2c..1a3557de607 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -327,10 +327,6 @@ extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
extern PGDLLIMPORT int idle_replication_slot_timeout_secs;
-/* shmem initialization functions */
-extern Size ReplicationSlotsShmemSize(void);
-extern void ReplicationSlotsShmemInit(void);
-
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
diff --git a/src/include/replication/slotsync.h b/src/include/replication/slotsync.h
index e546d0d050d..d2121cd3ed7 100644
--- a/src/include/replication/slotsync.h
+++ b/src/include/replication/slotsync.h
@@ -31,8 +31,6 @@ pg_noreturn extern void ReplSlotSyncWorkerMain(const void *startup_data, size_t
extern void ShutDownSlotSync(void);
extern bool SlotSyncWorkerCanRestart(void);
extern bool IsSyncingReplicationSlots(void);
-extern Size SlotSyncShmemSize(void);
-extern void SlotSyncShmemInit(void);
extern void SyncReplicationSlots(WalReceiverConn *wrconn);
#endif /* SLOTSYNC_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 85d24c87298..47c07574d4d 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -491,8 +491,6 @@ pg_noreturn extern void WalReceiverMain(const void *startup_data, size_t startup
extern void WalRcvRequestApplyReply(void);
/* prototypes for functions in walreceiverfuncs.c */
-extern Size WalRcvShmemSize(void);
-extern void WalRcvShmemInit(void);
extern void ShutdownWalRcv(void);
extern bool WalRcvStreaming(void);
extern bool WalRcvRunning(void);
diff --git a/src/include/replication/walsender.h b/src/include/replication/walsender.h
index a4df3b8e0ae..8952c848d19 100644
--- a/src/include/replication/walsender.h
+++ b/src/include/replication/walsender.h
@@ -41,8 +41,6 @@ extern void WalSndErrorCleanup(void);
extern void PhysicalWakeupLogicalWalSnd(void);
extern XLogRecPtr GetStandbyFlushRecPtr(TimeLineID *tli);
extern void WalSndSignals(void);
-extern Size WalSndShmemSize(void);
-extern void WalSndShmemInit(void);
extern void WalSndWakeup(bool physical, bool logical);
extern void WalSndInitStopping(void);
extern void WalSndWaitStopping(void);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index fa68e6ecece..ee3cb1dc203 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -375,8 +375,6 @@ typedef enum
/*
* function prototypes
*/
-extern void LockManagerShmemInit(void);
-extern Size LockManagerShmemSize(void);
extern void InitLockManagerAccess(void);
extern LockMethod GetLocksMethodTable(const LOCK *lock);
extern LockMethod GetLockTagsMethodTable(const LOCKTAG *locktag);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index d8e11756a61..5e092552c72 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -32,6 +32,9 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogPrefetchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogRecoveryShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
@@ -40,12 +43,18 @@ PG_SHMEM_SUBSYSTEM(BufferManagerShmemCallbacks)
PG_SHMEM_SUBSYSTEM(StrategyCtlShmemCallbacks)
PG_SHMEM_SUBSYSTEM(BufTableShmemCallbacks)
+/* lock manager */
+PG_SHMEM_SUBSYSTEM(LockManagerShmemCallbacks)
+
/* predicate lock manager */
PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackendStatusShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(TwoPhaseShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackgroundWorkerShmemCallbacks)
/* shared-inval messaging */
PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
@@ -53,9 +62,27 @@ PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
/* interprocess signaling mechanisms */
PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CheckpointerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(AutoVacuumShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationSlotsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationOriginShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSndShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalRcvShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSummarizerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(PgArchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ApplyLauncherShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SlotSyncShmemCallbacks)
/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(BTreeShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SyncScanShmemCallbacks)
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StatsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WaitLSNShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(LogicalDecodingCtlShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(DataChecksumsShmemCallbacks)
/* AIO subsystem. This delegates to the method-specific callbacks */
PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index ddd06304e97..a334e096e4a 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -298,14 +298,6 @@ extern PGDLLIMPORT int pgstat_track_activity_query_size;
extern PGDLLIMPORT PgBackendStatus *MyBEEntry;
-/* ----------
- * Functions called from postmaster
- * ----------
- */
-extern Size BackendStatusShmemSize(void);
-extern void BackendStatusShmemInit(void);
-
-
/* ----------
* Functions called from backends
* ----------
diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index 27a2526524f..fabd1455c3c 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -46,9 +46,6 @@ typedef void (*InjectionPointCallback) (const char *name,
const void *private_data,
void *arg);
-extern Size InjectionPointShmemSize(void);
-extern void InjectionPointShmemInit(void);
-
extern void InjectionPointAttach(const char *name,
const char *library,
const char *function,
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 34c27cc3dc3..86ee348220d 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -42,8 +42,6 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
extern uint32 WaitEventExtensionNew(const char *wait_event_name);
extern uint32 WaitEventInjectionPointNew(const char *wait_event_name);
-extern void WaitEventCustomShmemInit(void);
-extern Size WaitEventCustomShmemSize(void);
extern char **GetWaitEventCustomNames(uint32 classId, int *nwaitevents);
/* ----------
diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index d59c5ad0582..0f1af513673 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -107,9 +107,13 @@ extern PGDLLEXPORT void injection_wait(const char *name,
/* track if injection points attached in this process are linked to it */
static bool injection_point_local = false;
-/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void injection_shmem_request(void *arg);
+static void injection_shmem_init(void *arg);
+
+static const ShmemCallbacks injection_shmem_callbacks = {
+ .request_fn = injection_shmem_request,
+ .init_fn = injection_shmem_init,
+};
/*
* Routine for shared memory area initialization, used as a callback
@@ -126,44 +130,23 @@ injection_point_init_state(void *ptr, void *arg)
ConditionVariableInit(&state->wait_point);
}
-/* Shared memory initialization when loading module */
static void
-injection_shmem_request(void)
+injection_shmem_request(void *arg)
{
- Size size;
-
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- size = MAXALIGN(sizeof(InjectionPointSharedState));
- RequestAddinShmemSpace(size);
+ ShmemRequestStruct(.name = "injection_points",
+ .size = sizeof(InjectionPointSharedState),
+ .ptr = (void **) &inj_state,
+ );
}
static void
-injection_shmem_startup(void)
+injection_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_state = ShmemInitStruct("injection_points",
- sizeof(InjectionPointSharedState),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. This is shared with the dynamic
- * initialization using a DSM.
- */
- injection_point_init_state(inj_state, NULL);
- }
-
- LWLockRelease(AddinShmemInitLock);
+ /*
+ * First time through, so initialize. This is shared with the dynamic
+ * initialization using a DSM.
+ */
+ injection_point_init_state(inj_state, NULL);
}
/*
@@ -601,9 +584,5 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- /* Shared memory initialization */
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = injection_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = injection_shmem_startup;
+ RegisterShmemCallbacks(&injection_shmem_callbacks);
}
diff --git a/src/test/modules/test_aio/test_aio.c b/src/test/modules/test_aio/test_aio.c
index d7530681192..35efba1a5e3 100644
--- a/src/test/modules/test_aio/test_aio.c
+++ b/src/test/modules/test_aio/test_aio.c
@@ -28,7 +28,6 @@
#include "storage/bufmgr.h"
#include "storage/checksum.h"
#include "storage/condition_variable.h"
-#include "storage/ipc.h"
#include "storage/lwlock.h"
#include "storage/proc.h"
#include "storage/procnumber.h"
@@ -44,6 +43,7 @@
PG_MODULE_MAGIC;
+/* In shared memory */
typedef struct InjIoErrorState
{
ConditionVariable cv;
@@ -74,8 +74,15 @@ typedef struct BlocksReadStreamData
static InjIoErrorState *inj_io_error_state;
/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void test_aio_shmem_request(void *arg);
+static void test_aio_shmem_init(void *arg);
+static void test_aio_shmem_attach(void *arg);
+
+static const ShmemCallbacks inj_io_shmem_callbacks = {
+ .request_fn = test_aio_shmem_request,
+ .init_fn = test_aio_shmem_init,
+ .attach_fn = test_aio_shmem_attach,
+};
static PgAioHandle *last_handle;
@@ -83,70 +90,55 @@ static PgAioHandle *last_handle;
static void
-test_aio_shmem_request(void)
+test_aio_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- RequestAddinShmemSpace(sizeof(InjIoErrorState));
+ ShmemRequestStruct(.name = "test_aio injection points",
+ .size = sizeof(InjIoErrorState),
+ .ptr = (void **) &inj_io_error_state,
+ );
}
static void
-test_aio_shmem_startup(void)
+test_aio_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_io_error_state = ShmemInitStruct("injection_points",
- sizeof(InjIoErrorState),
- &found);
-
- if (!found)
- {
- /* First time through, initialize */
- inj_io_error_state->enabled_short_read = false;
- inj_io_error_state->enabled_reopen = false;
- inj_io_error_state->enabled_completion_wait = false;
+ /* First time through, initialize */
+ inj_io_error_state->enabled_short_read = false;
+ inj_io_error_state->enabled_reopen = false;
+ inj_io_error_state->enabled_completion_wait = false;
- ConditionVariableInit(&inj_io_error_state->cv);
- inj_io_error_state->completion_wait_event = WaitEventInjectionPointNew("completion_wait");
+ ConditionVariableInit(&inj_io_error_state->cv);
+ inj_io_error_state->completion_wait_event = WaitEventInjectionPointNew("completion_wait");
#ifdef USE_INJECTION_POINTS
- InjectionPointAttach("aio-process-completion-before-shared",
- "test_aio",
- "inj_io_completion_hook",
- NULL,
- 0);
- InjectionPointLoad("aio-process-completion-before-shared");
-
- InjectionPointAttach("aio-worker-after-reopen",
- "test_aio",
- "inj_io_reopen",
- NULL,
- 0);
- InjectionPointLoad("aio-worker-after-reopen");
+ InjectionPointAttach("aio-process-completion-before-shared",
+ "test_aio",
+ "inj_io_completion_hook",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-process-completion-before-shared");
+
+ InjectionPointAttach("aio-worker-after-reopen",
+ "test_aio",
+ "inj_io_reopen",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-worker-after-reopen");
#endif
- }
- else
- {
- /*
- * Pre-load the injection points now, so we can call them in a
- * critical section.
- */
+}
+
+static void
+test_aio_shmem_attach(void *arg)
+{
+ /*
+ * Pre-load the injection points now, so we can call them in a critical
+ * section.
+ */
#ifdef USE_INJECTION_POINTS
- InjectionPointLoad("aio-process-completion-before-shared");
- InjectionPointLoad("aio-worker-after-reopen");
- elog(LOG, "injection point loaded");
+ InjectionPointLoad("aio-process-completion-before-shared");
+ InjectionPointLoad("aio-worker-after-reopen");
+ elog(LOG, "injection point loaded");
#endif
- }
-
- LWLockRelease(AddinShmemInitLock);
}
void
@@ -155,10 +147,7 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_aio_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_aio_shmem_startup;
+ RegisterShmemCallbacks(&inj_io_shmem_callbacks);
}
--
2.34.1
[text/x-patch] v20260405_2-0015-resizable-shared-memory-structures.patch (66.5K, 16-v20260405_2-0015-resizable-shared-memory-structures.patch)
From 09de808f14d11d26c45016ab9d81e911ef1e444c Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Tue, 17 Feb 2026 16:51:20 +0530
Subject: [PATCH v20260405 15/15] resizable shared memory structures
Resizable shared memory structures can be allocated by specifying a new
member ShmemStructOpts::maximum_size. At startup, or when the
structure is created, we reserve address space worth maximum_size in the
shared memory segment. The subsystem which creates the structure is
expected to initialize only the initial size worth of memory when
creating it. In mmap'ed memory, this should allocate only the initial
size worth of memory, not maximum_size. As the structure is resized
using ShmemResizeStruct(), memory is freed or allocated in units of
memory pages when shrinking or expanding the structure, respectively.
The resizable shared memory feature depends on the existence of the
function madvise() and the constants MADV_REMOVE and MADV_POPULATE_WRITE.
On platforms which do not have these, we disable this feature at
compile time. The commit introduces a compile time flag
HAVE_RESIZABLE_SHMEM which is defined if MADV_REMOVE and
MADV_POPULATE_WRITE exist. We don't check for the existence of madvise
separately, since existence of the constants implies existence of the
function.
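For readers unfamiliar with these madvise() modes, here is a minimal
sketch of page-granular grow/shrink logic of the kind described above.
It is not the code from the patch: page_floor(), page_ceil() and
resize_backing() are hypothetical names, base is assumed to be page
aligned and to point into a MAP_SHARED mapping that already reserves the
maximum size, and the constant is spelled MADV_POPULATE_WRITE as in the
configure.ac check.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Round v down to a multiple of page (page must be non-zero). */
static size_t
page_floor(size_t v, size_t page)
{
	return v - (v % page);
}

/* Round v up to a multiple of page. */
static size_t
page_ceil(size_t v, size_t page)
{
	return page_floor(v + page - 1, page);
}

/*
 * Hypothetical helper, not taken from the patch: adjust the memory backing
 * [base, base + old_size) so that it covers new_size bytes instead.
 * Returns 0 on success and -1 on failure.
 */
static int
resize_backing(char *base, size_t old_size, size_t new_size, size_t page)
{
	size_t		old_end = page_ceil(old_size, page);
	size_t		new_end = page_ceil(new_size, page);

	if (new_end > old_end)
	{
#ifdef MADV_POPULATE_WRITE
		/* Growing: fault in the newly covered whole pages up front. */
		return madvise(base + old_end, new_end - old_end,
					   MADV_POPULATE_WRITE);
#else
		return 0;				/* pages get faulted in on first touch */
#endif
	}

	if (new_end < old_end)
	{
#ifdef MADV_REMOVE
		/* Shrinking: release only pages lying wholly beyond new_size. */
		return madvise(base + new_end, old_end - new_end, MADV_REMOVE);
#else
		return -1;				/* no way to give the memory back */
#endif
	}

	return 0;					/* same set of pages; nothing to do */
}
```

With 4096-byte pages, resizing from 5000 to 6000 bytes touches no pages
at all, since both sizes round up to 8192; shrinking from 20000 to 5000
bytes would release the range [8192, 20480) relative to base.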
HAVE_RESIZABLE_SHMEM is not defined in EXEC_BACKEND builds since that's
largely used for Windows where the APIs to free and allocate memory from
and to a given address space are not known to the author right now.
Given that PostgreSQL is used widely on Linux, providing this feature on
Linux benefits most of its users. Once we figure out the required
Windows APIs, we will support this feature on Windows as well.
The feature is also not available when Sys-V shared memory is used even
on Linux since we do not know whether required Sys-V APIs exist; mostly
they don't. Since that combination is only available for development and
testing, not supporting the feature there isn't going to impact
PostgreSQL users.
Using HAVE_RESIZABLE_SHMEM we disable compiling the code related to
resizable shared memory structures on the platforms which do not support
the feature. But we also have run time checks to disable this feature
when Sys-V shared memory is used. In order to know whether a given
running server instance supports resizable structures, we have
introduced the GUC have_resizable_shmem.
Author: Ashutosh Bapat <[email protected]>
---
configure.ac | 4 +
doc/src/sgml/config.sgml | 15 +
doc/src/sgml/system-views.sgml | 30 +-
doc/src/sgml/xfunc.sgml | 54 +++
meson.build | 16 +
src/backend/port/sysv_shmem.c | 69 ++++
src/backend/port/win32_shmem.c | 23 ++
src/backend/storage/ipc/shmem.c | 269 ++++++++++++--
src/backend/utils/misc/guc_parameters.dat | 7 +
src/backend/utils/misc/guc_tables.c | 7 +
src/include/catalog/pg_proc.dat | 4 +-
src/include/pg_config.h.in | 8 +
src/include/pg_config_manual.h | 9 +
src/include/storage/pg_shmem.h | 5 +
src/include/storage/shmem.h | 15 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/resizable_shmem/Makefile | 25 ++
src/test/modules/resizable_shmem/meson.build | 36 ++
.../resizable_shmem/resizable_shmem--1.0.sql | 37 ++
.../modules/resizable_shmem/resizable_shmem.c | 332 ++++++++++++++++++
.../resizable_shmem/resizable_shmem.control | 4 +
.../resizable_shmem/t/001_resizable_shmem.pl | 239 +++++++++++++
.../test_shmem/t/001_late_shmem_alloc.pl | 23 ++
.../modules/test_shmem/test_shmem--1.0.sql | 4 +
src/test/modules/test_shmem/test_shmem.c | 20 ++
src/test/regress/expected/rules.out | 6 +-
src/tools/pgindent/typedefs.list | 1 +
28 files changed, 1228 insertions(+), 36 deletions(-)
create mode 100644 src/test/modules/resizable_shmem/Makefile
create mode 100644 src/test/modules/resizable_shmem/meson.build
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.c
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.control
create mode 100644 src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
diff --git a/configure.ac b/configure.ac
index ff5dd64468e..7acd844ccb2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1895,6 +1895,10 @@ AC_CHECK_DECLS([memset_s], [], [], [#define __STDC_WANT_LIB_EXT1__ 1
# This is probably only present on macOS, but may as well check always
AC_CHECK_DECLS(F_FULLFSYNC, [], [], [#include <fcntl.h>])
+# Linux-specific madvise constants needed for resizable shared memory. See similar checks in meson.build for explanation of why these checks are here.
+AC_CHECK_DECLS([MADV_POPULATE_WRITE], [], [], [#include <sys/mman.h>])
+AC_CHECK_DECLS([MADV_REMOVE], [], [], [#include <sys/mman.h>])
+
AC_REPLACE_FUNCS(m4_normalize([
explicit_bzero
getopt
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d3fea738ca3..a42a173445e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -12072,6 +12072,21 @@ dynamic_library_path = '/usr/local/lib/postgresql:$libdir'
</listitem>
</varlistentry>
+ <varlistentry id="guc-have-resizable-shmem" xreflabel="have_resizable_shmem">
+ <term><varname>have_resizable_shmem</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>have_resizable_shmem</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Reports whether <productname>PostgreSQL</productname> has been built
+ with <literal>HAVE_RESIZABLE_SHMEM</literal> enabled and supports
+ <link linkend="xfunc-shared-addin-resizable">Resizable shared memory structures</link>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages-status" xreflabel="huge_pages_status">
<term><varname>huge_pages_status</varname> (<type>enum</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 2ebec6928d5..9717f8434bb 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4243,8 +4243,34 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
Size of the allocation in bytes including padding. For anonymous
allocations, no information about padding is available, so the
<literal>size</literal> and <literal>allocated_size</literal> columns
- will always be equal. Padding is not meaningful for free memory, so
- the columns will be equal in that case also.
+ will always be equal. Padding is not meaningful for free memory, so the
+ columns will be equal in that case also. For resizable allocations which
+ may span multiple memory pages, the padding includes the padding due to
+ page alignment.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>maximum_size</structfield> <type>int8</type>
+ </para>
+ <para>
+ Maximum size in bytes that the resizable allocation can grow to. Zero for
+ fixed-size allocations, for anonymous allocations, and for free memory.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>reserved_space</structfield> <type>int8</type>
+ </para>
+ <para>
+ Address space reserved for the allocation in bytes. For resizable
+ structures, this is the total address space reserved to accommodate
+ growth up to <structfield>maximum_size</structfield>, and is greater
+ than or equal to <structfield>allocated_size</structfield>. For
+ fixed-size allocations, anonymous allocations, and free memory this
+ is the same as <structfield>allocated_size</structfield>.
</para></entry>
</row>
</tbody>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index aed3f2f0071..c2a0c5136af 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3748,6 +3748,60 @@ my_shmem_init(void *arg)
</para>
</sect3>
+ <sect3 id="xfunc-shared-addin-resizable">
+ <title>Resizable shared memory structures</title>
+
+ <para>
+ A resizable memory structure can be requested using
+ <function>ShmemRequestStruct</function> by passing
+ <parameter>.maximum_size</parameter> along with
+ <parameter>.size</parameter>. <parameter>.maximum_size</parameter> is
+ the maximum size up to which the structure can grow, whereas
+ <parameter>.size</parameter> is the initial size of the structure. While
+ contiguous address space worth <parameter>maximum_size</parameter> is
+ allocated to the structure, only memory worth <parameter>size</parameter>
+ bytes is allocated initially. The <function>init_fn</function> should only
+ initialize the <parameter>size</parameter> amount of memory. The actual
+ memory allocated to this structure at any point in time is given by <link
+ linkend="view-pg-shmem-allocations"><structname>pg_shmem_allocations</structname>.<structfield>allocated_size</structfield></link>
+ and the address space reserved for this structure is given by <link
+ linkend="view-pg-shmem-allocations"><structname>pg_shmem_allocations</structname>.<structfield>reserved_space</structfield></link>.
+ </para>
+
+ <para>
+ The structure can be resized using <function>ShmemResizeStruct</function> by
+ passing it the name of the structure and the
+ new size, which must be greater than 0 and at most
+ <parameter>maximum_size</parameter>. If the new size is smaller than the
+ current size of the structure, the memory between the new size and current
+ size is freed while keeping the contents of the memory up to the new size intact.
+ If the new size is greater than the current size, memory is allocated up to
+ the new size while keeping the current contents of the structure intact. The
+ starting address of the structure does not change as a result of the resizing
+ operation. The caller may need to take care of the additional
+ synchronization between the resizing process and the processes using the
+ shared structure. Also, accessing the memory beyond the current size of the
+ structure will not cause a segmentation fault or a bus error. Memory will
+ be allocated during such a write access, and zeros will be returned on such a read
+ access if memory is not allocated yet. The additional synchronization may
+ use <function>mprotect()</function> with <symbol>PROT_NONE</symbol> in every backend that may access this memory
+ to ensure that such an access results in a fault.
+ </para>
+
+ <para>
+ This functionality is available only on the platforms which provide the APIs
+ necessary to reserve contiguous address space and to allocate or free memory
+ in that address space on demand. Macro <symbol>HAVE_RESIZABLE_SHMEM</symbol>
+ is defined on such platforms. It can be used to guard code related to
+ resizing a shared memory structure. The functionality is available only with
+ mmap'ed memory, so subsystems which use resizable structures may have to
+ additionally disable resizable memory usage when <symbol>shared_memory_type</symbol> is not
+ <symbol>SHMEM_TYPE_MMAP</symbol>. A GUC <xref linkend="guc-have-resizable-shmem"/> is set to
+ <literal>on</literal> when this functionality is available in a running
+ server, <literal>off</literal> otherwise.
+ </para>
+ </sect3>
+
<sect3 id="xfunc-shared-addin-dynamic">
<title>Allocating Dynamic Shared Memory After Startup</title>
diff --git a/meson.build b/meson.build
index 43d5ffc30b1..790845762e1 100644
--- a/meson.build
+++ b/meson.build
@@ -2904,6 +2904,22 @@ decl_checks = [
['timingsafe_bcmp', 'string.h'],
]
+# Linux-specific madvise constants needed for resizable shared memory.
+# Usually we use AC_CHECK_DECLS to check for function declarations, but in this
+# case we are using it to detect existence of constants. These constants are
+# used to define HAVE_RESIZABLE_SHMEM which is used in storage/pg_shmem.h as
+# well as storage/shmem.h. The first abstracts the APIs to allocate shared
+# memory segments from the operating system whereas the second abstracts APIs to
+# allocate shared memory to various subsystems. Since they are related but
+# orthogonal to each other, including any one of them in the other file doesn't
+# make sense. pg_config_manual.h is the only place where HAVE_RESIZABLE_SHMEM
+# can be defined and made available to both without including sys/mman.h. But
+# for that we need constants that indicate the existence of the following defines.
+decl_checks += [
+ ['MADV_POPULATE_WRITE', 'sys/mman.h'],
+ ['MADV_REMOVE', 'sys/mman.h'],
+]
+
# Need to check for function declarations for these functions, because
# checking for library symbols wouldn't handle deployment target
# restrictions on macOS
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 2e3886cf9fe..8d859dfbbfb 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -589,6 +589,27 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
return true;
}
+/*
+ * Get the page size being used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been set up.
+ */
+Size
+GetOSPageSize(void)
+{
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ os_page_size = sysconf(_SC_PAGESIZE);
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
+
/*
* Creates an anonymous mmap()ed shared memory segment.
*
@@ -991,3 +1012,51 @@ PGSharedMemoryDetach(void)
AnonymousShmem = NULL;
}
}
+
+#ifdef HAVE_RESIZABLE_SHMEM
+/*
+ * Make sure that the memory of given size from the given address is released.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureFreed(void *addr, Size size)
+{
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only anonymous shared memory can be freed")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+
+ if (madvise(addr, size, MADV_REMOVE) == -1)
+ ereport(ERROR,
+ (errmsg("could not free shared memory: %m")));
+}
+
+/*
+ * Make sure that the memory of given size from the given address is allocated.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureAllocated(void *addr, Size size)
+{
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only anonymous shared memory can be allocated at runtime")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+
+ if (madvise(addr, size, MADV_POPULATE_WRITE) == -1)
+ ereport(ERROR,
+ (errmsg("could not allocate shared memory: %m")));
+}
+#endif /* HAVE_RESIZABLE_SHMEM */
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index 794e4fcb2ad..dc2ee018845 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -648,3 +648,26 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
}
return true;
}
+
+/*
+ * Get the page size used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been set up.
+ */
+Size
+GetOSPageSize(void)
+{
+ SYSTEM_INFO sysinfo;
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ GetSystemInfo(&sysinfo);
+ os_page_size = sysinfo.dwPageSize;
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 973811e545e..4eae14bede7 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,11 +19,11 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * This module provides facilities to allocate fixed-size structures in shared
- * memory, for things like variables shared between all backend processes.
- * Each such structure has a string name to identify it, specified when it is
- * requested. shmem_hash.c provides a shared hash table implementation on top
- * of that.
+ * This module provides facilities to allocate fixed-size as well as resizable
+ * structures in shared memory, for things like variables shared between all
+ * backend processes. Each such structure has a string name to identify it,
+ * specified when it is requested. shmem_hash.c provides a shared hash table
+ * implementation on top of fixed-size structures.
*
* Shared memory areas should usually not be allocated after postmaster
* startup, although we do allow small allocations later for the benefit of
@@ -106,6 +106,21 @@
* (*ShmemStructDesc->ptr), and calls the attach_fn callback, if any, for
* additional per-backend setup.
*
+ * Resizable shared memory structures
+ * ----------------------------------
+ *
+ * In order to allocate resizable shared memory structures, set
+ * ShmemRequestStructOpts::maximum_size to the maximum size that the structure
+ * can grow to. The address space for the maximum size will be reserved at
+ * startup, but memory is allocated or freed as the structure grows or shrinks
+ * respectively. ShmemRequestStructOpts::size should be set to the initial size
+ * of the structure, which is the amount of memory allocated at startup.
+ * After startup, the structure can be resized by calling ShmemResizeStruct()
+ * with the name of the structure and the new size.
+ *
+ * While resizable structures can be created after startup, the memory
+ * available for them is quite limited.
+ *
* Legacy ShmemInitStruct()/ShmemInitHash() functions
* --------------------------------------------------
*
@@ -170,6 +185,18 @@ typedef struct
ShmemRequestKind kind;
} ShmemRequest;
+/*
+ * A convenient macro to get the space required for a shmem request consistently.
+ * A resizable structure, requested by non-zero maximum_size, requires space for
+ * its maximum size.
+ */
+#ifdef HAVE_RESIZABLE_SHMEM
+#define SHMEM_REQUEST_SPACE_SIZE(request) \
+ ((request)->options->maximum_size > 0 ? (request)->options->maximum_size : (request)->options->size)
+#else
+#define SHMEM_REQUEST_SPACE_SIZE(request) ((request)->options->size)
+#endif
+
static List *pending_shmem_requests;
/*
@@ -272,6 +299,10 @@ typedef struct
void *location; /* location in shared mem */
Size size; /* # bytes requested for the structure */
Size allocated_size; /* # bytes actually allocated */
+#ifdef HAVE_RESIZABLE_SHMEM
+ Size maximum_size; /* the maximum size the structure can grow to */
+ Size reserved_space; /* the total address space reserved */
+#endif
} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
@@ -280,6 +311,9 @@ static bool firstNumaTouch = true;
static void CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks);
static void InitShmemIndexEntry(ShmemRequest *request);
static bool AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok);
+#ifdef HAVE_RESIZABLE_SHMEM
+static Size EstimateAllocatedSize(ShmemIndexEnt *entry);
+#endif
Datum pg_numa_available(PG_FUNCTION_ARGS);
@@ -350,6 +384,11 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
+#ifdef HAVE_RESIZABLE_SHMEM
+ if (options->maximum_size < 0 && options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
+ options->maximum_size, options->name);
+#endif
}
else
{
@@ -358,8 +397,24 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->size <= 0)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
+#ifdef HAVE_RESIZABLE_SHMEM
+ if (options->maximum_size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->maximum_size < 0)
+ elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
+ options->maximum_size, options->name);
+#endif
}
+#ifdef HAVE_RESIZABLE_SHMEM
+ if (options->maximum_size > 0 && options->size > options->maximum_size)
+ elog(ERROR, "resizable shared memory structure \"%s\" should have maximum size (%zd) greater than or equal to size (%zd)",
+ options->name, options->maximum_size, options->size);
+
+ if (options->maximum_size > 0 && shared_memory_type != SHMEM_TYPE_MMAP)
+ elog(ERROR, "resizable shared memory requires shared_memory_type = mmap");
+#endif
+
if (shmem_request_state != SRS_REQUESTING)
elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
@@ -379,8 +434,13 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
}
/*
- * ShmemGetRequestedSize() --- estimate the total size of all registered shared
- * memory structures.
+ * ShmemGetRequestedSize() --- estimate the total size of all registered shared
+ * memory structures.
+ *
+ * When maximum_size is specified while requesting a resizable shared memory
+ * structure, we use that, instead of the (initial) size, for the estimation,
+ * to ensure that enough space is reserved for growing the resizable structure
+ * to its maximum size.
*
* This is called once at postmaster startup, before the shared memory segment
* has been created.
@@ -398,7 +458,7 @@ ShmemGetRequestedSize(void)
/* memory needed for all the requested areas */
foreach_ptr(ShmemRequest, request, pending_shmem_requests)
{
- size = add_size(size, request->options->size);
+ size = add_size(size, SHMEM_REQUEST_SPACE_SIZE(request));
/* calculate alignment padding like ShmemAllocRaw() does */
size = TYPEALIGN(Max(request->options->alignment, PG_CACHE_LINE_SIZE),
size);
@@ -506,6 +566,7 @@ InitShmemIndexEntry(ShmemRequest *request)
ShmemIndexEnt *index_entry;
bool found;
size_t allocated_size;
+ size_t requested_size;
void *structPtr;
/* look it up in the shmem index */
@@ -523,10 +584,18 @@ InitShmemIndexEntry(ShmemRequest *request)
}
/*
- * We inserted the entry to the shared memory index. Allocate requested
- * amount of shared memory for it, and initialize the index entry.
+ * We inserted the entry to the shared memory index. Allocate requested
+ * amount of address space in the shared memory segment for it, and do
+ * basic initialization. The memory gets allocated during initialization as
+ * the corresponding memory pages are written to. Allocate enough space
+ * for a resizable structure to grow to its maximum size. It is expected
+ * that the initialization callback will use only as much memory as the
+ * initial size of the resizable structure. (Well, if it doesn't, more
+ * memory will be allocated initially than expected, no further harm is
+ * done.)
*/
- structPtr = ShmemAllocRaw(request->options->size,
+ requested_size = SHMEM_REQUEST_SPACE_SIZE(request);
+ structPtr = ShmemAllocRaw(requested_size,
request->options->alignment,
&allocated_size);
if (structPtr == NULL)
@@ -535,13 +604,22 @@ InitShmemIndexEntry(ShmemRequest *request)
hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
+ errmsg("not enough shared memory space for data structure"
" \"%s\" (%zu bytes requested)",
- name, request->options->size)));
+ name, requested_size)));
}
index_entry->size = request->options->size;
index_entry->allocated_size = allocated_size;
index_entry->location = structPtr;
+#ifdef HAVE_RESIZABLE_SHMEM
+ index_entry->reserved_space = allocated_size;
+ index_entry->maximum_size = request->options->maximum_size;
+ if (request->options->maximum_size > 0)
+ {
+ /* Adjust allocated size of a resizable structure. */
+ index_entry->allocated_size = EstimateAllocatedSize(index_entry);
+ }
+#endif
/* Initialize depending on the kind of shmem area it is */
switch (request->kind)
@@ -586,7 +664,7 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
return false;
}
- /* Check that the size in the index matches the request. */
+ /* Check that the sizes in the index match the request. */
if (index_entry->size != request->options->size &&
request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
{
@@ -596,6 +674,18 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
name, index_entry->size, request->options->size)));
}
+#ifdef HAVE_RESIZABLE_SHMEM
+ if (index_entry->maximum_size != request->options->maximum_size &&
+ request->options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" was created with" \
+ " different maximum_size: existing %zu, requested %zu",
+ name, index_entry->maximum_size,
+ request->options->maximum_size)));
+ }
+#endif
+
/*
* Re-establish the caller's pointer variable, or do other actions to
* attach depending on the kind of shmem area it is.
@@ -617,6 +707,115 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
return true;
}
+#ifdef HAVE_RESIZABLE_SHMEM
+/*
+ * Estimate the actual memory allocated for a resizable structure.
+ *
+ * ... based on the assumption that the memory is allocated in pages.
+ *
+ * The memory pages covered by the current size of a resizable structure are
+ * fully allocated when the currently allocated part of the structure is written
+ * to. The memory page where the maximal structure ends also hosts the next
+ * structure, unless the maximal structure ends on a page boundary. Hence that
+ * page is allocated when the next structure is written to. The memory pages
+ * between the page where the current structure ends and the page where the next
+ * structure starts remain unallocated. Thus the memory allocated for a
+ * resizable structure can be estimated as the total address space reserved for
+ * the structure minus the unallocated memory pages between the current end and
+ * the next structure.
+ */
+static Size
+EstimateAllocatedSize(ShmemIndexEnt *entry)
+{
+ Size page_size = GetOSPageSize();
+ char *align_end = (char *) TYPEALIGN(page_size, (char *) entry->location + entry->size);
+ char *floor_max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) entry->location + entry->maximum_size);
+
+ Assert(entry->maximum_size >= entry->size);
+ Assert(entry->reserved_space >= entry->maximum_size);
+
+ if (align_end < floor_max_end)
+ return entry->reserved_space - (floor_max_end - align_end);
+
+ return entry->reserved_space;
+}
+
+/*
+ * ShmemResizeStruct() --- resize a resizable shared memory structure.
+ *
+ * If the structure is being shrunk, the memory pages that are no longer needed
+ * are freed. If the structure is being expanded, the memory pages that are
+ * needed for the new size are allocated. See EstimateAllocatedSize() for
+ * explanation of which pages are allocated for a resizable structure.
+ */
+void
+ShmemResizeStruct(const char *name, Size new_size)
+{
+ ShmemIndexEnt *result;
+ bool found;
+ Size page_size = GetOSPageSize();
+ char *new_end;
+
+ Assert(new_size > 0);
+
+ /*
+ * Resizable shared memory structures are only supported with mmap'ed
+ * memory.
+ */
+ Assert(shared_memory_type == SHMEM_TYPE_MMAP);
+
+ /* look it up in the shmem index */
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ result = (ShmemIndexEnt *) hash_search(ShmemIndex, name, HASH_FIND, &found);
+ if (!found)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shmem struct \"%s\" is not initialized", name)));
+
+ Assert(result);
+
+ if (result->maximum_size <= 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shared memory struct \"%s\" is not resizable", name)));
+
+ if (result->maximum_size < new_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+ errmsg("not enough address space is reserved for resizing structure \"%s\""
+ " (required %zu bytes, reserved %zu bytes)",
+ name, new_size, result->maximum_size)));
+
+ /*
+ * When shrinking, the memory from the page-aligned new end to the start of
+ * the page containing the end of the reserved space is no longer required.
+ * When expanding, the memory from the start of the page containing the start
+ * of the structure to the page-aligned new end is required.
+ */
+ new_end = (char *) TYPEALIGN(page_size, (char *) result->location + new_size);
+ if (new_size < result->size)
+ {
+ char *max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location + result->maximum_size);
+
+ if (max_end > new_end)
+ PGSharedMemoryEnsureFreed(new_end, max_end - new_end);
+ }
+ else if (new_size > result->size)
+ {
+ char *struct_start = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location);
+
+ if (new_end > struct_start)
+ PGSharedMemoryEnsureAllocated(struct_start, new_end - struct_start);
+ }
+
+ /* Update shmem index entry. */
+ result->size = new_size;
+ result->allocated_size = EstimateAllocatedSize(result);
+
+ LWLockRelease(ShmemIndexLock);
+}
+#endif /* HAVE_RESIZABLE_SHMEM */
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
@@ -723,6 +922,10 @@ InitShmemAllocator(PGShmemHeader *seghdr)
Assert(!found);
result->size = ShmemAllocator->index_size;
result->allocated_size = ShmemAllocator->index_size;
+#ifdef HAVE_RESIZABLE_SHMEM
+ result->maximum_size = 0;
+ result->reserved_space = result->allocated_size;
+#endif
result->location = ShmemAllocator->index;
}
}
@@ -1064,7 +1267,7 @@ mul_size(Size s1, Size s2)
Datum
pg_get_shmem_allocations(PG_FUNCTION_ARGS)
{
-#define PG_GET_SHMEM_SIZES_COLS 4
+#define PG_GET_SHMEM_SIZES_COLS 6
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
HASH_SEQ_STATUS hstat;
ShmemIndexEnt *ent;
@@ -1086,7 +1289,23 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
values[1] = Int64GetDatum((char *) ent->location - (char *) ShmemSegHdr);
values[2] = Int64GetDatum(ent->size);
values[3] = Int64GetDatum(ent->allocated_size);
+#ifdef HAVE_RESIZABLE_SHMEM
+ values[4] = Int64GetDatum(ent->maximum_size);
+ values[5] = Int64GetDatum(ent->reserved_space);
+
+ /*
+ * Keep track of the total reserved space for named shmem areas, to be
+ * able to calculate the amount of shared memory allocated for
+ * anonymous areas and the amount of free shared memory at the end of
+ * the segment.
+ */
+ named_allocated += ent->reserved_space;
+#else
+ values[4] = Int64GetDatum(0);
+ values[5] = Int64GetDatum(ent->allocated_size);
+
named_allocated += ent->allocated_size;
+#endif
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
values, nulls);
@@ -1097,6 +1316,8 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = true;
values[2] = Int64GetDatum(ShmemAllocator->free_offset - named_allocated);
values[3] = values[2];
+ values[4] = Int64GetDatum(0);
+ values[5] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
/* output as-of-yet unused shared memory */
@@ -1105,6 +1326,8 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = false;
values[2] = Int64GetDatum(ShmemSegHdr->totalsize - ShmemAllocator->free_offset);
values[3] = values[2];
+ values[4] = Int64GetDatum(0);
+ values[5] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
LWLockRelease(ShmemIndexLock);
@@ -1292,23 +1515,9 @@ pg_get_shmem_allocations_numa(PG_FUNCTION_ARGS)
Size
pg_get_shmem_pagesize(void)
{
- Size os_page_size;
-#ifdef WIN32
- SYSTEM_INFO sysinfo;
-
- GetSystemInfo(&sysinfo);
- os_page_size = sysinfo.dwPageSize;
-#else
- os_page_size = sysconf(_SC_PAGESIZE);
-#endif
-
Assert(IsUnderPostmaster);
- Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
-
- if (huge_pages_status == HUGE_PAGES_ON)
- GetHugePageSize(&os_page_size, NULL);
- return os_page_size;
+ return GetOSPageSize();
}
Datum
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index a315c4ab8ab..b4d98a1f610 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -1211,6 +1211,13 @@
max => '1000.0',
},
+{ name => 'have_resizable_shmem', type => 'bool', context => 'PGC_INTERNAL', group => 'PRESET_OPTIONS',
+ short_desc => 'Shows whether the running server supports resizable shared memory.',
+ flags => 'GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE',
+ variable => 'have_resizable_shmem_enabled',
+ boot_val => 'HAVE_RESIZABLE_SHMEM_ENABLED',
+},
+
{ name => 'hba_file', type => 'string', context => 'PGC_POSTMASTER', group => 'FILE_LOCATIONS',
short_desc => 'Sets the server\'s "hba" configuration file.',
flags => 'GUC_SUPERUSER_ONLY',
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d9ca13baff9..6bb08dd10f1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -653,6 +653,13 @@ static bool assert_enabled = DEFAULT_ASSERT_ENABLED;
#endif
static bool exec_backend_enabled = EXEC_BACKEND_ENABLED;
+#ifdef HAVE_RESIZABLE_SHMEM
+#define HAVE_RESIZABLE_SHMEM_ENABLED true
+#else
+#define HAVE_RESIZABLE_SHMEM_ENABLED false
+#endif
+static bool have_resizable_shmem_enabled = HAVE_RESIZABLE_SHMEM_ENABLED;
+
static char *recovery_target_timeline_string;
static char *recovery_target_string;
static char *recovery_target_xid_string;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index bd177aebfcb..e575d70b572 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8664,8 +8664,8 @@
{ oid => '5052', descr => 'allocations from the main shared memory segment',
proname => 'pg_get_shmem_allocations', prorows => '50', proretset => 't',
provolatile => 'v', prorettype => 'record', proargtypes => '',
- proallargtypes => '{text,int8,int8,int8}', proargmodes => '{o,o,o,o}',
- proargnames => '{name,off,size,allocated_size}',
+ proallargtypes => '{text,int8,int8,int8,int8,int8}', proargmodes => '{o,o,o,o,o,o}',
+ proargnames => '{name,off,size,allocated_size,maximum_size,reserved_space}',
prosrc => 'pg_get_shmem_allocations',
proacl => '{POSTGRES=X,pg_read_all_stats=X}' },
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 9f6d512347e..8f2a59ec3a8 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -85,6 +85,14 @@
don't. */
#undef HAVE_DECL_F_FULLFSYNC
+/* Define to 1 if you have the declaration of `MADV_POPULATE_WRITE', and to 0
+ if you don't. */
+#undef HAVE_DECL_MADV_POPULATE_WRITE
+
+/* Define to 1 if you have the declaration of `MADV_REMOVE', and to 0 if you
+ don't. */
+#undef HAVE_DECL_MADV_REMOVE
+
/* Define to 1 if you have the declaration of `memset_s', and to 0 if you
don't. */
#undef HAVE_DECL_MEMSET_S
diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index 521b49b8888..b09d6c91324 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -131,6 +131,15 @@
#define EXEC_BACKEND
#endif
+/*
+ * HAVE_RESIZABLE_SHMEM indicates whether resizable shared memory structures are
+ * supported. The implementation requires Linux-specific madvise constants
+ * (MADV_REMOVE and MADV_POPULATE_WRITE).
+ */
+#if HAVE_DECL_MADV_REMOVE && HAVE_DECL_MADV_POPULATE_WRITE && !defined(EXEC_BACKEND)
+#define HAVE_RESIZABLE_SHMEM
+#endif
+
/*
* USE_POSIX_FADVISE controls whether Postgres will attempt to use the
* posix_fadvise() kernel call. Usually the automatic configure tests are
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index 10c7b065861..3d5aceba59c 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -89,6 +89,11 @@ extern PGShmemHeader *PGSharedMemoryCreate(Size size,
PGShmemHeader **shim);
extern bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2);
extern void PGSharedMemoryDetach(void);
+#ifdef HAVE_RESIZABLE_SHMEM
+extern void PGSharedMemoryEnsureFreed(void *addr, Size size);
+extern void PGSharedMemoryEnsureAllocated(void *addr, Size size);
+#endif
extern void GetHugePageSize(Size *hugepagesize, int *mmap_flags);
+extern Size GetOSPageSize(void);
#endif /* PG_SHMEM_H */
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 91218db6d6e..122bf7943ca 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -57,6 +57,18 @@ typedef struct ShmemStructOpts
*/
size_t alignment;
+#ifdef HAVE_RESIZABLE_SHMEM
+
+ /*
+ * Maximum size this structure can grow up to in the future. The memory is not
+ * allocated right away but the corresponding address space is reserved so
+ * that memory can be mapped to it when the structure grows. Typically
+ * should be used for large resizable structures which need several pages
+ * worth of contiguous memory.
+ */
+ ssize_t maximum_size;
+#endif
+
/*
* When the shmem area is initialized or attached to, pointer to it is
* stored in *ptr. It usually points to a global variable, used to access
@@ -166,6 +178,9 @@ typedef struct ShmemCallbacks
extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
extern bool ShmemAddrIsValid(const void *addr);
+#ifdef HAVE_RESIZABLE_SHMEM
+extern void ShmemResizeStruct(const char *name, Size new_size);
+#endif
/*
* These macros provide syntactic sugar for calling the underlying functions
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index f1b04c99969..2a1e746bf0c 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -14,6 +14,7 @@ SUBDIRS = \
libpq_pipeline \
oauth_validator \
plsample \
+ resizable_shmem \
spgist_name_ops \
test_aio \
test_binaryheap \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index fc99552d9ab..cd94e1fea15 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -13,6 +13,7 @@ subdir('libpq_pipeline')
subdir('nbtree')
subdir('oauth_validator')
subdir('plsample')
+subdir('resizable_shmem')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
diff --git a/src/test/modules/resizable_shmem/Makefile b/src/test/modules/resizable_shmem/Makefile
new file mode 100644
index 00000000000..86bf17bef4a
--- /dev/null
+++ b/src/test/modules/resizable_shmem/Makefile
@@ -0,0 +1,25 @@
+# src/test/modules/resizable_shmem/Makefile
+
+PGFILEDESC = "resizable_shmem - test module for resizable shared memory"
+
+MODULES = resizable_shmem
+
+EXTENSION = resizable_shmem
+DATA = resizable_shmem--1.0.sql
+
+TAP_TESTS = 1
+
+# This test requires the library to be loaded at server start, so disable
+# installcheck
+NO_INSTALLCHECK = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/resizable_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/resizable_shmem/meson.build b/src/test/modules/resizable_shmem/meson.build
new file mode 100644
index 00000000000..493bbbc95c3
--- /dev/null
+++ b/src/test/modules/resizable_shmem/meson.build
@@ -0,0 +1,36 @@
+# src/test/modules/resizable_shmem/meson.build
+
+resizable_shmem_sources = files(
+ 'resizable_shmem.c',
+)
+
+if host_system == 'windows'
+ resizable_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'resizable_shmem',
+ '--FILEDESC', 'resizable_shmem - test module for resizable shared memory',])
+endif
+
+resizable_shmem = shared_module('resizable_shmem',
+ resizable_shmem_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += resizable_shmem
+
+test_install_data += files(
+ 'resizable_shmem.control',
+ 'resizable_shmem--1.0.sql',
+)
+
+tests += {
+ 'name': 'resizable_shmem',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_resizable_shmem.pl',
+ ],
+ # This test requires library to be loaded at the server start, so disable
+ # installcheck
+ 'runningcheck': false,
+ },
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
new file mode 100644
index 00000000000..c1bcb6117b6
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
@@ -0,0 +1,37 @@
+/* src/test/modules/resizable_shmem/resizable_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION resizable_shmem" to load this file. \quit
+
+-- Function to resize the test structure in the shared memory
+CREATE FUNCTION resizable_shmem_resize(new_entries integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to write data to all entries in the test structure in shared memory
+-- Writing all the entries makes sure that the memory is actually allocated and
+-- mapped to the process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_write(entry_value integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to verify that specified number of initial entries have expected value.
+-- Reading all the entries makes sure that the memory is actually mapped to the
+-- process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_read(entry_count integer, entry_value integer)
+RETURNS boolean
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to report memory usage statistics of the calling backend
+CREATE FUNCTION resizable_shmem_usage(OUT rss_anon bigint, OUT rss_file bigint, OUT rss_shmem bigint, OUT vm_size bigint)
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to get the shared memory page size
+CREATE FUNCTION resizable_shmem_pagesize()
+RETURNS integer
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.c b/src/test/modules/resizable_shmem/resizable_shmem.c
new file mode 100644
index 00000000000..970ea75c0de
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.c
@@ -0,0 +1,332 @@
+/* -------------------------------------------------------------------------
+ *
+ * resizable_shmem.c
+ * Test module for PostgreSQL's resizable shared memory functionality
+ *
+ * This module demonstrates and tests the resizable shared memory API
+ * provided by shmem.c/shmem.h.
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "commands/extension.h"
+#include "fmgr.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+#include "storage/spin.h"
+#include "utils/builtins.h"
+#include "utils/guc.h"
+#include "utils/memutils.h"
+#include "utils/timestamp.h"
+#include "access/htup_details.h"
+
+#include <stdio.h>
+
+PG_MODULE_MAGIC;
+
+/* Default values for the GUCs controlling structure size */
+#define TEST_INITIAL_ENTRIES_DEFAULT (25 * 1024 * 1024) /* ~100MB */
+#define TEST_MAX_ENTRIES_DEFAULT (100 * 1024 * 1024) /* ~400MB */
+
+#define TEST_ENTRY_SIZE sizeof(int32) /* Size of each entry */
+
+/*
+ * Resizable test data structure stored in shared memory.
+ *
+ * The test performs resizing, reads or writes, only one at a time and never
+ * concurrently. Hence, there is no need for locks in the test structure.
+ */
+typedef struct TestResizableShmemStruct
+{
+ /* Metadata */
+ int32 num_entries; /* Number of entries that can fit */
+
+ /* Data area - variable size */
+ int32 data[FLEXIBLE_ARRAY_MEMBER];
+} TestResizableShmemStruct;
+
+static TestResizableShmemStruct *resizable_shmem = NULL;
+
+/* GUC variables controlling the size of the test structure */
+static int test_initial_entries;
+static int test_max_entries;
+
+/* Whether to use SHMEM_ATTACH_UNKNOWN_SIZE when attaching to the shared memory */
+static bool use_unknown_size = false;
+
+static void resizable_shmem_request(void *arg);
+static void resizable_shmem_shmem_init(void *arg);
+
+static ShmemCallbacks shmem_callbacks = {
+ .request_fn = resizable_shmem_request,
+ .init_fn = resizable_shmem_shmem_init,
+};
+
+/* SQL-callable functions */
+PG_FUNCTION_INFO_V1(resizable_shmem_resize);
+PG_FUNCTION_INFO_V1(resizable_shmem_write);
+PG_FUNCTION_INFO_V1(resizable_shmem_read);
+PG_FUNCTION_INFO_V1(resizable_shmem_usage);
+PG_FUNCTION_INFO_V1(resizable_shmem_pagesize);
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+ int guc_context;
+
+ /*
+ * Use PGC_POSTMASTER when loaded at startup so the values are fixed once
+ * the shared memory segment is created. When loaded after startup
+ * PGC_POSTMASTER is not allowed, so we use PGC_SIGHUP instead. Although
+ * we do not intend to change these values at config reload, PGC_SIGHUP is
+ * the least permissive context that allows defining the GUC after startup
+ * and still prevents it from being changed via SET.
+ */
+ if (process_shared_preload_libraries_in_progress)
+ guc_context = PGC_POSTMASTER;
+ else
+ {
+ guc_context = PGC_SIGHUP;
+ shmem_callbacks.flags = SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP;
+ }
+
+ DefineCustomIntVariable("resizable_shmem.initial_entries",
+ "Initial number of entries in the test structure.",
+ NULL,
+ &test_initial_entries,
+ TEST_INITIAL_ENTRIES_DEFAULT,
+ 1,
+ INT_MAX,
+ guc_context,
+ 0,
+ NULL, NULL, NULL);
+
+ DefineCustomIntVariable("resizable_shmem.max_entries",
+ "Maximum number of entries in the test structure.",
+ NULL,
+ &test_max_entries,
+ TEST_MAX_ENTRIES_DEFAULT,
+ 1,
+ INT_MAX,
+ guc_context,
+ 0,
+ NULL, NULL, NULL);
+
+ /*
+ * When loaded after startup by a backend that is not creating the
+ * extension, the shared memory might have been resized to a size other
+ * than the initial size. Use SHMEM_ATTACH_UNKNOWN_SIZE to attach without
+ * knowing the exact size.
+ */
+ if (!process_shared_preload_libraries_in_progress && !creating_extension)
+ use_unknown_size = true;
+
+ RegisterShmemCallbacks(&shmem_callbacks);
+}
+
+/*
+ * Request shared memory resources
+ */
+static void
+resizable_shmem_request(void *arg)
+{
+ Size initial_size = add_size(offsetof(TestResizableShmemStruct, data),
+ mul_size(test_initial_entries, TEST_ENTRY_SIZE));
+#ifdef HAVE_RESIZABLE_SHMEM
+ Size max_size = add_size(offsetof(TestResizableShmemStruct, data),
+ mul_size(test_max_entries, TEST_ENTRY_SIZE));
+
+ /* A preprocessor macro to conditionally include the maximum_size field. */
+#define MAXIMUM_SIZE_ARG .maximum_size = max_size,
+#else
+#define MAXIMUM_SIZE_ARG
+#endif
+
+ /* Register our resizable shared memory structure */
+ ShmemRequestStruct(.name = "resizable_shmem",
+ .size = use_unknown_size ? SHMEM_ATTACH_UNKNOWN_SIZE : initial_size,
+ MAXIMUM_SIZE_ARG
+ .ptr = (void **) &resizable_shmem,
+ );
+}
+
+/*
+ * Initialize shared memory structure
+ */
+static void
+resizable_shmem_shmem_init(void *arg)
+{
+ /*
+ * Shared memory structure should have been already allocated. Initialize
+ * it.
+ */
+ Assert(resizable_shmem != NULL);
+
+ resizable_shmem->num_entries = test_initial_entries;
+ memset(resizable_shmem->data, 0, mul_size(test_initial_entries, TEST_ENTRY_SIZE));
+}
+
+/*
+ * Resize the shared memory structure to accommodate the specified number of
+ * entries.
+ */
+Datum
+resizable_shmem_resize(PG_FUNCTION_ARGS)
+{
+#ifdef HAVE_RESIZABLE_SHMEM
+ int32 new_entries = PG_GETARG_INT32(0);
+ Size new_size;
+
+ if (!resizable_shmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ new_size = add_size(offsetof(TestResizableShmemStruct, data),
+ mul_size(new_entries, TEST_ENTRY_SIZE));
+ ShmemResizeStruct("resizable_shmem", new_size);
+ resizable_shmem->num_entries = new_entries;
+
+ PG_RETURN_VOID();
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#endif
+}
+
+/*
+ * Write the given integer value to all entries in the data array.
+ */
+Datum
+resizable_shmem_write(PG_FUNCTION_ARGS)
+{
+ int32 entry_value = PG_GETARG_INT32(0);
+ int32 i;
+
+ if (!resizable_shmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ /* Write the value to all current entries */
+ for (i = 0; i < resizable_shmem->num_entries; i++)
+ resizable_shmem->data[i] = entry_value;
+
+ PG_RETURN_VOID();
+}
+
+/*
+ * Check whether the first 'entry_count' entries all have the expected 'entry_value'.
+ * Returns true if all match, false otherwise.
+ */
+Datum
+resizable_shmem_read(PG_FUNCTION_ARGS)
+{
+ int32 entry_count = PG_GETARG_INT32(0);
+ int32 entry_value = PG_GETARG_INT32(1);
+ int32 i;
+
+ if (resizable_shmem == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ if (entry_count < 0 || entry_count > resizable_shmem->num_entries)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("entry_count %d is out of range (0..%d)", entry_count, resizable_shmem->num_entries)));
+
+ for (i = 0; i < entry_count; i++)
+ {
+ if (resizable_shmem->data[i] != entry_value)
+ PG_RETURN_BOOL(false);
+ }
+
+ PG_RETURN_BOOL(true);
+}
+
+/*
+ * Report multiple memory usage statistics of the calling backend process
+ * as reported by the kernel.
+ * Returns RssAnon, RssFile, RssShmem, VmSize from /proc/self/status as a record.
+ *
+ * The function assumes that these values are available in
+ * /proc/self/status on any system that also supports madvise() with
+ * MADV_REMOVE and MADV_POPULATE_WRITE.
+ */
+Datum
+resizable_shmem_usage(PG_FUNCTION_ARGS)
+{
+#ifdef HAVE_RESIZABLE_SHMEM
+ FILE *f;
+ char line[256];
+ int64 rss_anon_kb = -1;
+ int64 rss_file_kb = -1;
+ int64 rss_shmem_kb = -1;
+ int64 vm_size_kb = -1;
+ int found = 0;
+ TupleDesc tupdesc;
+ Datum values[4];
+ bool nulls[4];
+ HeapTuple tuple;
+
+ /* Open /proc/self/status to read memory information */
+ f = fopen("/proc/self/status", "r");
+ if (f == NULL)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open /proc/self/status: %m")));
+
+ /* Look for the memory usage lines */
+ while (fgets(line, sizeof(line), f) != NULL && found < 4)
+ {
+ if (rss_anon_kb == -1 && sscanf(line, "RssAnon: %" SCNd64 " kB", &rss_anon_kb) == 1)
+ found++;
+ else if (rss_file_kb == -1 && sscanf(line, "RssFile: %" SCNd64 " kB", &rss_file_kb) == 1)
+ found++;
+ else if (rss_shmem_kb == -1 && sscanf(line, "RssShmem: %" SCNd64 " kB", &rss_shmem_kb) == 1)
+ found++;
+ else if (vm_size_kb == -1 && sscanf(line, "VmSize: %" SCNd64 " kB", &vm_size_kb) == 1)
+ found++;
+ }
+
+ fclose(f);
+
+ /* Build tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept a record")));
+
+ /* Build the result tuple */
+ values[0] = Int64GetDatum(rss_anon_kb >= 0 ? rss_anon_kb * 1024 : 0);
+ values[1] = Int64GetDatum(rss_file_kb >= 0 ? rss_file_kb * 1024 : 0);
+ values[2] = Int64GetDatum(rss_shmem_kb >= 0 ? rss_shmem_kb * 1024 : 0);
+ values[3] = Int64GetDatum(vm_size_kb >= 0 ? vm_size_kb * 1024 : 0);
+
+ nulls[0] = nulls[1] = nulls[2] = nulls[3] = false;
+
+ tuple = heap_form_tuple(tupdesc, values, nulls);
+ PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#endif
+}
+
+/*
+ * resizable_shmem_pagesize() - Get the shared memory page size
+ */
+Datum
+resizable_shmem_pagesize(PG_FUNCTION_ARGS)
+{
+ PG_RETURN_INT32(pg_get_shmem_pagesize());
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.control b/src/test/modules/resizable_shmem/resizable_shmem.control
new file mode 100644
index 00000000000..8031303fe0e
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.control
@@ -0,0 +1,4 @@
+# resizable_shmem extension test module
+comment = 'test module for testing resizable shared memory structure functionality'
+default_version = '1.0'
+module_pathname = '$libdir/resizable_shmem'
diff --git a/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
new file mode 100644
index 00000000000..6d45b1eccdc
--- /dev/null
+++ b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
@@ -0,0 +1,239 @@
+# Copyright (c) 2025-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Test resizable shared memory functionality, both when loaded at startup via
+# shared_preload_libraries and when loaded after startup (late allocation).
+
+# Verify that RssShmem does not exceed the total allocated shared memory.
+# Allocated shared memory should be mostly the memory allocated to the
+# resizable_shmem structure, so any large unexpected increase in RssShmem
+# should reflect an unexpected increase in the memory allocated to the
+# resizable_shmem structure.
+sub check_shmem_usage
+{
+ my ($session, $label, $node) = @_;
+
+ my $rss_shmem = $session->query_safe('SELECT rss_shmem FROM resizable_shmem_usage();',
+ verbose => 0);
+ my $total_alloc = $node->safe_psql('postgres',
+ "SELECT sum(allocated_size) FROM pg_shmem_allocations;");
+
+ note "$label: RssShmem=$rss_shmem, sum(allocated_size)=$total_alloc";
+ ok($rss_shmem <= $total_alloc, "$label: RssShmem does not exceed total allocated size");
+}
+
+# Test a resize operation: resize, verify old data, write new data, verify
+# new data, and check shmem usage. Returns updated ($num_entries, $value).
+sub test_resize
+{
+ my ($node, $prefix, $old_num_entries, $old_value, $new_num_entries, $new_value, $label) = @_;
+
+ $label = "$prefix: $label";
+
+ my $session1 = $node->background_psql('postgres');
+ my $session2 = $node->background_psql('postgres');
+
+ $session1->query_safe("SELECT resizable_shmem_resize($new_num_entries);",
+ verbose => 0);
+
+ # Old data should still be intact in the (possibly smaller) area
+ my $readable_entries = ($new_num_entries < $old_num_entries) ? $new_num_entries : $old_num_entries;
+ is($session1->query_safe("SELECT resizable_shmem_read($readable_entries, $old_value);",
+ verbose => 0),
+ 't', "old data readable after $label");
+
+ $session2->query_safe("SELECT resizable_shmem_write($new_value);",
+ verbose => 0);
+ is($session1->query_safe("SELECT resizable_shmem_read($new_num_entries, $new_value);",
+ verbose => 0),
+ 't', "new data readable after $label");
+
+ check_shmem_usage($session1, "$label (session 1)", $node);
+ check_shmem_usage($session2, "$label (session 2)", $node);
+
+ $session1->quit;
+ $session2->quit;
+
+ return ($new_num_entries, $new_value);
+}
+
+# Run the full suite of resizable shared memory tests on the given node.
+sub run_resizable_tests
+{
+ my ($node, $initial_entries, $max_entries, $prefix) = @_;
+
+ my $have_resizable_shmem = $node->safe_psql('postgres', 'SHOW have_resizable_shmem;') eq 'on';
+
+ my $num_entries = $initial_entries;
+
+ # Basic read/write should work on all platforms
+ my $value = 100;
+ $node->safe_psql('postgres', "SELECT resizable_shmem_write($value);");
+ is($node->safe_psql('postgres', "SELECT resizable_shmem_read($num_entries, $value);"),
+ 't', "$prefix: data read after write successful");
+
+ if ($have_resizable_shmem)
+ {
+ # Initial structure state
+ my $session1 = $node->background_psql('postgres');
+ my $session2 = $node->background_psql('postgres');
+
+ $value = 100;
+ # Write and read the initial set of entries.
+ $session1->query_safe("SELECT resizable_shmem_write($value);", verbose => 0);
+ is($session2->query_safe("SELECT resizable_shmem_read($num_entries, $value);",
+ verbose => 0),
+ 't', "$prefix: data read after write successful");
+ check_shmem_usage($session1, "$prefix: initial write (session 1)", $node);
+ check_shmem_usage($session2, "$prefix: initial write (session 2)", $node);
+ $session1->quit;
+ $session2->quit;
+
+ # Verify no other structure is resizable
+ is($node->safe_psql('postgres', "SELECT count(*) FROM pg_shmem_allocations WHERE name <> 'resizable_shmem' AND maximum_size <> 0;"),
+ '0', "$prefix: no other resizable structures");
+
+ # Resize to maximum
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $max_entries, 500, 'resize to maximum');
+
+ # Shrink to 75% of max
+ my $shrink_entries = int($max_entries * 3 / 4);
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $shrink_entries, 999, 'shrinking');
+
+ # Resize to the same size (no-op)
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $num_entries, 1999, 'no-op resize');
+
+ # Test resize failure (attempt to resize beyond max - should fail)
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', "SELECT resizable_shmem_resize(" . ($max_entries * 2) . ");");
+ ok($ret != 0 || $stderr =~ /ERROR/, "$prefix: Resize beyond maximum fails");
+ }
+ else
+ {
+ # On unsupported platforms, resizing should fail with a clear error
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', "SELECT resizable_shmem_resize($num_entries);");
+ ok($ret != 0, "$prefix: resize fails on unsupported platform");
+ like($stderr, qr/not supported/, "$prefix: resize error mentions not supported");
+ }
+}
+
+### Set up a test node.
+#
+# Configure minimal shared memory so that the resizable_shmem structure dominates
+# and any unexpected increase is easy to detect.
+#
+# Also disable huge pages so that RssShmem and allocated_size are comparable.
+# The latter is already aligned to the default page size.
+###
+my $node = PostgreSQL::Test::Cluster->new('resizable_shmem');
+$node->init;
+
+$node->append_conf('postgresql.conf', 'huge_pages = off');
+$node->append_conf('postgresql.conf', 'shared_buffers = 128kB');
+$node->append_conf('postgresql.conf', 'max_connections = 5');
+$node->append_conf('postgresql.conf', 'max_worker_processes = 0');
+$node->append_conf('postgresql.conf', 'max_wal_senders = 0');
+$node->append_conf('postgresql.conf', 'max_prepared_transactions = 0');
+$node->append_conf('postgresql.conf', 'max_locks_per_transaction = 10');
+$node->append_conf('postgresql.conf', 'max_pred_locks_per_transaction = 10');
+$node->append_conf('postgresql.conf', 'wal_buffers = 32kB');
+
+###
+# Test 1: Startup allocation via shared_preload_libraries
+###
+my $startup_initial = 25 * 1024 * 1024;
+my $startup_max = 100 * 1024 * 1024;
+
+$node->append_conf('postgresql.conf', 'shared_preload_libraries = resizable_shmem');
+$node->append_conf('postgresql.conf', "resizable_shmem.initial_entries = $startup_initial");
+$node->append_conf('postgresql.conf', "resizable_shmem.max_entries = $startup_max");
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+run_resizable_tests($node, $startup_initial, $startup_max, 'startup');
+
+my $have_resizable_shmem = $node->safe_psql('postgres', 'SHOW have_resizable_shmem;') eq 'on';
+
+###
+# Test 2: Late allocation (loaded after startup, not in shared_preload_libraries).
+# Use much smaller sizes since only ~100KB of shared memory is available for
+# structures allocated after startup.
+###
+my $late_initial = 5 * 1024;
+my $late_max = 12 * 1024;
+
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM RESET shared_preload_libraries;
+ ALTER SYSTEM SET resizable_shmem.initial_entries = $late_initial;
+ ALTER SYSTEM SET resizable_shmem.max_entries = $late_max;
+});
+$node->safe_psql('postgres', 'DROP EXTENSION resizable_shmem;');
+$node->restart;
+
+$node->safe_psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+run_resizable_tests($node, $late_initial, $late_max, 'late');
+
+###
+# Test that sysv shared memory does not support resizable shmem. Only relevant on
+# platforms that support resizable shmem (HAVE_RESIZABLE_SHMEM), since the
+# module only sets maximum_size in that case.
+###
+if ($have_resizable_shmem)
+{
+ ###
+ # Test 3: Verify that CREATE EXTENSION fails with sysv shared memory
+ # when loaded after startup (not in shared_preload_libraries).
+ ###
+ $node->safe_psql('postgres', 'DROP EXTENSION resizable_shmem;');
+
+ # Remove settings that would cause the library to auto-load at startup:
+ # shared_preload_libraries and module-prefixed GUCs. ALTER SYSTEM RESET
+ # only affects postgresql.auto.conf, so we must use adjust_conf to remove
+ # from postgresql.conf.
+ $node->adjust_conf('postgresql.conf', 'shared_preload_libraries', undef);
+ $node->adjust_conf('postgresql.conf', 'resizable_shmem.initial_entries', undef);
+ $node->adjust_conf('postgresql.conf', 'resizable_shmem.max_entries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'shared_preload_libraries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'resizable_shmem.initial_entries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'resizable_shmem.max_entries', undef);
+ $node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET shared_memory_type = 'sysv';
+ });
+
+ $node->restart;
+
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+ ok($ret != 0, 'CREATE EXTENSION fails with resizable shmem on sysv');
+ like($stderr, qr/resizable shared memory requires shared_memory_type = mmap/,
+ 'CREATE EXTENSION error mentions shared_memory_type = mmap requirement');
+
+ ###
+ # Test 4: Verify that resizable structures are also rejected with sysv
+ # shared memory when loaded at startup via shared_preload_libraries.
+ ###
+ $node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET shared_preload_libraries = 'resizable_shmem';
+ ALTER SYSTEM SET resizable_shmem.initial_entries = $startup_initial;
+ ALTER SYSTEM SET resizable_shmem.max_entries = $startup_max;
+ });
+ $node->stop;
+
+ ok(!$node->start(fail_ok => 1),
+ 'server fails to start with resizable shmem on sysv');
+
+ my $log = slurp_file($node->logfile);
+ like($log, qr/resizable shared memory requires shared_memory_type = mmap/,
+ 'log mentions shared_memory_type = mmap requirement');
+}
+
+done_testing();
diff --git a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
index c154f57682a..c89b140871f 100644
--- a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
+++ b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
@@ -45,5 +45,28 @@ else
ok($attach_count1 == 0 && $attach_count2 == 0, "attach callback is not called when loaded via shared_preload_libraries");
}
+###
+# Test that a fixed-size shared memory structure cannot be resized.
+# Only relevant on platforms that support resizable shmem.
+###
+my $have_resizable_shmem =
+ $node->safe_psql('postgres', 'SHOW have_resizable_shmem;') eq 'on';
+
+if ($have_resizable_shmem)
+{
+ # Try expanding the fixed-size structure
+ my ($ret, $stdout, $stderr) =
+ $node->psql("postgres", "SELECT test_shmem_resize_fixed(1000);");
+ isnt($ret, 0, "expanding a fixed-size structure fails");
+ like($stderr, qr/is not resizable/, "expand error message mentions not resizable");
+
+ # Try shrinking the fixed-size structure
+ ($ret, $stdout, $stderr) =
+ $node->psql("postgres", "SELECT test_shmem_resize_fixed(1);");
+ isnt($ret, 0, "shrinking a fixed-size structure fails");
+ like($stderr, qr/is not resizable/, "shrink error message mentions not resizable");
+}
+
$node->stop;
+
done_testing();
diff --git a/src/test/modules/test_shmem/test_shmem--1.0.sql b/src/test/modules/test_shmem/test_shmem--1.0.sql
index 2d01fd9256c..e169d0d7733 100644
--- a/src/test/modules/test_shmem/test_shmem--1.0.sql
+++ b/src/test/modules/test_shmem/test_shmem--1.0.sql
@@ -7,3 +7,7 @@
CREATE FUNCTION get_test_shmem_attach_count()
RETURNS pg_catalog.int4 STRICT
AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION test_shmem_resize_fixed(pg_catalog.int4)
+RETURNS pg_catalog.void STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
index 9bd4012b435..fc2fd67887f 100644
--- a/src/test/modules/test_shmem/test_shmem.c
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -99,3 +99,23 @@ get_test_shmem_attach_count(PG_FUNCTION_ARGS)
elog(ERROR, "shmem area not yet initialized");
PG_RETURN_INT32(TestShmem->attach_count);
}
+
+/*
+ * Attempt to resize the fixed-size shared memory structure. This should
+ * fail because the structure was not allocated with a maximum_size.
+ */
+PG_FUNCTION_INFO_V1(test_shmem_resize_fixed);
+Datum
+test_shmem_resize_fixed(PG_FUNCTION_ARGS)
+{
+#ifdef HAVE_RESIZABLE_SHMEM
+ int32 new_size = PG_GETARG_INT32(0);
+
+ ShmemResizeStruct("test_shmem area", new_size);
+ PG_RETURN_VOID();
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#endif
+}
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 81a73c426d2..2bbbf48c96a 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1770,8 +1770,10 @@ pg_shadow| SELECT pg_authid.rolname AS usename,
pg_shmem_allocations| SELECT name,
off,
size,
- allocated_size
- FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size);
+ allocated_size,
+ maximum_size,
+ reserved_space
+ FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size, maximum_size, reserved_space);
pg_shmem_allocations_numa| SELECT name,
numa_node,
size
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 3c35097361d..81bf12cbc1a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3146,6 +3146,7 @@ TestDSMRegistryHashEntry
TestDSMRegistryStruct
TestDecodingData
TestDecodingTxnData
+TestResizableShmemStruct
TestShmemData
TestSpec
TestValueType
--
2.34.1
^ permalink raw reply [nested|flat] 75+ messages in thread
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 14:08 Ashutosh Bapat <[email protected]>
parent: Ashutosh Bapat <[email protected]>
1 sibling, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-05 14:08 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sun, Apr 5, 2026 at 11:18 AM Ashutosh Bapat
<[email protected]> wrote:
>
>
> I will post my resizable shmem structures patch in a separate email in
> this thread but continue to review your patches.
I reviewed the SLRU patch. This is the first time I am looking at SLRU
code, so my review may not be sufficient. As far as I understand, the
patch faithfully copies the functionality from the old system to the
new system. I didn't find any issues there.
I think calls to SimpleLruRequest() read much better than SimpleLruInit().
Both MultiXactShmemInit and MultiXactShmemAttach set
OldestMemberMXactId and OldestVisibleMXactId. If we add another global
variable pointing into shared memory in the future, somebody needs to
remember to initialize it in both of these functions. Maybe
deduplicate it with something like the attached? The same applies to
the PredicateLock-related changes.
shmem_slru_init() and shmem_slru_attach() also have the following
duplicated lines, which can be deduplicated in a similar fashion.
desc->shared = shared;
desc->nbanks = nbanks;
memcpy(&desc->options, options, sizeof(SlruOpts));
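To illustrate the deduplication suggested here, a minimal sketch follows. It is not the patch's code: the Demo* types and demo_* functions are hypothetical stand-ins for the real SLRU descriptor and for shmem_slru_init()/shmem_slru_attach(), whose actual signatures may differ. The point is only that the backend-local assignments live in one helper that both paths call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the real SLRU types; fields are illustrative. */
typedef struct DemoSlruOpts
{
	const char *name;
	int			nslots;
} DemoSlruOpts;

typedef struct DemoSlruShared
{
	int			latest_page;
} DemoSlruShared;

typedef struct DemoSlruDesc
{
	DemoSlruShared *shared;
	int			nbanks;
	DemoSlruOpts options;
} DemoSlruDesc;

/* Backend-local setup needed by both the init and the attach path. */
static void
demo_slru_set_desc(DemoSlruDesc *desc, DemoSlruShared *shared,
				   int nbanks, const DemoSlruOpts *options)
{
	assert(desc != NULL && shared != NULL && options != NULL);
	desc->shared = shared;
	desc->nbanks = nbanks;
	memcpy(&desc->options, options, sizeof(DemoSlruOpts));
}

/* Init path: initialize the shared state, then do the local setup. */
static void
demo_slru_init(DemoSlruDesc *desc, DemoSlruShared *shared,
			   int nbanks, const DemoSlruOpts *options)
{
	shared->latest_page = 0;
	demo_slru_set_desc(desc, shared, nbanks, options);
}

/* Attach path: shared state already exists, only do the local setup. */
static void
demo_slru_attach(DemoSlruDesc *desc, DemoSlruShared *shared,
				 int nbanks, const DemoSlruOpts *options)
{
	demo_slru_set_desc(desc, shared, nbanks, options);
}
```

With this shape, a new backend-local field only has to be set in demo_slru_set_desc(), so the init and attach paths cannot drift apart.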
Including "access/slru.h" in shmem.h creates a circular inclusion. I am
wondering whether we need to create shmem_slru.h, like shmem_hash.h, to
handle the shared memory APIs related to SLRU. Given that SLRU also has a
disk component, the bifurcation may not be straightforward. I haven't
looked into this aspect in detail.
--
Best Wishes,
Ashutosh Bapat
Attachments:
[application/octet-stream] 0014_edits.diff.nocibot (696B, 2-0014_edits.diff.nocibot)
^ permalink raw reply [nested|flat] 75+ messages in thread
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 14:16 Ashutosh Bapat <[email protected]>
parent: Ashutosh Bapat <[email protected]>
1 sibling, 0 replies; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-05 14:16 UTC (permalink / raw)
To: Matthias van de Meent <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sun, Apr 5, 2026 at 4:50 PM Ashutosh Bapat
<[email protected]> wrote:
>
> 3. The test fails on one machine because RssShmem is consistently 8MB
> higher than the allocated_size in all cases. I guess it is because of
> huge page setting. Adding huge_pages = off to the test configuration.
> I think the test can not rely on huge pages anyway since
> allocated_size isn't aligned to huge page size.
Turning huge_pages = off didn't help. The test creates a resizable
shared memory structure that is hundreds of MB in size and adjusts
GUCs so that as little other shared memory as possible is allocated.
This way the resizable structure dominates the shared memory segment.
Any small
variations in RssShmem because of parts of shared memory not accessed
by a backend can be ignored. Then it expects that the RssShmem of a
backend <= sum(allocated_size) from pg_shmem_allocations; usually
sum(allocated_size) - RssShmem ~= 2MB. That's not accurate but it's
the closest I can get to make sure that we do not over allocate memory
for resizable shared structures. There is something on
https://cirrus-ci.com/task/5501660157444096 which is mapping shared
memory worth 10MB, other than the main shared memory segment,
consistently in all the backends. Because of that RssShmem -
sum(allocated_size) is consistently ~= 8MB. I am not able to figure
out where that 10MB is coming from. If we could know that, we could
either disable the test on that machine or disable that allocation. On
all other CFBot VMs, the test is passing, including the platforms
where the feature is not supported.
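The check described above boils down to two steps: extract RssShmem from text in the /proc/self/status format, then compare it with the summed allocation. A rough standalone sketch of that logic, not the test module's code, could look like the following. The demo_* names are hypothetical, and the input is passed as a string so that the sketch does not depend on /proc being available.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan text in /proc/self/status format for the "RssShmem:" line.
 * Returns the value in bytes, or -1 if the line is not present.
 */
static int64_t
demo_parse_rss_shmem(const char *status_text)
{
	const char *line = status_text;

	assert(status_text != NULL);

	while (line != NULL && *line != '\0')
	{
		int64_t		kb;

		/* SCNd64 matches int64_t regardless of the width of long. */
		if (sscanf(line, "RssShmem: %" SCNd64 " kB", &kb) == 1)
			return kb * 1024;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line != NULL)
			line++;
	}

	return -1;
}

/* The test's invariant: resident shared memory <= total allocated. */
static bool
demo_rss_within_allocation(int64_t rss_shmem, int64_t total_allocated)
{
	return rss_shmem >= 0 && rss_shmem <= total_allocated;
}
```

On the failing machine the second function would return false, because the extra ~10MB mapping pushes RssShmem above sum(allocated_size) even though the resizable structure itself is not over-allocated.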
--
Best Wishes,
Ashutosh Bapat
^ permalink raw reply [nested|flat] 75+ messages in thread
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 15:07 Heikki Linnakangas <[email protected]>
parent: Matthias van de Meent <[email protected]>
2 siblings, 1 reply; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-05 15:07 UTC (permalink / raw)
To: Matthias van de Meent <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 05/04/2026 02:17, Matthias van de Meent wrote:
> Formatting:
>> + ShmemRequestHash(.name = "pg_stat_statements hash",
>> + .nelems = pgss_max,
>> + .hash_info.keysize = sizeof(pgssHashKey),
>> + .hash_info.entrysize = sizeof(pgssEntry),
>> + .hash_flags = HASH_ELEM | HASH_BLOBS,
>> + .ptr = &pgss_hash,
>> + );
> (note that additional unit of indentation for the closing bracket)
>
> Is this malformatting caused by pgindent? If so, could you see if
> there's a better way of defining ShmemRequestHash/Struct that doesn't
> have this indent as output?
Yeah, that's pgindent. Matter of taste, but I think that looks fine. An
alternative is to put the closing bracket on the same line with the last
argument and drop the trailing comma:
ShmemRequestStruct(.name = "pg_stat_statements",
.size = sizeof(pgssSharedState),
.ptr = (void **) &pgss);
That looks OK to me too.
>> + pgss->extent = 0;
>> + pgss->n_writers = 0;
>> + pgss->gc_count = 0;
>> + pgss->stats.dealloc = 0;
>
> Shmem is said to be zero-initialized, should we remove the manual
> zero-initialization?
>
>> + on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
> See my upthread comment about adding optional on_shmem_exit callbacks
> to ShmemCallbacks.
>
> 0005: LGTM
>
> 0006: I don't think it is a great idea to make the LwLock machinery
> the first to get allocation requests:
> It has the RequestNamedLWLockTranche infrastructure, which can only
> register new requests while process_shmem_requests_in_progress, and
> making it request its memory ahead of everything else is likely to
> cause an undersized tranche to be allocated.
Good catch. I think the easiest fix is to call process_shmem_requests()
before ShmemCallRequestCallbacks() at postmaster startup. That's kind of
how it was before: process_shmem_requests() was called before all the
*ShmemSize() and *ShmemInit() functions in core.
> You could make sure that this isn't an issue by maintaining a flag
> in lwlock.c that's set when the shmem request is made (and reset on
> shmem exit), which must be false when RequestNamedLWLockTranche() is
> called, and if not then it should throw an error.
I'll change it so that the number of locks calculated in
LWLockShmemRequest() is stored in a global variable, and
LWLockShmemInit() has an Assert() to cross-check against that. That
catches the bug and seems like a good cross-check in general.
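A minimal standalone sketch of that cross-check, under stated assumptions: NUM_FIXED_LOCKS, named_tranche_locks and the function names are hypothetical stand-ins rather than the real lwlock.c symbols, and the check returns a value where the real code would use Assert().

```c
#include <assert.h>

/* Hypothetical stand-ins for the real lwlock.c values. */
#define NUM_FIXED_LOCKS 64
static int	named_tranche_locks = 0;	/* grows as tranches are requested */

/* Lock count computed by the request callback; -1 means "not requested". */
static int	requested_num_locks = -1;

static int
current_lock_count(void)
{
	return NUM_FIXED_LOCKS + named_tranche_locks;
}

/* Request stage: remember how many locks the requested space covers. */
static void
lock_shmem_request(void)
{
	requested_num_locks = current_lock_count();
}

/*
 * Init stage: returns 1 if the count still matches what was requested.
 * A mismatch means a tranche was requested after the space was sized.
 */
static int
lock_shmem_init_check(void)
{
	return requested_num_locks == current_lock_count();
}
```

If a named tranche is added between the two stages, the check fails, which is exactly the undersized-allocation case raised in the review.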
This isn't the only place where we need to pass information from the
request callback to the init callback. I've used a global variable for
that here, and also between ProcGlobalShmemRequest() and
ProcGlobalShmemInit(). An alternative might be to use the callback
arg pointers that are currently unused, but I'm not sure how to make
that ergonomic. The current 'arg' isn't very helpful for that, so
perhaps the signatures should look like this instead:
static void
LWLockShmemRequest(Datum *init_arg)
{
int numLocks;
numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
/* pass the calculated numLocks value to LWLockShmemInit() */
*init_arg = Int32GetDatum(numLocks);
...
}
static void
LWLockShmemInit(Datum init_arg)
{
int numLocks = DatumGetInt32(init_arg);
...
}
If you need to pass more than a single Datum, you can allocate a struct
and pass it via PointerGetDatum(). At that point global variables might
feel simpler again though.
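For illustration, passing several values through one pointer-sized argument might look like the sketch below. It is standalone C, not PostgreSQL code: Datum is modelled as uintptr_t, and LockRequestInfo, lock_request() and lock_init() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for PostgreSQL's Datum: an integer wide enough for a pointer. */
typedef uintptr_t Datum;

/* Hypothetical bundle of values computed at request time. */
typedef struct LockRequestInfo
{
	int			num_locks;
	size_t		array_size;
} LockRequestInfo;

/* Request stage: compute the values once and hand them on via init_arg. */
static void
lock_request(Datum *init_arg)
{
	LockRequestInfo *info = malloc(sizeof(LockRequestInfo));

	assert(info != NULL);
	info->num_locks = 100;
	info->array_size = (size_t) info->num_locks * 128;
	*init_arg = (Datum) info;
}

/* Init stage: unpack the struct, use it, and release it. */
static int
lock_init(Datum init_arg)
{
	LockRequestInfo *info = (LockRequestInfo *) init_arg;
	int			num_locks = info->num_locks;

	free(info);
	return num_locks;
}
```

The extra allocation and the ownership question (who frees the struct) are part of why plain global variables can end up feeling simpler.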
> 0010: Not looked at everything yet, but a few comments:
>
>> +++ b/src/include/access/slru.h
>
> With the changes in the signatures for most/all SLRU functions from a
> hidden-by-typedef pointer to a visible pointer type, maybe this could
> be an opportunity to swap them to `const SlruDesc *ctl` wherever
> possible? I don't think there are many backend-local changes that
> happen to SlruDescs once we've properly started the backend. I'm happy
> to provide an incremental patch if you'd like me to spend cycles on it
> if you're busy.
Yeah sounds like a good idea.
>> +++ b/src/backend/access/transam/clog.c
>
>> + SimpleLruRequest(.desc = &XactSlruDesc,
>> + .name = "transaction",
>> + .Dir = "pg_xact",
>> + .long_segment_names = false,
>> +
>> + .nslots = CLOGShmemBuffers(),
>> + .nlsns = CLOG_LSNS_PER_PAGE,
>> +
>> + .sync_handler = SYNC_HANDLER_CLOG,
>> + .PagePrecedes = CLOGPagePrecedes,
>> + .errdetail_for_io_error = clog_errdetail_for_io_error,
>
> That awfully inconsistent field name styling is ... awful, but not
> this patch's fault. If something can be done about it in a cheap
> fashion in this patch, that'd be great, but I won't hold it against
> you if that's skipped.
:-D.
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 15:39 Heikki Linnakangas <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 0 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-05 15:39 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 04/04/2026 19:32, Ashutosh Bapat wrote:
> test_shmem declares MODULE_big and OBJS which seems to be old
> fashioned, newer modules seem to be using MODULES.
I don't think it's a matter of old or new. MODULE_big is used when you
have multiple .o that are linked together into one .so file, while
MODULES is used if each .o file is linked into a separate .so file. If
there's only one .o file and .so file, then it doesn't really matter
which you use, and I think we have examples of both.
> Also it should use NO_INSTALLCHECK.
>
> /*
> * Alignment of the starting address. If not set, defaults to cacheline
> * boundary. Must be a power of two.
> */
> size_t alignment;
>
> We don't seem to enforce the "must be a power of two" rule anywhere.
> We should at least validate it.
Will add.
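A sketch of what such a validation could look like. It assumes that an alignment of 0 means "not set", which the quoted comment does not spell out, and the function name alignment_is_valid() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if 'alignment' is acceptable: either 0 (assumed here to mean
 * "not set, use the cacheline default") or a power of two.
 */
static bool
alignment_is_valid(size_t alignment)
{
	if (alignment == 0)
		return true;

	/* A power of two has exactly one bit set, so x & (x - 1) is zero. */
	return (alignment & (alignment - 1)) == 0;
}
```

In the real code the failure case would presumably raise an error at request time rather than return false.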
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 15:50 Heikki Linnakangas <[email protected]>
parent: Matthias van de Meent <[email protected]>
2 siblings, 0 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-05 15:50 UTC (permalink / raw)
To: Matthias van de Meent <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 05/04/2026 02:17, Matthias van de Meent wrote:
>> + pgss->extent = 0;
>> + pgss->n_writers = 0;
>> + pgss->gc_count = 0;
>> + pgss->stats.dealloc = 0;
>
> Shmem is said to be zero-initialized, should we remove the manual
> zero-initialization?
Yeah, perhaps. We already had initialization like this in many places,
while others relied on the implicit initialization. Some places even do
just this:
void
LogicalDecodingCtlShmemInit(void)
{
bool found;
LogicalDecodingCtl = ShmemInitStruct("Logical decoding control",
LogicalDecodingCtlShmemSize(),
&found);
if (!found)
MemSet(LogicalDecodingCtl, 0, LogicalDecodingCtlShmemSize());
}
I think there are two directions we could go here:
1. Document that the memory is zeroed, and you can rely on it. Remove
silly initializations like that in LogicalDecodingCtlShmemInit(). In
other places the explicit zero-initialization might have documentation
value though.
2. Require the init functions to explicitly zero the memory. Document it
and add valgrind checks.
I'm inclined to go with 1. But in the name of avoiding scope creep, not
as part of these patches.
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 16:13 Heikki Linnakangas <[email protected]>
parent: Daniel Gustafsson <[email protected]>
0 siblings, 0 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-05 16:13 UTC (permalink / raw)
To: Daniel Gustafsson <[email protected]>; +Cc: Robert Haas <[email protected]>; Ashutosh Bapat <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 26/03/2026 20:31, Daniel Gustafsson wrote:
>> On 22 Mar 2026, at 01:14, Heikki Linnakangas <[email protected]> wrote:
>> * The request_fn callback is called in postmaster startup, at the same stage as the old shmem_request callback was. But in EXEC_BACKEND mode, it's *also* called in each backend.
>
> Should the request_fn be told, via an argument, from where it is called? It
> can be figured out but it's cleaner if all implementations will do it in the
> same way. I don't have a direct case in mind where it would be needed, but I
> was recently digging into SSL passphrase reloading which has failure cases
> precisely becasue of this so am thinking out loud to avoid similar problems
> here.
Hmm, you mean adding an argument along the lines of:
static void
pgss_shmem_request(void *arg, bool attaching)
{
...
}
Perhaps. The idea is that a request callback should generally do the
exact same thing whether it's called from postmaster or from backend
startup, though. I worry that an argument like that makes it too
tempting to have different logic. That said, there are a couple of
places where I'm using IsUnderPostmaster for that purpose. For example,
I have this in lwlock.c (in latest version I'm currently working on that
I haven't posted yet):
/* Size of MainLWLockArray. Only valid in postmaster. */
static int num_main_array_locks;
/*
* Request shmem space for user-defined tranches and the main LWLock array.
*/
static void
LWLockShmemRequest(void *arg)
{
size_t size;
/* Space for user-defined tranches */
ShmemRequestStruct(.name = "LWLock tranches",
.size = sizeof(LWLockTrancheShmemData),
.ptr = (void **) &LWLockTranches,
);
/* Space for the LWLock array */
if (!IsUnderPostmaster)
{
num_main_array_locks = NUM_FIXED_LWLOCKS +
NumLWLocksForNamedTranches();
size = num_main_array_locks * sizeof(LWLockPadded);
}
else
size = SHMEM_ATTACH_UNKNOWN_SIZE;
ShmemRequestStruct(.name = "Main LWLock array",
.size = size,
.ptr = (void **) &MainLWLockArray,
);
}
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 16:23 Ashutosh Bapat <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-05 16:23 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sun, Apr 5, 2026 at 7:38 PM Ashutosh Bapat
<[email protected]> wrote:
>
> On Sun, Apr 5, 2026 at 11:18 AM Ashutosh Bapat
> <[email protected]> wrote:
> >
> >
> > I will post my resizable shmem structures patch in a separate email in
> > this thread but continue to review your patches.
>
> I reviewed the SLRU patch. This is the first time I am looking at SLRU
> code, so my review may not be sufficient. As far as I understand, the
> patch faithfully copies the functionality from the old system to the
> new system. I didn't find any issues there.
>
> I think calls to SimpleLruRequest() read much better than SimpleLruInit().
>
> Both MultiXactShmemInit and MultiXactShmemAttach set
> OldestMemberMXactId, OldestVisibleMXactId. In future if we add another
> global variable to point to the shared memory, somebody needs to
> remember to initialize it in both these functions. Maybe deduplicate
> it with something like attached? Similarly for PredicateLock related
> changes.
Sorry, I attached the wrong patch. Here's the right patch.
--
Best Wishes,
Ashutosh Bapat
Attachments:
[application/octet-stream] v11-0010-edits.patch.nocibot (2.7K, 2-v11-0010-edits.patch.nocibot)
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 18:35 Heikki Linnakangas <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 0 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-05 18:35 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 05/04/2026 19:23, Ashutosh Bapat wrote:
>> Both MultiXactShmemInit and MultiXactShmemAttach set
>> OldestMemberMXactId, OldestVisibleMXactId. In future if we add another
>> global variable to point to the shared memory, somebody needs to
>> remember to initialize it in both these functions. Maybe deduplicate
>> it with something like attached? Similarly for PredicateLock related
>> changes.
>
> Sorry, I attached the wrong patch. Here's the right patch.
Gotcha, yeah I've thought about that too. I even considered making
ShmemInitRequested() automatically call all the attach callbacks after
initialization, even in !EXEC_BACKEND builds. That way, you could put
the backend-private steps only in the attach function, and have it
called automatically in the postmaster too. I decided against it,
because only a few subsystems need the attach callback at all, and many of
them need to do the "local" steps earlier in the init callback anyway.
For example, XLOGShmemInit() uses the WALInsertLocks variable inside the
function already.
In the end I decided it's OK as it is. I'm not too worried about the
duplicated code in multixact.c and predicate.c, they fit in the same
screen in an editor so it's pretty easy to see that they are duplicated
for a reason. With more complicated logic it would be a different story.
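For comparison, the kind of deduplication suggested in the quoted text might be sketched as below. This is a standalone illustration, not the attached patch: SharedArea, set_local_pointers() and the init/attach function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SLOTS 8

/* Hypothetical shared area holding two arrays back to back. */
typedef struct SharedArea
{
	int			oldest_member[MAX_SLOTS];
	int			oldest_visible[MAX_SLOTS];
} SharedArea;

/* Backend-local pointers into the shared area. */
static SharedArea *area;
static int *OldestMember;
static int *OldestVisible;

/* The single place that derives the backend-local pointers. */
static void
set_local_pointers(SharedArea *shared)
{
	area = shared;
	OldestMember = shared->oldest_member;
	OldestVisible = shared->oldest_visible;
}

/* Init: runs once, when the shared area is created. */
static void
area_shmem_init(SharedArea *shared)
{
	memset(shared, 0, sizeof(SharedArea));
	set_local_pointers(shared);
}

/* Attach: runs in a backend that did not inherit the pointers. */
static void
area_shmem_attach(SharedArea *shared)
{
	set_local_pointers(shared);
}
```

With only two or three pointers the helper saves little, which matches the conclusion above that the duplication is acceptable at this size.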
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 18:58 Heikki Linnakangas <[email protected]>
parent: Matthias van de Meent <[email protected]>
1 sibling, 0 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-05 18:58 UTC (permalink / raw)
To: Matthias van de Meent <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 03/04/2026 01:10, Matthias van de Meent wrote:
> While I do think it's an improvement over the current APIs, the
> improvement seems to be mostly concentrated in the RequestStruct/Hash
> department, with only marginal improvements in RegisterShmemCallbacks.
> I feel like it's missing the important part: I'd like
> direct-from-_PG_init() ShmemRequestStruct/Hash calls. If
> ShmemRequestStruct/Hash had a size callback as alternative to the size
> field (which would then be called after preload_libraries finishes)
> then that would be sufficient for most shmem allocations, and it'd
> simplify shmem management for most subsystems.
> We'd still need the shmem lifecycle hooks/RegisterShmemCallbacks to
> allow conditionally allocated shmem areas (e.g. those used in aio),
> but I think that, in general, we shouldn't need a separate callback
> function just to get started registering shmem structures.
I kind of started from that thought too, but the design has since
evolved to what it is. Robert in particular was skeptical of that
approach, and I think I've come around on most of his feedback.
A per-struct size callback isn't very ergonomic in places like
LockManagerShmemRequest(), which derives the size of two different
things from the same calculated value. You'd need to repeat the
calculation for each one, or pass it through global variables or
something. PredicateLockShmemInit() is another extreme example. It
already has that problem to some extent, and already uses global
variables (I'm all ears if you have suggestions to improve it!) but I
feel that with a size callback this kind of stuff would get even harder.
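To make the ergonomics point concrete, here is a standalone sketch of one request callback that derives two sizes from a single computed value. Everything in it is hypothetical: request_struct() is a toy stand-in for ShmemRequestStruct/Hash, and the entry sizes and settings are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Toy registry standing in for the shmem request machinery. */
typedef struct Request
{
	const char *name;
	size_t		size;
} Request;

static Request requests[4];
static int	nrequests = 0;

static void
request_struct(const char *name, size_t size)
{
	assert(nrequests < 4);
	requests[nrequests].name = name;
	requests[nrequests].size = size;
	nrequests++;
}

/* Hypothetical settings the sizes depend on. */
static int	max_backends = 100;
static int	locks_per_xact = 64;

/*
 * One callback: the shared value is computed once and used for both
 * requests.  With a per-struct size callback, each callback would have to
 * repeat this calculation or read it from a global variable.
 */
static void
lock_manager_request(void)
{
	size_t		max_table_size = (size_t) max_backends * locks_per_xact;

	request_struct("lock hash", max_table_size * 32);
	request_struct("proclock hash", max_table_size * 2 * 48);
}
```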
Another reason is that it's good to have only one way of doing the
initialization, instead of one simple way and a different way for more
complex scenarios. If the simple way was vastly simpler, it might be
worth it, but I don't think there's that much difference here.
BTW, I also considered another way of initializing structs for simple
cases: instead of providing an init callback function, you could provide
the struct contents directly in the ShmemRequestStruct() call. To pick a
random example, slotsync.c currently looks like this:
static void
SlotSyncShmemRequest(void *arg)
{
ShmemRequestStruct(.name = "Slot Sync Data",
.size = sizeof(SlotSyncCtxStruct),
.ptr = (void **) &SlotSyncCtx,
);
}
static void
SlotSyncShmemInit(void *arg)
{
memset(SlotSyncCtx, 0, sizeof(SlotSyncCtxStruct));
SlotSyncCtx->pid = InvalidPid;
SpinLockInit(&SlotSyncCtx->mutex);
}
But it could look like this instead:
static void
SlotSyncShmemRequest(void *arg)
{
SlotSyncCtxStruct init_content;
memset(&init_content, 0, sizeof(SlotSyncCtxStruct));
init_content.pid = InvalidPid;
SpinLockInit(&init_content.mutex);
ShmemRequestStruct(.name = "Slot Sync Data",
.size = sizeof(SlotSyncCtxStruct),
.ptr = (void **) &SlotSyncCtx,
.init_content = &init_content,
);
}
That'd only work for small, simple structs, though. Since the initial
contents would be copied in this model, it won't work for anything with
pointers to itself, for example. And it's not much less code after all.
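The self-pointer limitation can be demonstrated with a small standalone example. The Ring struct and helper names are hypothetical; the point is only that a byte-wise copy keeps the pointer aimed at the template, not at the copy.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct with a pointer into its own storage. */
typedef struct Ring
{
	int			slots[4];
	int		   *head;			/* meant to point at this Ring's slots */
} Ring;

/* Models copying the initial contents into the shared area. */
static void
copy_template(Ring *dest, const Ring *tmpl)
{
	memcpy(dest, tmpl, sizeof(Ring));
}

/* Returns 1 if 'head' points at the first slot of the same Ring. */
static int
head_is_self(const Ring *r)
{
	return r->head == &r->slots[0];
}
```

After copy_template(), head_is_self() holds for the template but not for the destination, so such a struct would still need an init callback to fix up its pointers.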
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 19:05 Matthias van de Meent <[email protected]>
parent: Ashutosh Bapat <[email protected]>
1 sibling, 0 replies; 75+ messages in thread
From: Matthias van de Meent @ 2026-04-05 19:05 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sun, 5 Apr 2026 at 13:20, Ashutosh Bapat
<[email protected]> wrote:
>
> On Sun, Apr 5, 2026 at 2:36 PM Matthias van de Meent
> <[email protected]> wrote:
> >
> > On Sun, 5 Apr 2026, 07:59 Ashutosh Bapat, <[email protected]> wrote:
> > >
> > > On Sun, Apr 5, 2026 at 11:18 AM Ashutosh Bapat
> > > <[email protected]> wrote:
> > > >
> >
> > I'm not opposed to HAVE_RESIZABLE_SHMEM, but is it universal enough on
> > its platforms to make it part of the exposed ABI for Shmem? I think
> > that we should expose the same functions and structs, and just have
> > the shmem internals throw an error if the configuration used by the
> > user implies the user wants to update shmem sizing when the system
> > doesn't support it. That would avoid extensions having to recompile
> > between have/have not systems that have an otherwise compatible ABI;
> > especially when those extensions don't actually need the resizeable
> > part of the shmem system.
> >
>
> I don't think I understand this fully. An extension may want to
> support a structure in both modes - fixed as well as resizable
> depending upon whether the latter is supported. If the structure has
> maximum_size always the extension code needs to set it to 0 when the
> resizable shared structure is not supported and set to actual
> maximum_size when the resizable structure is supported. Without a
> macro or some flag they can not do that. The flag/macro then becomes
> part ABI for shmem. Am I correct?
That's not quite what I meant.
With your patch, the size and field offsets in `struct
ShmemStructOpts` change depending only on HAVE_RESIZABLE_SHMEM, as
does the availability of the resizing functions. This means that an extension that's
built without HAVE_RESIZABLE_SHMEM (an otherwise identical system)
can't correctly be loaded into a server that does have
HAVE_RESIZABLE_SHMEM defined - or at least it'll misbehave when it
tries to use the new shmem system without trying out resizeable areas.
If instead the fields used for defining resizable shmem areas (and
the relevant functions) are always defined, but with runtime checks to
make sure that in !HAVE_RESIZABLE_SHMEM builds nobody tries to use the
resizing functionality, then that'd reduce the unchecked hidden
incompatibility; assuming that no extension manually does memory
management syscall operations on those shmem areas.
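A standalone sketch of the concern, using hypothetical struct layouts rather than the real ShmemStructOpts: when a field exists only under a macro, the offsets of later fields differ between the two builds, whereas an always-present field plus a runtime check keeps one layout.

```c
#include <assert.h>
#include <stddef.h>

/* Layout as seen by a build without the conditional field (hypothetical). */
typedef struct OptsWithoutResize
{
	const char *name;
	size_t		size;
	void	  **ptr;
} OptsWithoutResize;

/* Layout as seen by a build with the field (hypothetical). */
typedef struct OptsWithResize
{
	const char *name;
	size_t		size;
	size_t		maximum_size;
	void	  **ptr;
} OptsWithResize;

/*
 * Alternative: always use the second layout and reject resizing at run
 * time when the platform lacks support.  Returns 0 if OK, -1 if rejected.
 */
static int
validate_opts(const OptsWithResize *opts, int have_resizable_shmem)
{
	if (!have_resizable_shmem && opts->maximum_size != 0)
		return -1;
	return 0;
}
```

An extension compiled against the first layout would write .ptr at the offset where a server using the second layout expects .maximum_size, with no compile-time or load-time warning.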
> Since extension binaries need to be
> built on different platforms anyway, that would automatically take
> care of building with or without HAVE_RESIZABLE_SHMEM. I feel it makes
> testing simpler since run time behaviour is fixed. Maybe I am missing
> something. Maybe a code diff or some example platform might make it
> more clear for me.
I'm not entirely sure it would be automatic. Is it guaranteed that
HAVE_RESIZABLE_SHMEM won't change over the lifetime of any
distribution's platform? Because it's definitely not apparent to me
that rebuilding the new server version against an upgraded platform
(now possibly with HAVE_RESIZABLE_SHMEM) should also mean rebuilding
the extensions that have been built against a previous minor version
(without HAVE_RESIZABLE_SHMEM).
> > > For now, it
> > > seems only for the sanity checks, but it could be seen as a useful
> > > safety feature. A difference in maximum_size and minimum_size would
> > > indicate that the structure is resizable.
> >
> > I think that's the right approach.
>
>
> I also think that introducing minimum_size is useful. Let's hear from
> Heikki before implementing it, in case he has a different opinion. I
> am not sure about min_allocated_space though - what use do you see for
> it. reserved_space is useful in pg_shmem_allocations() C function
> itself and gives impact to the fully grown structure. What would
> min_allocated_space give us? If at all it would be min_allocated_size
> not space since reserved space will never change. But even that I am
> not sure about.
I'd say it's mostly interesting for people looking at or debugging
shmem allocations. Which isn't a huge group of developers or DBAs, but
if we're exposing data like this, and are going to allow resizing,
then someone could see some benefits from this.
E.g., it may be useful to have the information to see how low the
currently running server can scale down its memory usage, so that the
admin can see whether a reboot is required if they want to allow it to
scale it down further (assuming there's a lower limit for allocations
- some shmem structs may have a lower scaling limit defined at
startup, while others may be able to scale linearly from 0 to 100)
Kind regards,
Matthias van de Meent
Databricks (https://www.databricks.com)
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 19:35 Heikki Linnakangas <[email protected]>
parent: Matthias van de Meent <[email protected]>
0 siblings, 0 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-05 19:35 UTC (permalink / raw)
To: Matthias van de Meent <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 04/04/2026 16:51, Matthias van de Meent wrote:
>> +++ b/src/include/storage/shmem.h
>> +/*
>> + * Shared memory is reserved and allocated in stages at postmaster startup,
>> + * and in EXEC_BACKEND mode, there's some extra work done to "attach" to them
>> + * at backend startup. ShmemCallbacks holds callback functions that are
>> + * called at different stages.
>> + */
>> +typedef struct ShmemCallbacks
>
> Maybe this should also have the opportunity for a (before_)shmem_exit callback?
Hmm, yeah, perhaps, but I'm going to skip that for now. We already have
a mechanism for shmem-exit callbacks, and I'm not sure how that would
plug into this. I think we can do that later if it turns out to be a
good idea, and I don't think it changes the parts that are included in the
patches now.
>> + * on-demand in a backend. If a subsystem sets this flag, the callbacks are
>> + * called immediately after registration, to initialize or attach to the
>> + * requested shared memory areas.
>
> Ideally we only immediately call the callbacks if we're under
> postmaster, or in a standalone backend; we shouldn't allocate shmem
> for some preloaded libraries that set this flag, at least not ahead of
> loading all preload libraries.
Right, the SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP flag doesn't do anything
if called during shared_preload_libraries processing. I'll re-word the
comment to clarify that.
> While it's mostly mechanical changes, it did make me notice the rather
> annoying allocation patterns by XLOGShmemRequest. It allocates various
> types of data in one go (which, in principle, is fine) but in doing so
> it adds its own alignment tricks etc, and I'm not super stoked about
> that. If time allows, could we clean that up?
I'm not going to cram it into these patches, but +1.
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 19:58 Matthias van de Meent <[email protected]>
parent: Heikki Linnakangas <[email protected]>
0 siblings, 0 replies; 75+ messages in thread
From: Matthias van de Meent @ 2026-04-05 19:58 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Sun, 5 Apr 2026 at 17:07, Heikki Linnakangas <[email protected]> wrote:
>
> On 05/04/2026 02:17, Matthias van de Meent wrote:
> > Is this malformatting caused by pgindent? If so, could you see if
> > there's a better way of defining ShmemRequestHash/Struct that doesn't
> > have this indent as output?
>
> Yeah, that's pgindent. Matter of taste, but I think that looks fine. An
> alternative is to put the closing bracket on the same line with the last
> argument and drop the trailing comma:
>
> ShmemRequestStruct(.name = "pg_stat_statements",
> .size = sizeof(pgssSharedState),
> .ptr = (void **) &pgss);
>
> That looks OK to me too.
Then let's keep it as per the v11 patch with ugly closing indents --
hopefully someone will fix pgindent in the future, but that's not the
job of this patch.
> > You could make sure that this isn't an issue by maintaining a flag
> > in lwlock.c that's set when the shmem request is made (and reset on
> > shmem exit), which must be false when RequestNamedLWLockTranche() is
> > called, and if not then it should throw an error.
> I'll change it so that the number of locks calculated in
> LWLockShmemRequest() is stored in a global variable, and
> LWLockShmemInit() has an Assert() to cross-check against that. That
> catches the bug and seems like a good cross-check in general.
Thanks for fixing this!
> > 0010: Not looked at everything yet, but a few comments:
> >
> >> +++ b/src/include/access/slru.h
> >
> > With the changes in the signatures for most/all SLRU functions from a
> > hidden-by-typedef pointer to a visible pointer type, maybe this could
> > be an opportunity to swap them to `const SlruDesc *ctl` wherever
> > possible? I don't think there are many backend-local changes that
> > happen to SlruDescs once we've properly started the backend. I'm happy
> > to provide an incremental patch if you'd like me to spend cycles on it
> > if you're busy.
>
> Yeah sounds like a good idea.
Attached 2 incremental patches:
0001 constifies the expected function arguments, which I propose to include; and
0002 adds 'type* const struct fields' in SlruShared.
0002 was not requested, but it looked feasible to at least try it out
in this subsystem. It might be interesting, but you're free to drop
it.
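As a rough standalone illustration of the 0002 idea (not the patch itself): const-qualified fields cannot be assigned one by one after the fact, so the struct is built as a local template and copied into place in one step. SlotTable and slot_table_create() are hypothetical names, and malloc'd memory stands in for the shared memory area.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct SlotTable
{
	const int	num_slots;		/* fixed after initialization */
	int		   *const lru_count;	/* pointer fixed, contents mutable */
	int			latest_page;	/* ordinary mutable field */
} SlotTable;

/* Allocates the header and its array in one chunk; NULL on failure. */
static SlotTable *
slot_table_create(int nslots)
{
	char	   *mem;
	int		   *counts;

	assert(nslots > 0);
	mem = calloc(1, sizeof(SlotTable) + (size_t) nslots * sizeof(int));
	if (mem == NULL)
		return NULL;
	counts = (int *) (mem + sizeof(SlotTable));

	{
		/* Const fields can only be set through an initializer. */
		SlotTable	tmpl = {
			.num_slots = nslots,
			.lru_count = counts,
			.latest_page = 0,
		};

		memcpy(mem, &tmpl, sizeof(SlotTable));
	}
	return (SlotTable *) mem;
}
```

After creation, a statement such as `t->num_slots = 5;` is rejected by the compiler, while `t->lru_count[i]` and `t->latest_page` remain writable.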
Kind regards,
Matthias van de Meent
Databricks (https://www.databricks.com)
Attachments:
[application/octet-stream] nocfbot.v11-0002-Const-qualify-SlruShared-s-fields-and-i.patch (6.7K, 2-nocfbot.v11-0002-Const-qualify-SlruShared-s-fields-and-i.patch)
download | inline diff:
From 06aa3c327d7ba21a70d416a1bf7a9bc955f629c6 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <[email protected]>
Date: Sun, 5 Apr 2026 21:48:23 +0200
Subject: [PATCH vnocfbot.v11 2/2] Const-qualify SlruShared's fields and
internal pointers
These fields are expected to never update, so by const-qualifying
them we protect against inadvertent modification and we allow the
compiler to apply a few more optimizations it previously may not
have been able to prove.
---
src/backend/access/transam/slru.c | 82 ++++++++++++++++++++-----------
src/include/access/slru.h | 24 ++++-----
2 files changed, 66 insertions(+), 40 deletions(-)
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index a1e688dd702..cb2bdb35cf4 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -278,6 +278,15 @@ shmem_slru_init(void *location, ShmemStructOpts *base_options)
int nlsns = options->nlsns;
char *ptr;
Size offset;
+ char **page_buffer;
+ SlruPageStatus *page_status;
+ bool *page_dirty;
+ int64 *page_number;
+ int *page_lru_count;
+ LWLockPadded *buffer_locks;
+ LWLockPadded *bank_locks;
+ int *bank_cur_lru_count;
+ XLogRecPtr *group_lsn;
shared = (SlruShared) location;
desc->shared = shared;
@@ -300,62 +309,79 @@ shmem_slru_init(void *location, ShmemStructOpts *base_options)
memset(shared, 0, sizeof(SlruSharedData));
- shared->num_slots = nslots;
- shared->lsn_groups_per_page = nlsns;
-
- pg_atomic_init_u64(&shared->latest_page_number, 0);
-
- shared->slru_stats_idx = pgstat_get_slru_index(desc->options.name);
-
ptr = (char *) shared;
offset = MAXALIGN(sizeof(SlruSharedData));
- shared->page_buffer = (char **) (ptr + offset);
+ page_buffer = (char **) (ptr + offset);
offset += MAXALIGN(nslots * sizeof(char *));
- shared->page_status = (SlruPageStatus *) (ptr + offset);
+ page_status = (SlruPageStatus *) (ptr + offset);
offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
- shared->page_dirty = (bool *) (ptr + offset);
+ page_dirty = (bool *) (ptr + offset);
offset += MAXALIGN(nslots * sizeof(bool));
- shared->page_number = (int64 *) (ptr + offset);
+ page_number = (int64 *) (ptr + offset);
offset += MAXALIGN(nslots * sizeof(int64));
- shared->page_lru_count = (int *) (ptr + offset);
+ page_lru_count = (int *) (ptr + offset);
offset += MAXALIGN(nslots * sizeof(int));
/* Initialize LWLocks */
- shared->buffer_locks = (LWLockPadded *) (ptr + offset);
+ buffer_locks = (LWLockPadded *) (ptr + offset);
offset += MAXALIGN(nslots * sizeof(LWLockPadded));
- shared->bank_locks = (LWLockPadded *) (ptr + offset);
+ bank_locks = (LWLockPadded *) (ptr + offset);
offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
- shared->bank_cur_lru_count = (int *) (ptr + offset);
+ bank_cur_lru_count = (int *) (ptr + offset);
offset += MAXALIGN(nbanks * sizeof(int));
if (nlsns > 0)
{
- shared->group_lsn = (XLogRecPtr *) (ptr + offset);
+ group_lsn = (XLogRecPtr *) (ptr + offset);
offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
}
+ else
+ group_lsn = NULL;
+
+ /* Initialize the slot banks. */
+ for (int bankno = 0; bankno < nbanks; bankno++)
+ {
+ LWLockInitialize(&bank_locks[bankno].lock, desc->options.bank_tranche_id);
+ bank_cur_lru_count[bankno] = 0;
+ }
ptr += BUFFERALIGN(offset);
for (int slotno = 0; slotno < nslots; slotno++)
{
- LWLockInitialize(&shared->buffer_locks[slotno].lock,
+ LWLockInitialize(&buffer_locks[slotno].lock,
desc->options.buffer_tranche_id);
- shared->page_buffer[slotno] = ptr;
- shared->page_status[slotno] = SLRU_PAGE_EMPTY;
- shared->page_dirty[slotno] = false;
- shared->page_lru_count[slotno] = 0;
+ page_buffer[slotno] = ptr;
+ page_status[slotno] = SLRU_PAGE_EMPTY;
+ page_dirty[slotno] = false;
+ page_lru_count[slotno] = 0;
ptr += BLCKSZ;
}
- /* Initialize the slot banks. */
- for (int bankno = 0; bankno < nbanks; bankno++)
- {
- LWLockInitialize(&shared->bank_locks[bankno].lock, desc->options.bank_tranche_id);
- shared->bank_cur_lru_count[bankno] = 0;
- }
-
/* Should fit to estimated shmem size */
Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+
+
+ {
+ SlruSharedData template = {
+ .num_slots = nslots,
+ .lsn_groups_per_page = nlsns,
+ .page_buffer = page_buffer,
+ .page_status = page_status,
+ .page_dirty = page_dirty,
+ .page_number = page_number,
+ .page_lru_count = page_lru_count,
+ .buffer_locks = buffer_locks,
+ .bank_locks = bank_locks,
+ .bank_cur_lru_count = bank_cur_lru_count,
+ .group_lsn = group_lsn,
+ .slru_stats_idx = pgstat_get_slru_index(desc->options.name),
+ };
+
+ pg_atomic_init_u64(&template.latest_page_number, 0);
+
+ memcpy(shared, &template, sizeof(SlruSharedData));
+ }
}
void
diff --git a/src/include/access/slru.h b/src/include/access/slru.h
index 74ea84deec5..b6854af3119 100644
--- a/src/include/access/slru.h
+++ b/src/include/access/slru.h
@@ -48,23 +48,23 @@ typedef enum
typedef struct SlruSharedData
{
/* Number of buffers managed by this SLRU structure */
- int num_slots;
+ const int num_slots;
/*
* Arrays holding info for each buffer slot. Page number is undefined
* when status is EMPTY, as is page_lru_count.
*/
- char **page_buffer;
- SlruPageStatus *page_status;
- bool *page_dirty;
- int64 *page_number;
- int *page_lru_count;
+ char *const *const page_buffer;
+ SlruPageStatus *const page_status;
+ bool *const page_dirty;
+ int64 *const page_number;
+ int *const page_lru_count;
/* The buffer_locks protects the I/O on each buffer slots */
- LWLockPadded *buffer_locks;
+ LWLockPadded *const buffer_locks;
/* Locks to protect the in memory buffer slot access in SLRU bank. */
- LWLockPadded *bank_locks;
+ LWLockPadded *const bank_locks;
/*----------
* A bank-wise LRU counter is maintained because we do a victim buffer
@@ -81,7 +81,7 @@ typedef struct SlruSharedData
* works as long as no page's age exceeds INT_MAX counts.
*----------
*/
- int *bank_cur_lru_count;
+ int *const bank_cur_lru_count;
/*
* Optional array of WAL flush LSNs associated with entries in the SLRU
@@ -91,8 +91,8 @@ typedef struct SlruSharedData
* highest LSN known for a contiguous group of SLRU entries on that slot's
* page.
*/
- XLogRecPtr *group_lsn;
- int lsn_groups_per_page;
+ XLogRecPtr *const group_lsn;
+ const int lsn_groups_per_page;
/*
* latest_page_number is the page number of the current end of the log;
@@ -102,7 +102,7 @@ typedef struct SlruSharedData
pg_atomic_uint64 latest_page_number;
/* SLRU's index for statistics purposes (might not be unique) */
- int slru_stats_idx;
+ const int slru_stats_idx;
} SlruSharedData;
typedef SlruSharedData *SlruShared;
--
2.50.1 (Apple Git-155)
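
For readers following along: the SlruSharedData hunk above makes most struct members const, so they cannot be assigned one at a time after the fact. The patch instead fills a local template with designated initializers and copies it into place with memcpy(). Below is a minimal standalone sketch of that pattern, not code from the patch. DemoShared, demo_shared_init() and demo_shared_create() are hypothetical names, and malloc() stands in for the shared memory allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a control struct with const members. */
typedef struct DemoShared
{
	const int	num_slots;		/* fixed after initialization */
	int		   *const counters; /* pointer is fixed; pointees stay writable */
	int			hits;			/* ordinary mutable field */
} DemoShared;

/*
 * Initialize the raw memory at dst as a DemoShared.  Const members cannot
 * be assigned individually, so build a local template and copy it over in
 * a single memcpy().
 */
static void
demo_shared_init(void *dst, int num_slots, int *counters)
{
	DemoShared	template = {
		.num_slots = num_slots,
		.counters = counters,
		.hits = 0,
	};

	for (int i = 0; i < num_slots; i++)
		counters[i] = 0;

	memcpy(dst, &template, sizeof(DemoShared));
}

/*
 * Allocate one chunk holding the struct followed by its counters array,
 * and initialize it.  Returns NULL if the allocation fails.
 */
static DemoShared *
demo_shared_create(int num_slots)
{
	char	   *raw;
	int		   *counters;

	assert(num_slots > 0);

	raw = malloc(sizeof(DemoShared) + (size_t) num_slots * sizeof(int));
	if (raw == NULL)
		return NULL;

	counters = (int *) (raw + sizeof(DemoShared));
	demo_shared_init(raw, num_slots, counters);

	return (DemoShared *) raw;
}
```

After demo_shared_create(4), s->counters[2] = 7 still compiles because only the pointer is const, whereas s->num_slots = 5 is rejected by the compiler. Releasing the object is a single free() of the returned pointer.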
[application/octet-stream] nocfbot.v11-0001-Add-const-qualification-in-SLRU-subsyst.patch (15.3K, 3-nocfbot.v11-0001-Add-const-qualification-in-SLRU-subsyst.patch)
From 53f491e2565f42fc1fcb3612fe821b1c3b796d4b Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <[email protected]>
Date: Sun, 5 Apr 2026 21:46:31 +0200
Subject: [PATCH vnocfbot.v11 1/2] Add const qualification in SLRU subsystem
With SlruCtl being replaced with SlruDesc*, we can const-qualify the
input arguments of various SLRU functions without significant effort.
In passing, some other arguments are also const-qualified.
---
src/backend/access/transam/slru.c | 78 ++++++++++++++------------
src/include/access/slru.h | 38 +++++++------
src/test/modules/test_slru/test_slru.c | 3 +-
3 files changed, 63 insertions(+), 56 deletions(-)
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index 47dd52d6749..a1e688dd702 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -91,7 +91,7 @@
* dir/123456 for [2^20, 2^24-1]
*/
static inline int
-SlruFileName(SlruDesc *ctl, char *path, int64 segno)
+SlruFileName(const SlruDesc *ctl, char *path, int64 segno)
{
if (ctl->options.long_segment_names)
{
@@ -178,19 +178,22 @@ static SlruErrorCause slru_errcause;
static int slru_errno;
-static void SimpleLruZeroLSNs(SlruDesc *ctl, int slotno);
-static void SimpleLruWaitIO(SlruDesc *ctl, int slotno);
-static void SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata);
-static bool SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno);
-static bool SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno,
+static void SimpleLruZeroLSNs(const SlruDesc *ctl, int slotno);
+static void SimpleLruWaitIO(const SlruDesc *ctl, int slotno);
+static void SlruInternalWritePage(const SlruDesc *ctl, int slotno,
SlruWriteAll fdata);
-static void SlruReportIOError(SlruDesc *ctl, int64 pageno,
+static bool SlruPhysicalReadPage(const SlruDesc *ctl, int64 pageno,
+ int slotno);
+static bool SlruPhysicalWritePage(const SlruDesc *ctl, int64 pageno,
+ int slotno, SlruWriteAll fdata);
+static void SlruReportIOError(const SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-static int SlruSelectLRUPage(SlruDesc *ctl, int64 pageno);
+static int SlruSelectLRUPage(const SlruDesc *ctl, int64 pageno);
-static bool SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename,
+static bool SlruScanDirCbDeleteCutoff(const SlruDesc *ctl,
+ const char *filename,
int64 segpage, void *data);
-static void SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno);
+static void SlruInternalDeleteSegment(const SlruDesc *ctl, int64 segno);
static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
@@ -394,7 +397,7 @@ check_slru_buffers(const char *name, int *newval)
* Bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruZeroPage(SlruDesc *ctl, int64 pageno)
+SimpleLruZeroPage(const SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
int slotno;
@@ -447,7 +450,7 @@ SimpleLruZeroPage(SlruDesc *ctl, int64 pageno)
* This assumes that InvalidXLogRecPtr is bitwise-all-0.
*/
static void
-SimpleLruZeroLSNs(SlruDesc *ctl, int slotno)
+SimpleLruZeroLSNs(const SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
@@ -463,7 +466,7 @@ SimpleLruZeroLSNs(SlruDesc *ctl, int slotno)
* SLRU bank lock is acquired and released here.
*/
void
-SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno)
+SimpleLruZeroAndWritePage(const SlruDesc *ctl, int64 pageno)
{
int slotno;
LWLock *lock;
@@ -489,7 +492,7 @@ SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SimpleLruWaitIO(SlruDesc *ctl, int slotno)
+SimpleLruWaitIO(const SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
int bankno = SlotGetBankNumber(slotno);
@@ -547,7 +550,7 @@ SimpleLruWaitIO(SlruDesc *ctl, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
+SimpleLruReadPage(const SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data)
{
SlruShared shared = ctl->shared;
@@ -651,7 +654,7 @@ SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
* It is unspecified whether the lock will be shared or exclusive.
*/
int
-SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno, const void *opaque_data)
+SimpleLruReadPage_ReadOnly(const SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
SlruShared shared = ctl->shared;
LWLock *banklock = SimpleLruGetBankLock(ctl, pageno);
@@ -698,7 +701,7 @@ SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno, const void *opaque_data)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata)
+SlruInternalWritePage(const SlruDesc *ctl, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 pageno = shared->page_number[slotno];
@@ -778,7 +781,7 @@ SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata)
* fdata is always passed a NULL here.
*/
void
-SimpleLruWritePage(SlruDesc *ctl, int slotno)
+SimpleLruWritePage(const SlruDesc *ctl, int slotno)
{
Assert(ctl->shared->page_status[slotno] != SLRU_PAGE_EMPTY);
@@ -792,7 +795,7 @@ SimpleLruWritePage(SlruDesc *ctl, int slotno)
* large enough to contain the given page.
*/
bool
-SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno)
+SimpleLruDoesPhysicalPageExist(const SlruDesc *ctl, int64 pageno)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -850,7 +853,7 @@ SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno)
* read/write operations. We could cache one virtual file pointer ...
*/
static bool
-SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno)
+SlruPhysicalReadPage(const SlruDesc *ctl, int64 pageno, int slotno)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -922,7 +925,7 @@ SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno)
* SimpleLruWriteAll.
*/
static bool
-SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno, SlruWriteAll fdata)
+SlruPhysicalWritePage(const SlruDesc *ctl, int64 pageno, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -1094,7 +1097,7 @@ SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno, SlruWriteAll fdat
* SlruPhysicalWritePage. Call this after cleaning up shared-memory state.
*/
static void
-SlruReportIOError(SlruDesc *ctl, int64 pageno, const void *opaque_data)
+SlruReportIOError(const SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -1216,7 +1219,7 @@ SlruRecentlyUsed(SlruShared shared, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
static int
-SlruSelectLRUPage(SlruDesc *ctl, int64 pageno)
+SlruSelectLRUPage(const SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
@@ -1369,7 +1372,7 @@ SlruSelectLRUPage(SlruDesc *ctl, int64 pageno)
* entries are on disk.
*/
void
-SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied)
+SimpleLruWriteAll(const SlruDesc *ctl, bool allow_redirtied)
{
SlruShared shared = ctl->shared;
SlruWriteAllData fdata;
@@ -1455,7 +1458,7 @@ SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied)
* after it has accrued freshly-written data.
*/
void
-SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage)
+SimpleLruTruncate(const SlruDesc *ctl, int64 cutoffPage)
{
SlruShared shared = ctl->shared;
int prevbank;
@@ -1550,7 +1553,7 @@ restart:
* they either can't yet contain anything, or have already been cleaned out.
*/
static void
-SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno)
+SlruInternalDeleteSegment(const SlruDesc *ctl, int64 segno)
{
char path[MAXPGPATH];
@@ -1573,7 +1576,7 @@ SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno)
* Delete an individual SLRU segment, identified by the segment number.
*/
void
-SlruDeleteSegment(SlruDesc *ctl, int64 segno)
+SlruDeleteSegment(const SlruDesc *ctl, int64 segno)
{
SlruShared shared = ctl->shared;
int prevbank = SlotGetBankNumber(0);
@@ -1650,7 +1653,7 @@ restart:
* first>=cutoff && last>=cutoff: no; every page of this segment is too young
*/
static bool
-SlruMayDeleteSegment(SlruDesc *ctl, int64 segpage, int64 cutoffPage)
+SlruMayDeleteSegment(const SlruDesc *ctl, int64 segpage, int64 cutoffPage)
{
int64 seg_last_page = segpage + SLRU_PAGES_PER_SEGMENT - 1;
@@ -1662,7 +1665,7 @@ SlruMayDeleteSegment(SlruDesc *ctl, int64 segpage, int64 cutoffPage)
#ifdef USE_ASSERT_CHECKING
static void
-SlruPagePrecedesTestOffset(SlruDesc *ctl, int per_page, uint32 offset)
+SlruPagePrecedesTestOffset(const SlruDesc *ctl, int per_page, uint32 offset)
{
TransactionId lhs,
rhs;
@@ -1747,7 +1750,7 @@ SlruPagePrecedesTestOffset(SlruDesc *ctl, int per_page, uint32 offset)
* do not apply to them.)
*/
void
-SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page)
+SlruPagePrecedesUnitTests(const SlruDesc *ctl, int per_page)
{
/* Test first, middle and last entries of a page. */
SlruPagePrecedesTestOffset(ctl, per_page, 0);
@@ -1762,8 +1765,8 @@ SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page)
* one containing the page passed as "data".
*/
bool
-SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename, int64 segpage,
- void *data)
+SlruScanDirCbReportPresence(const SlruDesc *ctl, const char *filename,
+ int64 segpage, void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1778,7 +1781,7 @@ SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename, int64 segpage,
* This callback deletes segments prior to the one passed in as "data".
*/
static bool
-SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename, int64 segpage,
+SlruScanDirCbDeleteCutoff(const SlruDesc *ctl, const char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1794,7 +1797,8 @@ SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename, int64 segpage,
* This callback deletes all segments.
*/
bool
-SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage, void *data)
+SlruScanDirCbDeleteAll(const SlruDesc *ctl, const char *filename,
+ int64 segpage, void *data)
{
SlruInternalDeleteSegment(ctl, segpage / SLRU_PAGES_PER_SEGMENT);
@@ -1808,7 +1812,7 @@ SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage, void *data)
* SLRU segment.
*/
static inline bool
-SlruCorrectSegmentFilenameLength(SlruDesc *ctl, size_t len)
+SlruCorrectSegmentFilenameLength(const SlruDesc *ctl, size_t len)
{
if (ctl->options.long_segment_names)
return (len == 15); /* see SlruFileName() */
@@ -1841,7 +1845,7 @@ SlruCorrectSegmentFilenameLength(SlruDesc *ctl, size_t len)
* Note that no locking is applied.
*/
bool
-SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data)
+SlruScanDirectory(const SlruDesc *ctl, SlruScanCallback callback, void *data)
{
bool retval = false;
DIR *cldir;
@@ -1881,7 +1885,7 @@ SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data)
* performs the fsync.
*/
int
-SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path)
+SlruSyncFileTag(const SlruDesc *ctl, const FileTag *ftag, char *path)
{
int fd;
int save_errno;
diff --git a/src/include/access/slru.h b/src/include/access/slru.h
index 36a7514d7a0..74ea84deec5 100644
--- a/src/include/access/slru.h
+++ b/src/include/access/slru.h
@@ -200,7 +200,7 @@ typedef struct SlruDesc
* respective bank.
*/
static inline LWLock *
-SimpleLruGetBankLock(SlruDesc *ctl, int64 pageno)
+SimpleLruGetBankLock(const SlruDesc *ctl, int64 pageno)
{
int bankno;
@@ -215,34 +215,36 @@ extern void SimpleLruRequestWithOpts(const SlruOpts *options);
SimpleLruRequestWithOpts(&(SlruOpts){__VA_ARGS__})
extern int SimpleLruAutotuneBuffers(int divisor, int max);
-extern int SimpleLruZeroPage(SlruDesc *ctl, int64 pageno);
-extern void SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno);
-extern int SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
+extern int SimpleLruZeroPage(const SlruDesc *ctl, int64 pageno);
+extern void SimpleLruZeroAndWritePage(const SlruDesc *ctl, int64 pageno);
+extern int SimpleLruReadPage(const SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data);
-extern int SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno,
+extern int SimpleLruReadPage_ReadOnly(const SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-extern void SimpleLruWritePage(SlruDesc *ctl, int slotno);
-extern void SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied);
+extern void SimpleLruWritePage(const SlruDesc *ctl, int slotno);
+extern void SimpleLruWriteAll(const SlruDesc *ctl, bool allow_redirtied);
#ifdef USE_ASSERT_CHECKING
-extern void SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page);
+extern void SlruPagePrecedesUnitTests(const SlruDesc *ctl, int per_page);
#else
#define SlruPagePrecedesUnitTests(ctl, per_page) do {} while (0)
#endif
-extern void SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage);
-extern bool SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno);
+extern void SimpleLruTruncate(const SlruDesc *ctl, int64 cutoffPage);
+extern bool SimpleLruDoesPhysicalPageExist(const SlruDesc *ctl, int64 pageno);
-typedef bool (*SlruScanCallback) (SlruDesc *ctl, char *filename, int64 segpage,
- void *data);
-extern bool SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data);
-extern void SlruDeleteSegment(SlruDesc *ctl, int64 segno);
+typedef bool (*SlruScanCallback) (const SlruDesc *ctl, const char *filename,
+ int64 segpage, void *data);
+extern bool SlruScanDirectory(const SlruDesc *ctl, SlruScanCallback callback,
+ void *data);
+extern void SlruDeleteSegment(const SlruDesc *ctl, int64 segno);
-extern int SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path);
+extern int SlruSyncFileTag(const SlruDesc *ctl, const FileTag *ftag,
+ char *path);
/* SlruScanDirectory public callbacks */
-extern bool SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename,
+extern bool SlruScanDirCbReportPresence(const SlruDesc *ctl, const char *filename,
int64 segpage, void *data);
-extern bool SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage,
- void *data);
+extern bool SlruScanDirCbDeleteAll(const SlruDesc *ctl, const char *filename,
+ int64 segpage, void *data);
extern bool check_slru_buffers(const char *name, int *newval);
extern void shmem_slru_init(void *location, ShmemStructOpts *options);
diff --git a/src/test/modules/test_slru/test_slru.c b/src/test/modules/test_slru/test_slru.c
index 40efffdbf62..5dfa082ed25 100644
--- a/src/test/modules/test_slru/test_slru.c
+++ b/src/test/modules/test_slru/test_slru.c
@@ -55,7 +55,8 @@ static const ShmemCallbacks test_slru_shmem_callbacks = {
#define TestSlruCtl (&TestSlruDesc)
static bool
-test_slru_scan_cb(SlruDesc *ctl, char *filename, int64 segpage, void *data)
+test_slru_scan_cb(const SlruDesc *ctl, const char *filename, int64 segpage,
+ void *data)
{
elog(NOTICE, "Calling test_slru_scan_cb()");
return SlruScanDirCbDeleteAll(ctl, filename, segpage, data);
--
2.50.1 (Apple Git-155)
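
A note on what the const qualification in the patch above does and does not promise: "const SlruDesc *ctl" only protects the descriptor itself. Functions such as SimpleLruZeroPage() still fetch ctl->shared and modify the shared state behind it, because const does not propagate through a non-const pointer member. The sketch below illustrates this with hypothetical types (DemoDesc, DemoSharedState), which are assumptions for illustration and not the real SLRU structs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shared state that the descriptor points to. */
typedef struct DemoSharedState
{
	int			page_count;
} DemoSharedState;

/* Hypothetical descriptor, loosely modelled on a desc-plus-shared layout. */
typedef struct DemoDesc
{
	const char *name;
	DemoSharedState *shared;	/* pointee is NOT const */
} DemoDesc;

/*
 * Takes a const descriptor, yet updates the shared state it points to.
 * Returns the new page count.
 */
static int
demo_add_page(const DemoDesc *desc)
{
	assert(desc->shared != NULL);

	/* desc->shared = NULL;  would not compile: *desc is const */
	desc->shared->page_count++; /* allowed: const is shallow */

	return desc->shared->page_count;
}
```

So the const-qualified signatures document that the descriptor is not modified, but say nothing about the shared memory reached through it.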
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 20:06 Heikki Linnakangas <[email protected]>
parent: Matthias van de Meent <[email protected]>
2 siblings, 1 reply; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-05 20:06 UTC (permalink / raw)
To: Matthias van de Meent <[email protected]>; Ashutosh Bapat <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
Here's patch version 12 [*]. I believe I've addressed all the feedback,
and I feel this is in pretty good shape now. There haven't been any big
design changes lately.
One notable change is that I replaced the separate
{request|init|attach}_fn_arg fields in ShmemCallbacks with a single
'opaque_arg' field, and added a brief comment to it. You both commented
on whether we need that at all, and maybe you're right that we don't,
but at least it's now just one field rather than three. As before,
callers can simply ignore it if they don't need it.
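
To illustrate the shape of that change, here is a rough standalone sketch of a callback table with one shared opaque argument. It is not the actual ShmemCallbacks definition: DemoCallbacks, demo_callback, demo_run_callbacks() and the callback signature are all assumptions made for illustration. Only the idea of a single 'opaque_arg' replacing three per-callback arguments comes from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback signature: each callback receives the opaque arg. */
typedef void (*demo_callback) (void *opaque_arg);

/* Hypothetical callback table with one argument shared by all callbacks. */
typedef struct DemoCallbacks
{
	demo_callback request_fn;
	demo_callback init_fn;
	demo_callback attach_fn;
	void	   *opaque_arg;		/* passed to every callback; may be NULL */
} DemoCallbacks;

/* Invoke whichever callbacks are set, passing the same opaque_arg to each. */
static void
demo_run_callbacks(const DemoCallbacks *cb)
{
	assert(cb != NULL);

	if (cb->request_fn != NULL)
		cb->request_fn(cb->opaque_arg);
	if (cb->init_fn != NULL)
		cb->init_fn(cb->opaque_arg);
	if (cb->attach_fn != NULL)
		cb->attach_fn(cb->opaque_arg);
}

/* Example callback: counts invocations if given a counter, else ignores it. */
static void
demo_count_call(void *opaque_arg)
{
	if (opaque_arg != NULL)
		(*(int *) opaque_arg)++;
}
```

A caller that needs no extra state leaves opaque_arg unset (NULL under designated initializers) and its callbacks simply never look at it.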
[*] also available at
https://github.com/hlinnaka/postgres/tree/shmem-init-refactor-12
- Heikki
Attachments:
[text/x-patch] v12-0001-Move-some-code-from-shmem.c-and-shmem.h.patch (15.9K, 2-v12-0001-Move-some-code-from-shmem.c-and-shmem.h.patch)
From 7ce3796647a871004fa9e29ec6fab8890701f749 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sun, 5 Apr 2026 19:40:30 +0300
Subject: [PATCH v12 01/13] Move some code from shmem.c and shmem.h
A little refactoring in preparation for the next commit, to make the
material changes in that commit more clear.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/backend/postmaster/launch_backend.c | 1 +
src/backend/storage/ipc/Makefile | 1 +
src/backend/storage/ipc/ipci.c | 1 +
src/backend/storage/ipc/meson.build | 1 +
src/backend/storage/ipc/shmem.c | 119 +-------------------
src/backend/storage/ipc/shmem_hash.c | 141 ++++++++++++++++++++++++
src/include/storage/shmem.h | 29 ++---
src/include/storage/shmem_internal.h | 41 +++++++
8 files changed, 196 insertions(+), 138 deletions(-)
create mode 100644 src/backend/storage/ipc/shmem_hash.c
create mode 100644 src/include/storage/shmem_internal.h
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 434e0643022..15b136ea29d 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -50,6 +50,7 @@
#include "storage/dsm.h"
#include "storage/io_worker.h"
#include "storage/pg_shmem.h"
+#include "storage/shmem_internal.h"
#include "tcop/backend_startup.h"
#include "utils/memutils.h"
diff --git a/src/backend/storage/ipc/Makefile b/src/backend/storage/ipc/Makefile
index 9a07f6e1d92..f71653bbe48 100644
--- a/src/backend/storage/ipc/Makefile
+++ b/src/backend/storage/ipc/Makefile
@@ -22,6 +22,7 @@ OBJS = \
shm_mq.o \
shm_toc.o \
shmem.o \
+ shmem_hash.o \
signalfuncs.o \
sinval.o \
sinvaladt.o \
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 7aab5da3386..ca4e4727489 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -50,6 +50,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/shmem_internal.h"
#include "storage/sinvaladt.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
diff --git a/src/backend/storage/ipc/meson.build b/src/backend/storage/ipc/meson.build
index 9c1ca954d9d..b8c31e29967 100644
--- a/src/backend/storage/ipc/meson.build
+++ b/src/backend/storage/ipc/meson.build
@@ -14,6 +14,7 @@ backend_sources += files(
'shm_mq.c',
'shm_toc.c',
'shmem.c',
+ 'shmem_hash.c',
'signalfuncs.c',
'sinval.c',
'sinvaladt.c',
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 3cb51ad62f8..e91c47d4d97 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -70,6 +70,7 @@
#include "storage/lwlock.h"
#include "storage/pg_shmem.h"
#include "storage/shmem.h"
+#include "storage/shmem_internal.h"
#include "storage/spin.h"
#include "utils/builtins.h"
#include "utils/tuplestore.h"
@@ -96,9 +97,6 @@ typedef struct ShmemAllocatorData
#define ShmemIndexLock (&ShmemAllocator->index_lock)
-static HTAB *shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
-static void *ShmemHashAlloc(Size size, void *alloc_arg);
static void *ShmemAllocRaw(Size size, Size *allocated_size);
/* shared memory global variables */
@@ -115,16 +113,6 @@ static bool firstNumaTouch = true;
Datum pg_numa_available(PG_FUNCTION_ARGS);
-/*
- * A very simple allocator used to carve out different parts of a hash table
- * from a previously allocated contiguous shared memory area.
- */
-typedef struct shmem_hash_allocator
-{
- char *next; /* start of free space in the area */
- char *end; /* end of the shmem area */
-} shmem_hash_allocator;
-
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
@@ -257,29 +245,6 @@ ShmemAllocNoError(Size size)
return ShmemAllocRaw(size, &allocated_size);
}
-/*
- * ShmemHashAlloc -- alloc callback for shared memory hash tables
- *
- * Carve out the allocation from a pre-allocated region. All shared memory
- * hash tables are initialized with HASH_FIXED_SIZE, so all the allocations
- * happen upfront during initialization and no locking is required.
- */
-static void *
-ShmemHashAlloc(Size size, void *alloc_arg)
-{
- shmem_hash_allocator *allocator = (shmem_hash_allocator *) alloc_arg;
- void *result;
-
- size = MAXALIGN(size);
-
- if (allocator->end - allocator->next < size)
- return NULL;
- result = allocator->next;
- allocator->next += size;
-
- return result;
-}
-
/*
* ShmemAllocRaw -- allocate align chunk and return allocated size
*
@@ -341,88 +306,6 @@ ShmemAddrIsValid(const void *addr)
return (addr >= ShmemBase) && (addr < ShmemEnd);
}
-/*
- * ShmemInitHash -- Create and initialize, or attach to, a
- * shared memory hash table.
- *
- * We assume caller is doing some kind of synchronization
- * so that two processes don't try to create/initialize the same
- * table at once. (In practice, all creations are done in the postmaster
- * process; child processes should always be attaching to existing tables.)
- *
- * nelems is the maximum number of hashtable entries.
- *
- * *infoP and hash_flags must specify at least the entry sizes and key
- * comparison semantics (see hash_create()). Flag bits and values specific
- * to shared-memory hash tables are added here, except that callers may
- * choose to specify HASH_PARTITION.
- *
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
- */
-HTAB *
-ShmemInitHash(const char *name, /* table string name for shmem index */
- int64 nelems, /* size of the table */
- HASHCTL *infoP, /* info about key and bucket size */
- int hash_flags) /* info about infoP */
-{
- bool found;
- size_t size;
- void *location;
-
- size = hash_estimate_size(nelems, infoP->entrysize);
-
- /* look it up in the shmem index or allocate */
- location = ShmemInitStruct(name, size, &found);
-
- return shmem_hash_create(location, size, found,
- name, nelems, infoP, hash_flags);
-}
-
-/*
- * Initialize or attach to a shared hash table in the given shmem region.
- *
- * This is extracted from ShmemInitHash() to allow InitShmemAllocator() to
- * share the logic for bootstrapping the ShmemIndex hash table.
- */
-static HTAB *
-shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
-{
- shmem_hash_allocator allocator;
-
- /*
- * Hash tables allocated in shared memory have a fixed directory and have
- * all elements allocated upfront. We don't support growing because we'd
- * need to grow the underlying shmem region with it.
- *
- * The shared memory allocator must be specified too.
- */
- infoP->alloc = ShmemHashAlloc;
- infoP->alloc_arg = NULL;
- hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_FIXED_SIZE;
-
- /*
- * if it already exists, attach to it rather than allocate and initialize
- * new space
- */
- if (!found)
- {
- allocator.next = (char *) location;
- allocator.end = (char *) location + size;
- infoP->alloc_arg = &allocator;
- }
- else
- {
- /* Pass location of hashtable header to hash_create */
- infoP->hctl = (HASHHDR *) location;
- hash_flags |= HASH_ATTACH;
- }
-
- return hash_create(name, nelems, infoP, hash_flags);
-}
-
/*
* ShmemInitStruct -- Create/attach to a structure in shared memory.
*
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
new file mode 100644
index 00000000000..721dec5c17f
--- /dev/null
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -0,0 +1,141 @@
+/*-------------------------------------------------------------------------
+ *
+ * shmem_hash.c
+ * hash table implementation in shared memory
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * A shared memory hash table implementation on top of the named, fixed-size
+ * shared memory areas managed by shmem.c. Hash tables have a fixed maximum
+ * size, but their actual size can vary dynamically. When entries are added
+ * to the table, more space is allocated. Each shared data structure and hash
+ * has a string name to identify it.
+ *
+ * IDENTIFICATION
+ * src/backend/storage/ipc/shmem_hash.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "storage/shmem.h"
+#include "storage/shmem_internal.h"
+
+/*
+ * A very simple allocator used to carve out different parts of a hash table
+ * from a previously allocated contiguous shared memory area.
+ */
+typedef struct shmem_hash_allocator
+{
+ char *next; /* start of free space in the area */
+ char *end; /* end of the shmem area */
+} shmem_hash_allocator;
+
+static void *ShmemHashAlloc(Size size, void *alloc_arg);
+
+/*
+ * ShmemInitHash -- Create and initialize, or attach to, a
+ * shared memory hash table.
+ *
+ * We assume caller is doing some kind of synchronization
+ * so that two processes don't try to create/initialize the same
+ * table at once. (In practice, all creations are done in the postmaster
+ * process; child processes should always be attaching to existing tables.)
+ *
+ * nelems is the maximum number of hashtable entries.
+ *
+ * *infoP and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values specific
+ * to shared-memory hash tables are added here, except that callers may
+ * choose to specify HASH_PARTITION.
+ *
+ * Note: before Postgres 9.0, this function returned NULL for some failure
+ * cases. Now, it always throws error instead, so callers need not check
+ * for NULL.
+ */
+HTAB *
+ShmemInitHash(const char *name, /* table string name for shmem index */
+ int64 nelems, /* size of the table */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ bool found;
+ size_t size;
+ void *location;
+
+ size = hash_estimate_size(nelems, infoP->entrysize);
+
+ /* look it up in the shmem index or allocate */
+ location = ShmemInitStruct(name, size, &found);
+
+ return shmem_hash_create(location, size, found,
+ name, nelems, infoP, hash_flags);
+}
+
+/*
+ * Initialize or attach to a shared hash table in the given shmem region.
+ *
+ * This is extracted from ShmemInitHash() to allow InitShmemAllocator() to
+ * share the logic for bootstrapping the ShmemIndex hash table.
+ */
+HTAB *
+shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags)
+{
+ shmem_hash_allocator allocator;
+
+ /*
+ * Hash tables allocated in shared memory have a fixed directory and have
+ * all elements allocated upfront. We don't support growing because we'd
+ * need to grow the underlying shmem region with it.
+ *
+ * The shared memory allocator must be specified too.
+ */
+ infoP->alloc = ShmemHashAlloc;
+ infoP->alloc_arg = NULL;
+ hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_FIXED_SIZE;
+
+ /*
+ * if it already exists, attach to it rather than allocate and initialize
+ * new space
+ */
+ if (!found)
+ {
+ allocator.next = (char *) location;
+ allocator.end = (char *) location + size;
+ infoP->alloc_arg = &allocator;
+ }
+ else
+ {
+ /* Pass location of hashtable header to hash_create */
+ infoP->hctl = (HASHHDR *) location;
+ hash_flags |= HASH_ATTACH;
+ }
+
+ return hash_create(name, nelems, infoP, hash_flags);
+}
+
+/*
+ * ShmemHashAlloc -- alloc callback for shared memory hash tables
+ *
+ * Carve out the allocation from a pre-allocated region. All shared memory
+ * hash tables are initialized with HASH_FIXED_SIZE, so all the allocations
+ * happen upfront during initialization and no locking is required.
+ */
+static void *
+ShmemHashAlloc(Size size, void *alloc_arg)
+{
+ shmem_hash_allocator *allocator = (shmem_hash_allocator *) alloc_arg;
+ void *result;
+
+ size = MAXALIGN(size);
+
+ if (allocator->end - allocator->next < size)
+ return NULL;
+ result = allocator->next;
+ allocator->next += size;
+
+ return result;
+}
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index a2eb499d63c..81d05381f8f 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -3,6 +3,11 @@
* shmem.h
* shared memory management structures
*
+ * This file contains public functions for other core subsystems and
+ * extensions to allocate shared memory. Internal functions for the shmem
+ * allocator itself and hooking it to the rest of the system are in
+ * shmem_internal.h
+ *
* Historical note:
* A long time ago, Postgres' shared memory region was allowed to be mapped
* at a different address in each process, and shared memory "pointers" were
@@ -25,36 +30,20 @@
/* shmem.c */
-typedef struct PGShmemHeader PGShmemHeader; /* avoid including
- * storage/pg_shmem.h here */
-extern void InitShmemAllocator(PGShmemHeader *seghdr);
extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
extern bool ShmemAddrIsValid(const void *addr);
-extern HTAB *ShmemInitHash(const char *name, int64 nelems,
- HASHCTL *infoP, int hash_flags);
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
+/* shmem_hash.c */
+extern HTAB *ShmemInitHash(const char *name, int64 nelems,
+ HASHCTL *infoP, int hash_flags);
+
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
-/* size constants for the shmem index table */
- /* max size of data structure string name */
-#define SHMEM_INDEX_KEYSIZE (48)
- /* max number of named shmem structures and hash tables */
-#define SHMEM_INDEX_SIZE (256)
-
-/* this is a hash bucket in the shmem index table */
-typedef struct
-{
- char key[SHMEM_INDEX_KEYSIZE]; /* string name */
- void *location; /* location in shared mem */
- Size size; /* # bytes requested for the structure */
- Size allocated_size; /* # bytes actually allocated */
-} ShmemIndexEnt;
-
#endif /* SHMEM_H */
diff --git a/src/include/storage/shmem_internal.h b/src/include/storage/shmem_internal.h
new file mode 100644
index 00000000000..e0638135639
--- /dev/null
+++ b/src/include/storage/shmem_internal.h
@@ -0,0 +1,41 @@
+/*-------------------------------------------------------------------------
+ *
+ * shmem_internal.h
+ * Internal functions related to shmem allocation
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/shmem_internal.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SHMEM_INTERNAL_H
+#define SHMEM_INTERNAL_H
+
+#include "storage/shmem.h"
+#include "utils/hsearch.h"
+
+typedef struct PGShmemHeader PGShmemHeader; /* avoid including
+ * storage/pg_shmem.h here */
+extern void InitShmemAllocator(PGShmemHeader *seghdr);
+
+extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
+
+/* size constants for the shmem index table */
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE (48)
+ /* max number of named shmem structures and hash tables */
+#define SHMEM_INDEX_SIZE (256)
+
+/* this is a hash bucket in the shmem index table */
+typedef struct
+{
+ char key[SHMEM_INDEX_KEYSIZE]; /* string name */
+ void *location; /* location in shared mem */
+ Size size; /* # bytes requested for the structure */
+ Size allocated_size; /* # bytes actually allocated */
+} ShmemIndexEnt;
+
+#endif /* SHMEM_INTERNAL_H */
--
2.47.3
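Not part of the patch: a minimal standalone sketch of the bump-allocation technique that ShmemHashAlloc() uses to carve allocations out of a pre-allocated region. The names sketch_allocator, sketch_allocator_init, sketch_alloc and SKETCH_MAXALIGN are hypothetical stand-ins, and the 8-byte alignment is an assumption rather than PostgreSQL's actual MAXALIGN definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MAXALIGN; assumes 8-byte maximum alignment. */
#define SKETCH_MAXALIGN(sz) (((size_t) (sz) + 7) & ~(size_t) 7)

/* Hypothetical equivalent of shmem_hash_allocator: a cursor over a region. */
typedef struct
{
	char	   *next;			/* start of the unused part of the region */
	char	   *end;			/* one past the last byte of the region */
} sketch_allocator;

/* Point the allocator at a caller-provided, suitably aligned region. */
static void
sketch_allocator_init(sketch_allocator *a, void *region, size_t size)
{
	a->next = (char *) region;
	a->end = (char *) region + size;
}

/*
 * Hand out the next aligned chunk, or NULL once the region is exhausted.
 * Nothing is ever freed; the cursor only moves forward.
 */
static void *
sketch_alloc(sketch_allocator *a, size_t size)
{
	void	   *result;

	assert(a != NULL && a->next <= a->end);
	size = SKETCH_MAXALIGN(size);

	/* Compare the remaining space as an unsigned quantity. */
	if ((size_t) (a->end - a->next) < size)
		return NULL;
	result = a->next;
	a->next += size;

	return result;
}
```

Because every request is rounded up before the cursor moves, a 1-byte request consumes a full aligned slot. No locking is needed as long as all allocations happen in one process during initialization, which is the condition the patch relies on.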
[text/x-patch] v12-0002-Introduce-a-new-mechanism-for-registering-shared.patch (62.0K, 3-v12-0002-Introduce-a-new-mechanism-for-registering-shared.patch)
From 1065420246d3d3885a4fa850916a8a35848d5c9b Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sun, 5 Apr 2026 23:03:20 +0300
Subject: [PATCH v12 02/13] Introduce a new mechanism for registering shared
memory areas
This replaces the [Subsystem]ShmemSize() and [Subsystem]ShmemInit()
functions called at postmaster startup with a new set of
callbacks. The new mechanism is designed to be more
ergonomic. Notably, the size of each shmem area is specified in the
same ShmemRequestStruct() call, together with its name. The same
mechanism is used in extensions, replacing the
shmem_{request/startup}_hooks.
ShmemInitStruct() and ShmemInitHash() become backwards-compatibility
wrappers around the new functions. In future commits, I will replace
all ShmemInitStruct() and ShmemInitHash() calls with the new
functions, although we'll still need to keep them around for
extensions.
Co-authored-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 4 +-
doc/src/sgml/xfunc.sgml | 162 +++--
src/backend/bootstrap/bootstrap.c | 2 +
src/backend/postmaster/launch_backend.c | 4 +
src/backend/postmaster/postmaster.c | 19 +-
src/backend/storage/ipc/ipci.c | 29 +-
src/backend/storage/ipc/shmem.c | 826 +++++++++++++++++++++---
src/backend/storage/ipc/shmem_hash.c | 85 ++-
src/backend/storage/lmgr/proc.c | 3 +
src/backend/tcop/postgres.c | 10 +-
src/backend/utils/hash/dynahash.c | 4 +-
src/include/storage/shmem.h | 160 ++++-
src/include/storage/shmem_internal.h | 41 +-
src/tools/pgindent/typedefs.list | 9 +-
14 files changed, 1155 insertions(+), 203 deletions(-)
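Not part of the patch: a minimal standalone sketch of the "request first, size later" flow described in the commit message above. Each area is registered by name and size, duplicates are rejected, and the total is then computed with per-area alignment padding, roughly as ShmemGetRequestedSize() does. All sketch_* names are hypothetical, the 128-byte cache line is an assumption, the space for the shmem index is left out, and errors are returned as -1 where the patch would raise them with elog().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed cache line size; the patch uses PostgreSQL's CACHELINEALIGN. */
#define SKETCH_CACHELINE 128
#define SKETCH_CACHELINEALIGN(sz) \
	(((size_t) (sz) + SKETCH_CACHELINE - 1) & ~((size_t) SKETCH_CACHELINE - 1))
#define SKETCH_MAX_REQUESTS 16

/* Hypothetical, simplified counterpart of a pending shmem request. */
typedef struct
{
	const char *name;			/* must outlive the request; not copied */
	size_t		size;			/* bytes requested */
} sketch_request;

static sketch_request sketch_pending[SKETCH_MAX_REQUESTS];
static int	sketch_npending = 0;

/* Remember a request; returns 0 on success, -1 if invalid or duplicate. */
static int
sketch_request_area(const char *name, size_t size)
{
	assert(name != NULL);

	if (size == 0 || sketch_npending >= SKETCH_MAX_REQUESTS)
		return -1;

	/* Reject a name that is already registered in this process. */
	for (int i = 0; i < sketch_npending; i++)
	{
		if (strcmp(sketch_pending[i].name, name) == 0)
			return -1;
	}

	sketch_pending[sketch_npending].name = name;
	sketch_pending[sketch_npending].size = size;
	sketch_npending++;
	return 0;
}

/* Total bytes needed, padding after each area to a cache line boundary. */
static size_t
sketch_requested_size(void)
{
	size_t		total = 0;

	for (int i = 0; i < sketch_npending; i++)
	{
		total += sketch_pending[i].size;
		total = SKETCH_CACHELINEALIGN(total);
	}
	return total;
}
```

Collecting the requests before allocating anything means the total segment size is known before the segment is created, so a separate size-estimation function per subsystem is no longer needed.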
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9ee1a2bfc6a..2ebec6928d5 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
Anonymous allocations are allocations that have been made
with <literal>ShmemAlloc()</literal> directly, rather than via
- <literal>ShmemInitStruct()</literal> or
- <literal>ShmemInitHash()</literal>.
+ <literal>ShmemRequestStruct()</literal> or
+ <literal>ShmemRequestHash()</literal>.
</para>
<para>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 70e815b8a2c..aed3f2f0071 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3628,71 +3628,132 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
Add-ins can reserve shared memory on server startup. To do so, the
add-in's shared library must be preloaded by specifying it in
<xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
- The shared library should also register a
- <literal>shmem_request_hook</literal> in its
- <function>_PG_init</function> function. This
- <literal>shmem_request_hook</literal> can reserve shared memory by
- calling:
+ The shared library should register callbacks in
+ its <function>_PG_init</function> function, which then get called at the
+ right stages of the system startup to initialize the shared memory.
+ Here is an example:
<programlisting>
-void RequestAddinShmemSpace(Size size)
-</programlisting>
- Each backend should obtain a pointer to the reserved shared memory by
- calling:
-<programlisting>
-void *ShmemInitStruct(const char *name, Size size, bool *foundPtr)
-</programlisting>
- If this function sets <literal>foundPtr</literal> to
- <literal>false</literal>, the caller should proceed to initialize the
- contents of the reserved shared memory. If <literal>foundPtr</literal>
- is set to <literal>true</literal>, the shared memory was already
- initialized by another backend, and the caller need not initialize
- further.
- </para>
+typedef struct MyShmemData {
+ LWLock lock; /* protects the fields below */
- <para>
- To avoid race conditions, each backend should use the LWLock
- <function>AddinShmemInitLock</function> when initializing its allocation
- of shared memory, as shown here:
-<programlisting>
-static mystruct *ptr = NULL;
-bool found;
+ ... shared memory contents ...
+} MyShmemData;
+
+static MyShmemData *MyShmem; /* pointer to the struct in shared memory */
+
+static void my_shmem_request(void *arg);
+static void my_shmem_init(void *arg);
+
+const ShmemCallbacks my_shmem_callbacks = {
+ .request_fn = my_shmem_request,
+ .init_fn = my_shmem_init,
+};
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+ /*
+ * In order to create our shared memory area, we have to be loaded via
+ * shared_preload_libraries.
+ */
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ /* Register our shared memory needs */
+ RegisterShmemCallbacks(&my_shmem_callbacks);
+}
+
+/* callback to request */
+static void
+my_shmem_request(void *arg)
+{
+ /* A persistent handle to the shared memory area in this backend */
+ static ShmemStructDesc MyShmemDesc;
+
+ ShmemRequestStruct(&MyShmemDesc,
+ .name = "My shmem area",
+ .size = sizeof(MyShmemData),
+ .ptr = (void **) &MyShmem,
+ );
+}
-LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-ptr = ShmemInitStruct("my struct name", size, &found);
-if (!found)
+/* callback to initialize the contents of the MyShmem area at startup */
+static void
+my_shmem_init(void *arg)
{
- ... initialize contents of shared memory ...
- ptr->locks = GetNamedLWLockTranche("my tranche name");
+ int tranche_id;
+
+ /* Initialize the lock */
+ tranche_id = LWLockNewTrancheId("my tranche name");
+ LWLockInitialize(&MyShmem->lock, tranche_id);
+
+ ... initialize the rest of MyShmem fields ...
}
-LWLockRelease(AddinShmemInitLock);
+
</programlisting>
- <literal>shmem_startup_hook</literal> provides a convenient place for the
- initialization code, but it is not strictly required that all such code
- be placed in this hook. On Windows (and anywhere else where
- <literal>EXEC_BACKEND</literal> is defined), each backend executes the
- registered <literal>shmem_startup_hook</literal> shortly after it
- attaches to shared memory, so add-ins should still acquire
- <function>AddinShmemInitLock</function> within this hook, as shown in the
- example above. On other platforms, only the postmaster process executes
- the <literal>shmem_startup_hook</literal>, and each backend automatically
- inherits the pointers to shared memory.
+ The <function>request_fn</function> callback is called during system
+ startup, before the shared memory has been allocated. It should call
+ <function>ShmemRequestStruct()</function> to register the add-in's
+ shared memory needs. Note that <function>ShmemRequestStruct()</function>
+ doesn't immediately allocate or initialize the memory; it merely
+ registers the space to be allocated later in the startup sequence. When
+ the memory is allocated, it is initialized to zero. For any more
+ complex initialization, set the <function>init_fn()</function> callback,
+ which will be called after the memory has been allocated and initialized
+ to zero, but before any other processes are running, and thus no locking
+ is required.
</para>
-
<para>
- An example of a <literal>shmem_request_hook</literal> and
- <literal>shmem_startup_hook</literal> can be found in
+ On Windows, the <function>attach_fn</function> callback, if any, is
+ additionally called at every backend startup. It can be used to
+ initialize additional per-backend state related to the shared memory
+ area that is inherited via <function>fork()</function> on other systems.
+ </para>
+ <para>
+ An example of allocating shared memory can be found in
<filename>contrib/pg_stat_statements/pg_stat_statements.c</filename> in
the <productname>PostgreSQL</productname> source tree.
</para>
</sect3>
<sect3 id="xfunc-shared-addin-after-startup">
- <title>Requesting Shared Memory After Startup</title>
+ <title>Requesting Shared Memory After Startup with <function>ShmemRequestStruct</function></title>
+
+ <para>
+ <function>ShmemRequestStruct()</function> can also be called after
+ system startup, which is useful to allow small allocations in add-in
+ libraries that are not specified in
+ <xref linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
+ However, after startup the allocation can fail if there is not enough
+ shared memory available. The system reserves some memory for allocations
+ after startup, but that reservation is small.
+ </para>
+ <para>
+ By default, <function>RegisterShmemCallbacks()</function> fails with an
+ error if called after system startup. To use it after startup, you must
+ set the <literal>SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP</literal> flag in
+ the argument <structname>ShmemCallbacks</structname> struct to
+ acknowledge the risk.
+ </para>
+ <para>
+ When <function>RegisterShmemCallbacks()</function> is called after
+ startup, it will immediately call the appropriate callbacks, depending
+ on whether the requested memory areas were already initialized by
+ another backend. The callbacks will be called while holding an internal
+ lock, which prevents two backends from initializing the memory area
+ concurrently.
+ </para>
+ </sect3>
+
+ <sect3 id="xfunc-shared-addin-dynamic">
+ <title>Allocating Dynamic Shared Memory After Startup</title>
<para>
There is another, more flexible method of reserving shared memory that
- can be done after server startup and outside a
- <literal>shmem_request_hook</literal>. To do so, each backend that will
+ can be done after server startup. To do so, each backend that will
use the shared memory should obtain a pointer to it by calling:
<programlisting>
void *GetNamedDSMSegment(const char *name, size_t size,
@@ -3711,10 +3772,7 @@ void *GetNamedDSMSegment(const char *name, size_t size,
</para>
<para>
- Unlike shared memory reserved at server startup, there is no need to
- acquire <function>AddinShmemInitLock</function> or otherwise take action
- to avoid race conditions when reserving shared memory with
- <function>GetNamedDSMSegment</function>. This function ensures that only
+ <function>GetNamedDSMSegment</function> ensures that only
one backend allocates and initializes the segment and that all other
backends receive a pointer to the fully allocated and initialized
segment.
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index ebd41176b94..3766d8231ac 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -39,6 +39,7 @@
#include "storage/fd.h"
#include "storage/ipc.h"
#include "storage/proc.h"
+#include "storage/shmem_internal.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/guc.h"
@@ -373,6 +374,7 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
InitializeFastPathLocks();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 15b136ea29d..0973010b7dc 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -49,6 +49,7 @@
#include "replication/walreceiver.h"
#include "storage/dsm.h"
#include "storage/io_worker.h"
+#include "storage/ipc.h"
#include "storage/pg_shmem.h"
#include "storage/shmem_internal.h"
#include "tcop/backend_startup.h"
@@ -673,7 +674,10 @@ SubPostmasterMain(int argc, char *argv[])
/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ ShmemCallRequestCallbacks();
+ }
/*
* Run the appropriate Main function
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index eb4f3eb72d4..7a8ee19bdaf 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -115,6 +115,7 @@
#include "storage/ipc.h"
#include "storage/pmsignal.h"
#include "storage/proc.h"
+#include "storage/shmem_internal.h"
#include "tcop/backend_startup.h"
#include "tcop/tcopprot.h"
#include "utils/datetime.h"
@@ -951,7 +952,14 @@ PostmasterMain(int argc, char *argv[])
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Ask all subsystems, including preloaded libraries, to register their
+ * shared memory needs.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might've been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
@@ -3232,7 +3240,14 @@ PostmasterStateMachine(void)
/* re-read control file into local memory */
LocalProcessControlFile(true);
- /* re-create shared memory and semaphores */
+ /*
+ * Re-initialize shared memory and semaphores. Note: We don't call
+ * RegisterBuiltinShmemCallbacks(), we keep the old registrations. In
+ * order to re-register structs in extensions, we'd need to reload
+ * shared preload libraries, and we don't want to do that.
+ */
+ ResetShmemAllocator();
+ ShmemCallRequestCallbacks();
CreateSharedMemoryAndSemaphores();
UpdatePMState(PM_STARTUP);
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index ca4e4727489..24422a80ab3 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -101,8 +101,9 @@ CalculateShmemSize(void)
* during the actual allocation phase.
*/
size = 100000;
- size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
- sizeof(ShmemIndexEnt)));
+ size = add_size(size, ShmemGetRequestedSize());
+
+ /* legacy subsystems */
size = add_size(size, dsm_estimate_size());
size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
@@ -177,6 +178,13 @@ AttachSharedMemoryStructs(void)
*/
InitializeFastPathLocks();
+ /*
+ * Attach to LWLocks first. They are needed by most other subsystems.
+ */
+ LWLockShmemInit();
+
+ /* Establish pointers to all shared memory areas in this backend */
+ ShmemAttachRequested();
CreateOrAttachShmemStructs();
/*
@@ -221,7 +229,17 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
- /* Initialize subsystems */
+ /*
+ * Initialize LWLocks first, in case any of the shmem init functions use
+ * LWLocks. (Nothing else can be running during startup, so they don't
+ * need to do any locking yet, but we nevertheless allow it.)
+ */
+ LWLockShmemInit();
+
+ /* Initialize all shmem areas */
+ ShmemInitRequested();
+
+ /* Initialize legacy subsystems */
CreateOrAttachShmemStructs();
/* Initialize dynamic shared memory facilities. */
@@ -252,11 +270,6 @@ CreateSharedMemoryAndSemaphores(void)
static void
CreateOrAttachShmemStructs(void)
{
- /*
- * Set up LWLocks. They are needed by most other subsystems.
- */
- LWLockShmemInit();
-
dsm_shmem_init();
DSMRegistryShmemInit();
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index e91c47d4d97..5c7caf59360 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,43 +19,110 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * NOTES:
- * (a) There are three kinds of shared memory data structures
- * available to POSTGRES: fixed-size structures, queues and hash
- * tables. Fixed-size structures contain things like global variables
- * for a module and should never be allocated after the shared memory
- * initialization phase. Hash tables have a fixed maximum size and
- * cannot grow beyond that. Queues link data structures
- * that have been allocated either within fixed-size structures or as hash
- * buckets. Each shared data structure has a string name to identify
- * it (assigned in the module that declares it).
- *
- * (b) During initialization, each module looks for its
- * shared data structures in a hash table called the "Shmem Index".
- * If the data structure is not present, the caller can allocate
- * a new one and initialize it. If the data structure is present,
- * the caller "attaches" to the structure by initializing a pointer
- * in the local address space.
- * The shmem index has two purposes: first, it gives us
- * a simple model of how the world looks when a backend process
- * initializes. If something is present in the shmem index,
- * it is initialized. If it is not, it is uninitialized. Second,
- * the shmem index allows us to allocate shared memory on demand
- * instead of trying to preallocate structures and hard-wire the
- * sizes and locations in header files. If you are using a lot
- * of shared memory in a lot of different places (and changing
- * things during development), this is important.
- *
- * (c) In standard Unix-ish environments, individual backends do not
- * need to re-establish their local pointers into shared memory, because
- * they inherit correct values of those variables via fork() from the
- * postmaster. However, this does not work in the EXEC_BACKEND case.
- * In ports using EXEC_BACKEND, new backends have to set up their local
- * pointers using the method described in (b) above.
- *
- * (d) memory allocation model: shared memory can never be
- * freed, once allocated. Each hash table has its own free list,
- * so hash buckets can be reused when an item is deleted.
+ * This module provides facilities to allocate fixed-size structures in shared
+ * memory, for things like variables shared between all backend processes.
+ * Each such structure has a string name to identify it, specified when it is
+ * requested. shmem_hash.c provides a shared hash table implementation on top
+ * of that.
+ *
+ * Shared memory areas should usually not be allocated after postmaster
+ * startup, although we do allow small allocations later for the benefit of
+ * extension modules that are loaded after startup. Despite that allowance,
+ * extensions that need shared memory should be added in
+ * shared_preload_libraries, because the allowance is quite small and there is
+ * no guarantee that any memory is available after startup.
+ *
+ * Nowadays, there is also another way to allocate shared memory called
+ * Dynamic Shared Memory. See dsm.c for that facility. One big difference
+ * between traditional shared memory handled by shmem.c and dynamic shared
+ * memory is that traditional shared memory areas are mapped to the same
+ * address in all processes, so you can use normal pointers in shared memory
+ * structs. With Dynamic Shared Memory, you must use offsets or DSA pointers
+ * instead.
+ *
+ * Shared memory managed by shmem.c can never be freed, once allocated. Each
+ * hash table has its own free list, so hash buckets can be reused when an
+ * item is deleted.
+ *
+ * Usage
+ * -----
+ *
+ * To allocate shared memory, you need to register a set of callback functions
+ * which handle the lifecycle of the allocation. In the request_fn callback,
+ * fill in a ShmemRequestStructOpts struct with the name, size, and any other
+ * options, and call ShmemRequestStruct(). Leave any unused fields as zeros.
+ *
+ * typedef struct MyShmemData {
+ * ...
+ * } MyShmemData;
+ *
+ * static MyShmemData *MyShmem;
+ *
+ * static void my_shmem_request(void *arg);
+ * static void my_shmem_init(void *arg);
+ *
+ * const ShmemCallbacks MyShmemCallbacks = {
+ * .request_fn = my_shmem_request,
+ * .init_fn = my_shmem_init,
+ * };
+ *
+ * static void
+ * my_shmem_request(void *arg)
+ * {
+ * static ShmemStructDesc MyShmemDesc;
+ *
+ * ShmemRequestStruct(&MyShmemDesc, &(ShmemRequestStructOpts) {
+ * .name = "My shmem area",
+ * .size = sizeof(MyShmemData),
+ * .ptr = (void **) &MyShmem,
+ * });
+ * }
+ *
+ * Register the callbacks by calling RegisterShmemCallbacks(&MyShmemCallbacks)
+ * in the extension's _PG_init() function.
+ *
+ * Lifecycle
+ * ---------
+ *
+ * Initializing shared memory happens in multiple phases. In the first phase,
+ * during postmaster startup, all the request_fn callbacks are called. Only
+ * after all the request_fn callbacks have been called, and all the shmem areas
+ * have been requested by the ShmemRequestStruct() calls, do we know how much
+ * shared memory we need in total. After that, postmaster allocates the global
+ * shared memory segment, and calls all the init_fn callbacks to initialize
+ * all the requested shmem areas.
+ *
+ * In standard Unix-ish environments, individual backends do not need to
+ * re-establish their local pointers into shared memory, because they inherit
+ * correct values of those variables via fork() from the postmaster. However,
+ * this does not work in the EXEC_BACKEND case. In ports using EXEC_BACKEND,
+ * backend startup also calls the shmem_request callbacks to re-establish the
+ * knowledge about each shared memory area, sets the pointer variables
+ * (*ShmemStructDesc->ptr), and calls the attach_fn callback, if any, for
+ * additional per-backend setup.
+ *
+ * Legacy ShmemInitStruct()/ShmemInitHash() functions
+ * --------------------------------------------------
+ *
+ * ShmemInitStruct()/ShmemInitHash() is another way of registering shmem
+ * areas. It pre-dates the ShmemRequestStruct()/ShmemRequestHash() functions,
+ * and should not be used in new code, but as of this writing it is still
+ * widely used in extensions.
+ *
+ * To allocate a shmem area with ShmemInitStruct(), you need to separately
+ * register the size needed for the area by calling RequestAddinShmemSpace()
+ * from the extension's shmem_request_hook, and allocate the area by calling
+ * ShmemInitStruct() from the extension's shmem_startup_hook. There are no
+ * init/attach callbacks. Instead, the caller of ShmemInitStruct() must check
+ * the return status of ShmemInitStruct() and initialize the struct if it was
+ * not previously initialized.
+ *
+ * Calling ShmemAlloc() directly
+ * -----------------------------
+ *
+ * There's a more low-level way of allocating shared memory too: you can call
+ * ShmemAlloc() directly. It's used to implement the higher level mechanisms,
+ * and should generally not be called directly.
*/
#include "postgres.h"
@@ -75,6 +142,75 @@
#include "utils/builtins.h"
#include "utils/tuplestore.h"
+/*
+ * Registered callbacks.
+ *
+ * During postmaster startup, we accumulate the callbacks from all subsystems
+ * in this list.
+ *
+ * This is in process private memory, although on Unix-like systems, we expect
+ * all the registrations to happen at postmaster startup time and be inherited
+ * by all the child processes via fork().
+ */
+static List *registered_shmem_callbacks;
+
+/*
+ * In the shmem request phase, all the shmem areas requested with the
+ * ShmemRequest*() functions are accumulated here.
+ */
+typedef struct
+{
+ ShmemStructOpts *options;
+ ShmemRequestKind kind;
+} ShmemRequest;
+
+static List *pending_shmem_requests;
+
+/*
+ * Per-process state machine, for sanity checking that we do things in the
+ * right order.
+ *
+ * Postmaster:
+ * INITIAL -> REQUESTING -> INITIALIZING -> DONE
+ *
+ * Backends in EXEC_BACKEND mode:
+ * INITIAL -> REQUESTING -> ATTACHING -> DONE
+ *
+ * Late request:
+ * DONE -> REQUESTING -> AFTER_STARTUP_ATTACH_OR_INIT -> DONE
+ */
+enum shmem_request_state
+{
+ /* Initial state */
+ SRS_INITIAL,
+
+ /*
+ * When we start calling the shmem_request callbacks, we enter the
+ * SRS_REQUESTING phase. All ShmemRequestStruct calls happen in this
+ * state.
+ */
+ SRS_REQUESTING,
+
+ /*
+ * Postmaster has finished all shmem requests, and is now initializing the
+ * shared memory segment. init_fn callbacks are called in this state.
+ */
+ SRS_INITIALIZING,
+
+ /*
+ * A postmaster child process is starting up. attach_fn callbacks are
+ * called in this state.
+ */
+ SRS_ATTACHING,
+
+ /* An after-startup allocation or attachment is in progress. */
+ SRS_AFTER_STARTUP_ATTACH_OR_INIT,
+
+ /* Normal state after shmem initialization / attachment */
+ SRS_DONE,
+};
+static enum shmem_request_state shmem_request_state = SRS_INITIAL;
+
/*
* This is the first data structure stored in the shared memory segment, at
* the offset that PGShmemHeader->content_offset points to. Allocations by
@@ -106,25 +242,381 @@ static void *ShmemBase; /* start address of shared memory */
static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
-static HTAB *ShmemIndex = NULL; /* primary index hashtable for shmem */
+
+/*
+ * ShmemIndex is a global directory of shmem areas, itself also stored in the
+ * shared memory.
+ */
+static HTAB *ShmemIndex;
+
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE (48)
+
+/*
+ * # of additional entries to reserve in the shmem index table, for
+ * allocations after postmaster startup. (This is not a hard limit, the hash
+ * table can grow larger than that if there is shared memory available)
+ */
+#define SHMEM_INDEX_ADDITIONAL_SIZE (128)
+
+/* this is a hash bucket in the shmem index table */
+typedef struct
+{
+ char key[SHMEM_INDEX_KEYSIZE]; /* string name */
+ void *location; /* location in shared mem */
+ Size size; /* # bytes requested for the structure */
+ Size allocated_size; /* # bytes actually allocated */
+} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
+static void CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks);
+static void InitShmemIndexEntry(ShmemRequest *request);
+static bool AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok);
+
Datum pg_numa_available(PG_FUNCTION_ARGS);
+/*
+ * ShmemRequestStruct() --- request a named shared memory area
+ *
+ * Subsystems call this to register their shared memory needs. This is
+ * usually done early in postmaster startup, before the shared memory segment
+ * has been created, so that the size can be included in the estimate for
+ * total amount of shared memory needed. We set aside a small amount of
+ * memory for allocations that happen later, for the benefit of non-preloaded
+ * extensions, but that should not be relied upon.
+ *
+ * This does not yet allocate the memory, but merely registers the need for
+ * it. The actual allocation happens later in the postmaster startup
+ * sequence.
+ *
+ * This must be called from a shmem_request callback function, registered with
+ * RegisterShmemCallbacks(). This enforces a coding pattern that works the
+ * same in normal Unix systems and with EXEC_BACKEND. On Unix systems, the
+ * shmem_request callbacks are called once, early in postmaster startup, and
+ * the child processes inherit the struct descriptors and any other
+ * per-process state from the postmaster. In EXEC_BACKEND mode, shmem_request
+ * callbacks are *also* called in each backend, at backend startup, to
+ * re-establish the struct descriptors. By calling the same function in both
+ * cases, we ensure that all the shmem areas are registered the same way in
+ * all processes.
+ *
+ * 'desc' is a backend-private handle for the shared memory area.
+ *
+ * 'options' defines the name and size of the area, and any other optional
+ * features. Leave unused options as zeros. The options are copied to
+ * longer-lived memory, so it doesn't need to live after the
+ * ShmemRequestStruct() call and can point to a local variable in the calling
+ * function. The 'name' must point to a long-lived string though, only the
+ * pointer to it is copied.
+ */
+void
+ShmemRequestStructWithOpts(const ShmemStructOpts *options)
+{
+ ShmemStructOpts *options_copy;
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemStructOpts));
+ memcpy(options_copy, options, sizeof(ShmemStructOpts));
+
+ ShmemRequestInternal(options_copy, SHMEM_KIND_STRUCT);
+}
+
+/*
+ * Internal workhorse of ShmemRequestStruct() and ShmemRequestHash().
+ *
+ * Note: 'desc' and 'options' must live until the init/attach callbacks have
+ * been called. Unlike in the public ShmemRequestStruct() and
+ * ShmemRequestHash() functions, 'options' is *not* copied. This allows
+ * ShmemRequestHash() to pass a pointer to the extended ShmemRequestHashOpts
+ * struct instead.
+ */
+void
+ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
+{
+ ShmemRequest *request;
+
+ if (options->name == NULL)
+ elog(ERROR, "shared memory request is missing 'name' option");
+
+ if (IsUnderPostmaster)
+ {
+ if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+ else
+ {
+ if (options->size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->size <= 0)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ }
+
+ if (shmem_request_state != SRS_REQUESTING)
+ elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
+
+ /* Check that it's not already registered in this process */
+ foreach_ptr(ShmemRequest, existing, pending_shmem_requests)
+ {
+ if (strcmp(existing->options->name, options->name) == 0)
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" is already registered",
+ options->name)));
+ }
+
+ /* Request looks valid, remember it */
+ request = palloc(sizeof(ShmemRequest));
+ request->options = options;
+ request->kind = kind;
+ pending_shmem_requests = lappend(pending_shmem_requests, request);
+}
+
+/*
+ * ShmemGetRequestedSize() --- estimate the total size of all registered shared
+ * memory structures.
+ *
+ * This is called at postmaster startup, before the shared memory segment has
+ * been created.
+ */
+size_t
+ShmemGetRequestedSize(void)
+{
+ size_t size;
+
+ /* memory needed for the ShmemIndex */
+ size = hash_estimate_size(list_length(pending_shmem_requests) + SHMEM_INDEX_ADDITIONAL_SIZE,
+ sizeof(ShmemIndexEnt));
+ size = CACHELINEALIGN(size);
+
+ /* memory needed for all the requested areas */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ size = add_size(size, request->options->size);
+ /* calculate alignment padding like ShmemAllocRaw() does */
+ size = CACHELINEALIGN(size);
+ }
+
+ return size;
+}
+
+/*
+ * ShmemInitRequested() --- allocate and initialize requested shared memory
+ * structures.
+ *
+ * This is called once at postmaster startup, after the shared memory segment
+ * has been created.
+ */
+void
+ShmemInitRequested(void)
+{
+ /* Should be called only by the postmaster or a standalone backend. */
+ Assert(!IsUnderPostmaster);
+ Assert(shmem_request_state == SRS_INITIALIZING);
+
+ /*
+ * Initialize the ShmemIndex entries and perform basic initialization of
+ * all the requested memory areas. There are no concurrent processes yet,
+ * so no need for locking.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ InitShmemIndexEntry(request);
+ }
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
+
+ /*
+ * Call the subsystem-specific init callbacks to finish initialization of
+ * all the areas.
+ */
+ foreach_ptr(const ShmemCallbacks, callbacks, registered_shmem_callbacks)
+ {
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->opaque_arg);
+ }
+
+ shmem_request_state = SRS_DONE;
+}
+
+/*
+ * Re-establish process private state related to shmem areas.
+ *
+ * This is called at backend startup in EXEC_BACKEND mode, in every backend.
+ */
+#ifdef EXEC_BACKEND
+void
+ShmemAttachRequested(void)
+{
+ ListCell *lc;
+
+ /* Must be initializing a (non-standalone) backend */
+ Assert(IsUnderPostmaster);
+ Assert(ShmemAllocator->index != NULL);
+ Assert(shmem_request_state == SRS_REQUESTING);
+ shmem_request_state = SRS_ATTACHING;
+
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+
+ /*
+ * Attach to all the requested memory areas.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ AttachShmemIndexEntry(request, false);
+ }
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
+
+ /* Call attach callbacks */
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
+
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->opaque_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+
+ shmem_request_state = SRS_DONE;
+}
+#endif
+
+/*
+ * Insert requested shmem area into the shared memory index and initialize it.
+ *
+ * Note that this only performs basic initialization depending on
+ * ShmemRequestKind, like setting the global pointer variable to the area for
+ * SHMEM_KIND_STRUCT or setting up the backend-private HTAB control struct.
+ * This does *not* call the subsystem-specific init callbacks. That's done
+ * later after all the shmem areas have been initialized or attached to.
+ */
+static void
+InitShmemIndexEntry(ShmemRequest *request)
+{
+ const char *name = request->options->name;
+ ShmemIndexEnt *index_entry;
+ bool found;
+ size_t allocated_size;
+ void *structPtr;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, name, HASH_ENTER_NULL, &found);
+ if (found)
+ elog(ERROR, "shared memory struct \"%s\" is already initialized", name);
+ if (!index_entry)
+ {
+ /* tried to add it to the hash table, but there was no space */
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ name)));
+ }
+
+ /*
+ * We inserted the entry to the shared memory index. Allocate requested
+ * amount of shared memory for it, and initialize the index entry.
+ */
+ structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zd bytes requested)",
+ name, request->options->size)));
+ }
+ index_entry->size = request->options->size;
+ index_entry->allocated_size = allocated_size;
+ index_entry->location = structPtr;
+
+ /* Initialize depending on the kind of shmem area it is */
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_init(structPtr, request->options);
+ break;
+ }
+}
+
+/*
+ * Look up a named shmem area in the shared memory index and attach to it.
+ *
+ * Note that this only performs the basic attachment actions depending on
+ * ShmemRequestKind, like setting the global pointer variable to the area for
+ * SHMEM_KIND_STRUCT or setting up the backend-private HTAB control struct.
+ * This does *not* call the subsystem-specific attach callbacks. That's done
+ * later after all the shmem areas have been initialized or attached to.
+ */
+static bool
+AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
+{
+ const char *name = request->options->name;
+ ShmemIndexEnt *index_entry;
+
+ /* look it up in the shmem index */
+ index_entry = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, name, HASH_FIND, NULL);
+ if (!index_entry)
+ {
+ if (!missing_ok)
+ ereport(ERROR,
+ (errmsg("could not find ShmemIndex entry for data structure \"%s\"",
+ request->options->name)));
+ return false;
+ }
+
+ /* Check that the size in the index matches the request. */
+ if (index_entry->size != request->options->size &&
+ request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" was created with"
+ " different size: existing %zu, requested %zd",
+ name, index_entry->size, request->options->size)));
+ }
+
+ /*
+ * Re-establish the caller's pointer variable, or do other actions to
+ * attach depending on the kind of shmem area it is.
+ */
+ switch (request->kind)
+ {
+ case SHMEM_KIND_STRUCT:
+ if (request->options->ptr)
+ *(request->options->ptr) = index_entry->location;
+ break;
+ case SHMEM_KIND_HASH:
+ shmem_hash_attach(index_entry->location, request->options);
+ break;
+ }
+
+ return true;
+}
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
* Called at postmaster or stand-alone backend startup, to initialize the
* allocator's data structure in the shared memory segment. In EXEC_BACKEND,
- * this is also called at backend startup, to set up pointers to the shared
- * memory areas.
+ * this is also called at backend startup, to set up pointers to the
+ * already-initialized data structure.
*/
void
InitShmemAllocator(PGShmemHeader *seghdr)
{
Size offset;
+ int64 hash_nelems;
HASHCTL info;
int hash_flags;
@@ -133,6 +625,16 @@ InitShmemAllocator(PGShmemHeader *seghdr)
#endif
Assert(seghdr != NULL);
+ if (IsUnderPostmaster)
+ {
+ Assert(shmem_request_state == SRS_INITIAL);
+ }
+ else
+ {
+ Assert(shmem_request_state == SRS_REQUESTING);
+ shmem_request_state = SRS_INITIALIZING;
+ }
+
/*
* We assume the pointer and offset are MAXALIGN. Not a hard requirement,
* but it's true today and keeps the math below simpler.
@@ -177,19 +679,21 @@ InitShmemAllocator(PGShmemHeader *seghdr)
* use ShmemInitHash() here because it relies on ShmemIndex being already
* initialized.
*/
+ hash_nelems = list_length(pending_shmem_requests) + SHMEM_INDEX_ADDITIONAL_SIZE;
+
info.keysize = SHMEM_INDEX_KEYSIZE;
info.entrysize = sizeof(ShmemIndexEnt);
hash_flags = HASH_ELEM | HASH_STRINGS | HASH_FIXED_SIZE;
if (!IsUnderPostmaster)
{
- ShmemAllocator->index_size = hash_estimate_size(SHMEM_INDEX_SIZE, info.entrysize);
+ ShmemAllocator->index_size = hash_estimate_size(hash_nelems, info.entrysize);
ShmemAllocator->index = (HASHHDR *) ShmemAlloc(ShmemAllocator->index_size);
}
ShmemIndex = shmem_hash_create(ShmemAllocator->index,
ShmemAllocator->index_size,
IsUnderPostmaster,
- "ShmemIndex", SHMEM_INDEX_SIZE,
+ "ShmemIndex", hash_nelems,
&info, hash_flags);
Assert(ShmemIndex != NULL);
@@ -210,6 +714,23 @@ InitShmemAllocator(PGShmemHeader *seghdr)
}
}
+/*
+ * Reset state on postmaster crash restart.
+ */
+void
+ResetShmemAllocator(void)
+{
+ Assert(!IsUnderPostmaster);
+ shmem_request_state = SRS_INITIAL;
+
+ pending_shmem_requests = NIL;
+
+ /*
+ * Note that we don't clear the registered callbacks. We will need to
+ * call them again as we restart.
+ */
+}
+
/*
* ShmemAlloc -- allocate max-aligned chunk from shared memory
*
@@ -307,92 +828,191 @@ ShmemAddrIsValid(const void *addr)
}
/*
- * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ * Register callbacks that define a shared memory area (or multiple areas).
*
- * This is called during initialization to find or allocate
- * a data structure in shared memory. If no other process
- * has created the structure, this routine allocates space
- * for it. If it exists already, a pointer to the existing
- * structure is returned.
+ * The system will call the callbacks at different stages of postmaster or
+ * backend startup, to allocate and initialize the area.
*
- * Returns: pointer to the object. *foundPtr is set true if the object was
- * already in the shmem index (hence, already initialized).
+ * This is normally called early during postmaster startup, but if the
+ * SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP flag is set, this can also be used after
+ * startup, although after startup there's no guarantee that there's enough
+ * shared memory available. When called after startup, this immediately calls
+ * the right callbacks depending on whether another backend had already
+ * initialized the area.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: In EXEC_BACKEND mode, this needs to be called in every backend
+ * process. That's needed because we cannot pass down the callback function
+ * pointers from the postmaster process, because different processes may have
+ * loaded libraries to different addresses.
*/
-void *
-ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+void
+RegisterShmemCallbacks(const ShmemCallbacks *callbacks)
{
- ShmemIndexEnt *result;
- void *structPtr;
+ if (shmem_request_state == SRS_DONE && IsUnderPostmaster)
+ {
+ /*
+ * After-startup initialization or attachment. Call the appropriate
+ * callbacks immediately.
+ */
+ if ((callbacks->flags & SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP) == 0)
+ elog(ERROR, "cannot request shared memory at this time");
- Assert(ShmemIndex != NULL);
+ CallShmemCallbacksAfterStartup(callbacks);
+ }
+ else
+ {
+ /* Remember the callbacks for later */
+ registered_shmem_callbacks = lappend(registered_shmem_callbacks,
+ (void *) callbacks);
+ }
+}
+/*
+ * Register a shmem area (or multiple areas) after startup.
+ */
+static void
+CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks)
+{
+ bool found_any;
+ bool notfound_any;
+
+ Assert(shmem_request_state == SRS_DONE);
+ shmem_request_state = SRS_REQUESTING;
+
+ /*
+ * Call the request callback first. The callback makes ShmemRequest*()
+ * calls for each shmem area, adding them to pending_shmem_requests.
+ */
+ Assert(pending_shmem_requests == NIL);
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->opaque_arg);
+ shmem_request_state = SRS_AFTER_STARTUP_ATTACH_OR_INIT;
+
+ if (pending_shmem_requests == NIL)
+ {
+ shmem_request_state = SRS_DONE;
+ return;
+ }
+
+ /* Hold ShmemIndexLock while we allocate all the shmem entries */
LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
- /* look it up in the shmem index */
- result = (ShmemIndexEnt *)
- hash_search(ShmemIndex, name, HASH_ENTER_NULL, foundPtr);
+ /*
+ * Check if the requested shared memory areas have already been
+ * initialized. We assume all the areas requested by the request callback
+ * to form a coherent unit such that they're all already initialized or
+ * none. Otherwise it would be ambiguous which callback, init or attach,
+ * to call afterwards.
+ */
+ found_any = notfound_any = false;
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
+ {
+ if (hash_search(ShmemIndex, request->options->name, HASH_FIND, NULL))
+ found_any = true;
+ else
+ notfound_any = true;
+ }
+ if (found_any && notfound_any)
+ elog(ERROR, "found some but not all of the requested shared memory areas");
- if (!result)
+ /*
+ * Allocate or attach all the shmem areas requested by the request_fn
+ * callback.
+ */
+ foreach_ptr(ShmemRequest, request, pending_shmem_requests)
{
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("could not create ShmemIndex entry for data structure \"%s\"",
- name)));
+ if (found_any)
+ AttachShmemIndexEntry(request, false);
+ else
+ InitShmemIndexEntry(request);
}
+ list_free_deep(pending_shmem_requests);
+ pending_shmem_requests = NIL;
- if (*foundPtr)
+ /* Finish by calling the appropriate subsystem-specific callback */
+ if (found_any)
{
- /*
- * Structure is in the shmem index so someone else has allocated it
- * already. The size better be the same as the size we are trying to
- * initialize to, or there is a name conflict (or worse).
- */
- if (result->size != size)
- {
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errmsg("ShmemIndex entry size is wrong for data structure"
- " \"%s\": expected %zu, actual %zu",
- name, size, result->size)));
- }
- structPtr = result->location;
+ if (callbacks->attach_fn)
+ callbacks->attach_fn(callbacks->opaque_arg);
}
else
{
- Size allocated_size;
-
- /* It isn't in the table yet. allocate and initialize it */
- structPtr = ShmemAllocRaw(size, &allocated_size);
- if (structPtr == NULL)
- {
- /* out of memory; remove the failed ShmemIndex entry */
- hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
- LWLockRelease(ShmemIndexLock);
- ereport(ERROR,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
- " \"%s\" (%zu bytes requested)",
- name, size)));
- }
- result->size = size;
- result->allocated_size = allocated_size;
- result->location = structPtr;
+ if (callbacks->init_fn)
+ callbacks->init_fn(callbacks->opaque_arg);
}
LWLockRelease(ShmemIndexLock);
+ shmem_request_state = SRS_DONE;
+}
+
+/*
+ * Call all shmem request callbacks.
+ */
+void
+ShmemCallRequestCallbacks(void)
+{
+ ListCell *lc;
- Assert(ShmemAddrIsValid(structPtr));
+ Assert(shmem_request_state == SRS_INITIAL);
+ shmem_request_state = SRS_REQUESTING;
- Assert(structPtr == (void *) CACHELINEALIGN(structPtr));
+ foreach(lc, registered_shmem_callbacks)
+ {
+ const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);
- return structPtr;
+ if (callbacks->request_fn)
+ callbacks->request_fn(callbacks->opaque_arg);
+ }
}
+/*
+ * ShmemInitStruct -- Create/attach to a structure in shared memory.
+ *
+ * This is called during initialization to find or allocate
+ * a data structure in shared memory. If no other process
+ * has created the structure, this routine allocates space
+ * for it. If it exists already, a pointer to the existing
+ * structure is returned.
+ *
+ * Returns: pointer to the object. *foundPtr is set true if the object was
+ * already in the shmem index (hence, already initialized).
+ *
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestStruct() in new code!
+ */
+void *
+ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+{
+ void *ptr = NULL;
+ ShmemStructOpts options = {
+ .name = name,
+ .size = size,
+ .ptr = &ptr,
+ };
+ ShmemRequest request = {&options, SHMEM_KIND_STRUCT};
+
+ Assert(shmem_request_state == SRS_DONE ||
+ shmem_request_state == SRS_INITIALIZING ||
+ shmem_request_state == SRS_REQUESTING);
+
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+
+ /*
+ * In a backend process, look up the existing entry, if any.
+ */
+ *foundPtr = false;
+ if (IsUnderPostmaster)
+ *foundPtr = AttachShmemIndexEntry(&request, true);
+
+ /* Initialize it if not found */
+ if (!*foundPtr)
+ InitShmemIndexEntry(&request);
+
+ LWLockRelease(ShmemIndexLock);
+
+ Assert(ptr != NULL);
+ return ptr;
+}
/*
* Add two Size values, checking for overflow
diff --git a/src/backend/storage/ipc/shmem_hash.c b/src/backend/storage/ipc/shmem_hash.c
index 721dec5c17f..c28d673cbd2 100644
--- a/src/backend/storage/ipc/shmem_hash.c
+++ b/src/backend/storage/ipc/shmem_hash.c
@@ -7,10 +7,8 @@
* Portions Copyright (c) 1994, Regents of the University of California
*
* A shared memory hash table implementation on top of the named, fixed-size
- * shared memory areas managed by shmem.c. Hash tables have a fixed maximum
- * size, but their actual size can vary dynamically. When entries are added
- * to the table, more space is allocated. Each shared data structure and hash
- * has a string name to identify it.
+ * shared memory areas managed by shmem.c. Each hash table has its own free
+ * list, so hash buckets can be reused when an item is deleted.
*
* IDENTIFICATION
* src/backend/storage/ipc/shmem_hash.c
@@ -22,6 +20,7 @@
#include "storage/shmem.h"
#include "storage/shmem_internal.h"
+#include "utils/memutils.h"
/*
* A very simple allocator used to carve out different parts of a hash table
@@ -35,6 +34,66 @@ typedef struct shmem_hash_allocator
static void *ShmemHashAlloc(Size size, void *alloc_arg);
+/*
+ * ShmemRequestHash -- Request a shared memory hash table.
+ *
+ * Similar to ShmemRequestStruct(), but requests a hash table instead of an
+ * opaque area.
+ */
+void
+ShmemRequestHashWithOpts(const ShmemHashOpts *options)
+{
+ ShmemHashOpts *options_copy;
+
+ Assert(options->name != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(ShmemHashOpts));
+ memcpy(options_copy, options, sizeof(ShmemHashOpts));
+
+ /* Set options for the fixed-size area holding the hash table */
+ options_copy->base.name = options->name;
+ options_copy->base.size = hash_estimate_size(options_copy->nelems,
+ options_copy->hash_info.entrysize);
+
+ ShmemRequestInternal(&options_copy->base, SHMEM_KIND_HASH);
+}
+
+void
+shmem_hash_init(void *location, ShmemStructOpts *base_options)
+{
+ ShmemHashOpts *options = (ShmemHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+ HTAB *htab;
+
+ options->hash_info.hctl = location;
+ htab = shmem_hash_create(location, options->base.size, false,
+ options->name,
+ options->nelems, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = htab;
+}
+
+void
+shmem_hash_attach(void *location, ShmemStructOpts *base_options)
+{
+ ShmemHashOpts *options = (ShmemHashOpts *) base_options;
+ int hash_flags = options->hash_flags;
+ HTAB *htab;
+
+ /* attach to it rather than allocate and initialize new space */
+ hash_flags |= HASH_ATTACH;
+ options->hash_info.hctl = location;
+ Assert(options->hash_info.hctl != NULL);
+ htab = shmem_hash_create(location, options->base.size, true,
+ options->name,
+ options->nelems, &options->hash_info, hash_flags);
+
+ if (options->ptr)
+ *options->ptr = htab;
+}
+
/*
* ShmemInitHash -- Create and initialize, or attach to, a
* shared memory hash table.
@@ -51,9 +110,8 @@ static void *ShmemHashAlloc(Size size, void *alloc_arg);
* to shared-memory hash tables are added here, except that callers may
* choose to specify HASH_PARTITION.
*
- * Note: before Postgres 9.0, this function returned NULL for some failure
- * cases. Now, it always throws error instead, so callers need not check
- * for NULL.
+ * Note: This is a legacy interface, kept for backwards compatibility with
+ * extensions. Use ShmemRequestHash() in new code!
*/
HTAB *
ShmemInitHash(const char *name, /* table string name for shmem index */
@@ -67,7 +125,14 @@ ShmemInitHash(const char *name, /* table string name for shmem index */
size = hash_estimate_size(nelems, infoP->entrysize);
- /* look it up in the shmem index or allocate */
+ /*
+ * Look it up in the shmem index or allocate.
+ *
+ * NOTE: The area is requested internally as SHMEM_KIND_STRUCT instead of
+ * SHMEM_KIND_HASH. That's correct because we do the hash table
+ * initialization by calling shmem_hash_create() ourselves. (We don't
+ * expose the request kind to users; if we did, that would be confusing.)
+ */
location = ShmemInitStruct(name, size, &found);
return shmem_hash_create(location, size, found,
@@ -77,8 +142,8 @@ ShmemInitHash(const char *name, /* table string name for shmem index */
/*
* Initialize or attach to a shared hash table in the given shmem region.
*
- * This is extracted from ShmemInitHash() to allow InitShmemAllocator() to
- * share the logic for bootstrapping the ShmemIndex hash table.
+ * This is exposed to allow InitShmemAllocator() to share the logic for
+ * bootstrapping the ShmemIndex hash table.
*/
HTAB *
shmem_hash_create(void *location, size_t size, bool found,
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 5c47cf13473..9b880a6af65 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -121,6 +121,9 @@ FastPathLockShmemSize(void)
size = add_size(size, mul_size(TotalProcs, (fpLockBitsSize + fpRelIdSize)));
+ Assert(TotalProcs > 0);
+ Assert(size > 0);
+
return size;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 10be60011ad..93851269e43 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -67,6 +67,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procsignal.h"
+#include "storage/shmem_internal.h"
#include "storage/sinval.h"
#include "storage/standby.h"
#include "tcop/backend_startup.h"
@@ -4155,7 +4156,14 @@ PostgresSingleUserMain(int argc, char *argv[],
InitializeFastPathLocks();
/*
- * Give preloaded libraries a chance to request additional shared memory.
+ * Before computing the total size needed, give all subsystems, including
+ * add-ins, a chance to adjust their requested shmem sizes.
+ */
+ ShmemCallRequestCallbacks();
+
+ /*
+ * Also call any legacy shmem request hooks that might've been installed
+ * by preloaded libraries.
*/
process_shmem_requests();
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index d49a7a92c64..cb95ad4ef2a 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -119,8 +119,8 @@
* chosen at creation based on the initial number of elements, so even though
* we support allocating more elements later, performance will suffer if the
* table grows much beyond the initial size. (Currently, shared memory hash
- * tables are only created by ShmemInitHash() though, which doesn't support
- * growing at all.)
+ * tables are only created by ShmemRequestHash()/ShmemInitHash() though, which
+ * don't support growing at all.)
*/
#define HASH_SEGSIZE 256
#define HASH_SEGSIZE_SHIFT 8 /* must be log2(HASH_SEGSIZE) */
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 81d05381f8f..8e0fc29dcac 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -28,21 +28,167 @@
#include "utils/hsearch.h"
+/*
+ * Options for ShmemRequestStruct()
+ *
+ * 'name' and 'size' are required. Initialize any optional fields that you
+ * don't use to zeros.
+ *
+ * After registration, the shmem machinery reserves memory for the area, sets
+ * '*ptr' to point to the allocation, and calls the callbacks at the right
+ * moments.
+ */
+typedef struct ShmemStructOpts
+{
+ const char *name;
-/* shmem.c */
-extern void *ShmemAlloc(Size size);
-extern void *ShmemAllocNoError(Size size);
+ /*
+ * Requested size of the shmem allocation.
+ *
+ * When attaching to an existing allocation, the size must match the size
+ * given when the shmem region was allocated. This cross-check can be
+ * disabled by specifying SHMEM_ATTACH_UNKNOWN_SIZE.
+ */
+ ssize_t size;
+
+ /*
+ * When the shmem area is initialized or attached to, a pointer to it is
+ * stored in *ptr. It usually points to a global variable, used to access
+ * the shared memory area later. *ptr is set before the init_fn or
+ * attach_fn callback is called.
+ */
+ void **ptr;
+} ShmemStructOpts;
+
+#define SHMEM_ATTACH_UNKNOWN_SIZE (-1)
+
+/*
+ * Options for ShmemRequestHash()
+ *
+ * Each hash table is backed by an allocated area, but if 'max_size' is
+ * greater than 'init_size', it can also grow beyond the initial allocated
+ * area by allocating more hash entries from the global unreserved space.
+ */
+typedef struct ShmemHashOpts
+{
+ ShmemStructOpts base;
+
+ /*
+ * Name of the shared memory area. Required. Must be unique across the
+ * system.
+ */
+ const char *name;
+
+ /*
+ * 'nelems' is the max number of elements for the hash table.
+ */
+ int64 nelems;
+
+ /*
+ * Hash table options passed to hash_create()
+ *
+ * hash_info and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values
+ * specific to shared-memory hash tables are added implicitly in
+ * ShmemRequestHash(), except that callers may choose to specify
+ * HASH_PARTITION and/or HASH_FIXED_SIZE.
+ */
+ HASHCTL hash_info;
+ int hash_flags;
+
+ /*
+ * When the hash table is initialized or attached to, a pointer to its
+ * backend-private handle is stored in *ptr. It usually points to a
+ * global variable, used to access the hash table later.
+ */
+ HTAB **ptr;
+} ShmemHashOpts;
+
+typedef void (*ShmemRequestCallback) (void *opaque_arg);
+typedef void (*ShmemInitCallback) (void *opaque_arg);
+typedef void (*ShmemAttachCallback) (void *opaque_arg);
+
+/*
+ * Shared memory is reserved and allocated in stages at postmaster startup,
+ * and in EXEC_BACKEND mode, there's some extra work done to "attach" to them
+ * at backend startup. ShmemCallbacks holds callback functions that are
+ * called at different stages.
+ */
+typedef struct ShmemCallbacks
+{
+ /* SHMEM_CALLBACKS_* flags */
+ int flags;
+
+ /*
+ * 'request_fn' is called during postmaster startup, before the shared
+ * memory has been allocated. The function should call
+ * ShmemRequestStruct() and ShmemRequestHash() to register the subsystem's
+ * shared memory needs.
+ */
+ ShmemRequestCallback request_fn;
+
+ /*
+ * Initialization callback function. This is called after the shared
+ * memory area has been allocated, usually at postmaster startup.
+ */
+ ShmemInitCallback init_fn;
+
+ /*
+ * Attachment callback function. In EXEC_BACKEND mode, this is called at
+ * startup of each backend. In !EXEC_BACKEND mode, this is only called if
+ * the shared memory area is registered after postmaster startup (see
+ * SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP).
+ */
+ ShmemAttachCallback attach_fn;
+
+ /*
+ * Argument passed to the callbacks. This is opaque to the shmem system;
+ * callbacks can use it for their own purposes.
+ */
+ void *opaque_arg;
+} ShmemCallbacks;
+
+/*
+ * Flags to control the behavior of RegisterShmemCallbacks().
+ *
+ * SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP: Normally, calling
+ * RegisterShmemCallbacks() after postmaster startup, e.g. in an add-in
+ * library loaded on-demand in a backend, results in an error, because shared
+ * memory should generally be requested at postmaster startup time. But if
+ * this flag is set, it is allowed and the callbacks are called immediately to
+ * initialize or attach to the requested shared memory areas. This is not
+ * used by any built-in subsystems, but extensions may find it useful.
+ */
+#define SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP 0x00000001
+
+extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
extern bool ShmemAddrIsValid(const void *addr);
+
+/*
+ * These macros provide syntactic sugar for calling the underlying functions
+ * with named-argument-like syntax.
+ */
+#define ShmemRequestStruct(...) \
+ ShmemRequestStructWithOpts(&(ShmemStructOpts){__VA_ARGS__})
+
+#define ShmemRequestHash(...) \
+ ShmemRequestHashWithOpts(&(ShmemHashOpts){__VA_ARGS__})
+
+extern void ShmemRequestStructWithOpts(const ShmemStructOpts *options);
+extern void ShmemRequestHashWithOpts(const ShmemHashOpts *options);
+
+/* legacy shmem allocation functions */
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
+extern HTAB *ShmemInitHash(const char *name, int64 nelems,
+ HASHCTL *infoP, int hash_flags);
+extern void *ShmemAlloc(Size size);
+extern void *ShmemAllocNoError(Size size);
+
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
-/* shmem_hash.c */
-extern HTAB *ShmemInitHash(const char *name, int64 nelems,
- HASHCTL *infoP, int hash_flags);
-
/* ipci.c */
extern void RequestAddinShmemSpace(Size size);
diff --git a/src/include/storage/shmem_internal.h b/src/include/storage/shmem_internal.h
index e0638135639..fe12bf33439 100644
--- a/src/include/storage/shmem_internal.h
+++ b/src/include/storage/shmem_internal.h
@@ -16,26 +16,37 @@
#include "storage/shmem.h"
#include "utils/hsearch.h"
+/* Different kinds of shmem areas. */
+typedef enum
+{
+ SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
+ SHMEM_KIND_HASH, /* a hash table */
+} ShmemRequestKind;
+
+/* shmem.c */
typedef struct PGShmemHeader PGShmemHeader; /* avoid including
* storage/pg_shmem.h here */
+extern void ShmemCallRequestCallbacks(void);
extern void InitShmemAllocator(PGShmemHeader *seghdr);
+#ifdef EXEC_BACKEND
+extern void AttachShmemAllocator(PGShmemHeader *seghdr);
+#endif
+extern void ResetShmemAllocator(void);
-extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
- const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
+extern void ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind);
-/* size constants for the shmem index table */
- /* max size of data structure string name */
-#define SHMEM_INDEX_KEYSIZE (48)
- /* max number of named shmem structures and hash tables */
-#define SHMEM_INDEX_SIZE (256)
+extern size_t ShmemGetRequestedSize(void);
+extern void ShmemInitRequested(void);
+#ifdef EXEC_BACKEND
+extern void ShmemAttachRequested(void);
+#endif
-/* this is a hash bucket in the shmem index table */
-typedef struct
-{
- char key[SHMEM_INDEX_KEYSIZE]; /* string name */
- void *location; /* location in shared mem */
- Size size; /* # bytes requested for the structure */
- Size allocated_size; /* # bytes actually allocated */
-} ShmemIndexEnt;
+extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
+
+/* shmem_hash.c */
+extern HTAB *shmem_hash_create(void *location, size_t size, bool found,
+ const char *name, int64 nelems, HASHCTL *infoP, int hash_flags);
+extern void shmem_hash_init(void *location, ShmemStructOpts *options);
+extern void shmem_hash_attach(void *location, ShmemStructOpts *options);
#endif /* SHMEM_INTERNAL_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 0c5493bd47f..d527db824a1 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2865,9 +2865,16 @@ SharedTypmodTableEntry
Sharedsort
ShellTypeInfo
ShippableCacheEntry
-ShmemAllocatorData
ShippableCacheKey
+ShmemAllocatorData
+ShmemCallbacks
ShmemIndexEnt
+ShmemHashDesc
+ShmemHashOpts
+ShmemRequest
+ShmemRequestKind
+ShmemStructDesc
+ShmemStructOpts
ShutdownForeignScan_function
ShutdownInformation
ShutdownMode
--
2.47.3
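
For readers following the size accounting in ShmemGetRequestedSize() in the patch above: the total is the ShmemIndex estimate rounded up to a cache line, plus each requested size with padding added after every area, mirroring what ShmemAllocRaw() does. Below is a minimal standalone sketch of that arithmetic. The names sketch_cacheline_align() and sketch_requested_size() and the 128-byte line size are assumptions made for illustration only; they are not part of the patch. The sketch also skips the overflow checking that add_size() provides in the real code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size for this sketch; not taken from the patch. */
#define SKETCH_CACHE_LINE_SIZE 128

/* Round 'len' up to the next multiple of the assumed cache line size. */
static size_t
sketch_cacheline_align(size_t len)
{
	size_t		mask = (size_t) SKETCH_CACHE_LINE_SIZE - 1;

	return (len + mask) & ~mask;
}

/*
 * Hypothetical stand-in for ShmemGetRequestedSize(): start from the
 * aligned index size, then add each requested size and re-align after
 * every addition, so each area starts on a cache line boundary.
 * Unlike the real code, this does not check for overflow.
 */
static size_t
sketch_requested_size(size_t index_size, const size_t *sizes, size_t nsizes)
{
	size_t		total = sketch_cacheline_align(index_size);

	for (size_t i = 0; i < nsizes; i++)
		total = sketch_cacheline_align(total + sizes[i]);

	return total;
}
```

With a 100-byte index and requests of 1, 128 and 129 bytes, the running total goes 128, 256, 384, 640: a 1-byte request still consumes a whole cache line, which is why the estimate has to pad after every area rather than once at the end.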
[text/x-patch] v12-0003-Add-test-module-to-test-after-startup-shmem-allo.patch (10.2K, 4-v12-0003-Add-test-module-to-test-after-startup-shmem-allo.patch)
From edd8cabea8ac8114978959fc9b2864560a63147e Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sun, 5 Apr 2026 19:50:15 +0300
Subject: [PATCH v12 03/13] Add test module to test after-startup shmem
allocations
The old ShmemInit{Struct/Hash}() functions could be used after
postmaster startup, as long as the allocation is small enough to fit in
spare shmem reserved at startup. I believe some extensions do that,
although we hadn't really documented it. However, we didn't have any
test coverage for that usage. The new test module covers that
after-startup usage with the new ShmemRequestStruct() functions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
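Note for readers (not part of the patch): the init-vs-attach behaviour that
the TAP test checks can be modelled in a few lines of plain C. This is only a
toy sketch under my own assumptions. DemoShared, DemoProcess and
demo_load_module() are hypothetical names, not functions from shmem.c, and the
"first loader initializes, later loaders attach" rule is inferred from the
checks in test_shmem.c rather than from the allocator code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for TestShmemData living in the shared segment. */
typedef struct DemoShared
{
	bool		initialized;
	int			attach_count;
} DemoShared;

/* One static instance plays the role of the shared memory area. */
static DemoShared demo_segment;

/* Per-process state: the pointer that .ptr would fill in. */
typedef struct DemoProcess
{
	DemoShared *shared;
} DemoProcess;

/*
 * Hypothetical model of loading the module in one process after startup:
 * the first process to get here runs the "init" step, every later process
 * runs the "attach" step instead.
 */
static void
demo_load_module(DemoProcess *proc)
{
	/* mirrors the attached_or_initialized check: once per process */
	assert(proc->shared == NULL);
	proc->shared = &demo_segment;

	if (!proc->shared->initialized)
	{
		/* init callback */
		proc->shared->initialized = true;
		proc->shared->attach_count = 0;
	}
	else
	{
		/* attach callback */
		proc->shared->attach_count++;
	}
}
```

With three processes loading the module in turn, the first one initializes
and the next two attach, which is why the test expects the counter to grow
between two consecutive connections.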
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_shmem/Makefile | 24 +++++
src/test/modules/test_shmem/meson.build | 33 ++++++
.../test_shmem/t/001_late_shmem_alloc.pl | 49 +++++++++
.../modules/test_shmem/test_shmem--1.0.sql | 9 ++
src/test/modules/test_shmem/test_shmem.c | 101 ++++++++++++++++++
.../modules/test_shmem/test_shmem.control | 3 +
src/tools/pgindent/typedefs.list | 1 +
9 files changed, 222 insertions(+)
create mode 100644 src/test/modules/test_shmem/Makefile
create mode 100644 src/test/modules/test_shmem/meson.build
create mode 100644 src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
create mode 100644 src/test/modules/test_shmem/test_shmem--1.0.sql
create mode 100644 src/test/modules/test_shmem/test_shmem.c
create mode 100644 src/test/modules/test_shmem/test_shmem.control
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 864b407abcf..f1b04c99969 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -48,6 +48,7 @@ SUBDIRS = \
test_resowner \
test_rls_hooks \
test_saslprep \
+ test_shmem \
test_shm_mq \
test_slru \
test_tidstore \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index e5acacd5083..fc99552d9ab 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -49,6 +49,7 @@ subdir('test_regex')
subdir('test_resowner')
subdir('test_rls_hooks')
subdir('test_saslprep')
+subdir('test_shmem')
subdir('test_shm_mq')
subdir('test_slru')
subdir('test_tidstore')
diff --git a/src/test/modules/test_shmem/Makefile b/src/test/modules/test_shmem/Makefile
new file mode 100644
index 00000000000..2407f7462fe
--- /dev/null
+++ b/src/test/modules/test_shmem/Makefile
@@ -0,0 +1,24 @@
+# src/test/modules/test_shmem/Makefile
+
+PGFILEDESC = "test_shmem - test code for shmem allocations"
+
+MODULE_big = test_shmem
+OBJS = \
+ $(WIN32RES) \
+ test_shmem.o
+
+EXTENSION = test_shmem
+DATA = test_shmem--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_shmem/meson.build b/src/test/modules/test_shmem/meson.build
new file mode 100644
index 00000000000..fb4bf328b8f
--- /dev/null
+++ b/src/test/modules/test_shmem/meson.build
@@ -0,0 +1,33 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+test_shmem_sources = files(
+ 'test_shmem.c',
+)
+
+if host_system == 'windows'
+ test_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_shmem',
+ '--FILEDESC', 'test_shmem - test code for shmem allocations',])
+endif
+
+test_shmem = shared_module('test_shmem',
+ test_shmem_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_shmem
+
+test_install_data += files(
+ 'test_shmem.control',
+ 'test_shmem--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_shmem',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_late_shmem_alloc.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
new file mode 100644
index 00000000000..c154f57682a
--- /dev/null
+++ b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
@@ -0,0 +1,49 @@
+# Copyright (c) 2025-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+###
+# Test allocating memory after startup, i.e. when the library is not
+# in shared_preload_libraries
+###
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+
+$node->safe_psql("postgres", "CREATE EXTENSION test_shmem;");
+
+# Check that the attach counter is incremented on a new connection
+my $attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+my $attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+cmp_ok($attach_count2, '>', $attach_count1, "attach callback is called in each backend");
+$node->stop;
+
+###
+# Test that loading via shared_preload_libraries also works
+###
+$node->append_conf('postgresql.conf', "shared_preload_libraries = 'test_shmem'");
+$node->start;
+
+# When loaded via shared_preload_libraries, the attach callback is
+# called or not, depending on whether this is an EXEC_BACKEND build.
+my $exec_backend = $node->safe_psql("postgres", "SHOW debug_exec_backend;") eq 'on';
+$attach_count1 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+$attach_count2 = $node->safe_psql("postgres", "SELECT get_test_shmem_attach_count();");
+
+if ($exec_backend)
+{
+ cmp_ok($attach_count2, '>', $attach_count1, "attach callback is called in each backend when loaded via shared_preload_libraries");
+}
+else
+{
+ ok($attach_count1 == 0 && $attach_count2 == 0, "attach callback is not called when loaded via shared_preload_libraries");
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_shmem/test_shmem--1.0.sql b/src/test/modules/test_shmem/test_shmem--1.0.sql
new file mode 100644
index 00000000000..2d01fd9256c
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem--1.0.sql
@@ -0,0 +1,9 @@
+/* src/test/modules/test_shmem/test_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_shmem" to load this file. \quit
+
+
+CREATE FUNCTION get_test_shmem_attach_count()
+RETURNS pg_catalog.int4 STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
new file mode 100644
index 00000000000..9bd4012b435
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -0,0 +1,101 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_shmem.c
+ * Helpers to test shmem allocation routines
+ *
+ * Test basic memory allocation in an extension module. One notable feature
+ * that is not exercised by any other module in the repository is
+ * allocating (non-DSM) shared memory after postmaster startup.
+ *
+ * Copyright (c) 2020-2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_shmem/test_shmem.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+
+
+PG_MODULE_MAGIC;
+
+typedef struct TestShmemData
+{
+ int value;
+ bool initialized;
+ int attach_count;
+} TestShmemData;
+
+static TestShmemData *TestShmem;
+
+static bool attached_or_initialized = false;
+
+static void test_shmem_request(void *arg);
+static void test_shmem_init(void *arg);
+static void test_shmem_attach(void *arg);
+
+static const ShmemCallbacks TestShmemCallbacks = {
+ .flags = SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP,
+ .request_fn = test_shmem_request,
+ .init_fn = test_shmem_init,
+ .attach_fn = test_shmem_attach,
+};
+
+static void
+test_shmem_request(void *arg)
+{
+ elog(LOG, "test_shmem_request callback called");
+
+ ShmemRequestStruct(.name = "test_shmem area",
+ .size = sizeof(TestShmemData),
+ .ptr = (void **) &TestShmem);
+}
+
+static void
+test_shmem_init(void *arg)
+{
+ elog(LOG, "init callback called");
+ if (TestShmem->initialized)
+ elog(ERROR, "shmem area already initialized");
+ TestShmem->initialized = true;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+static void
+test_shmem_attach(void *arg)
+{
+ elog(LOG, "test_shmem_attach callback called");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ TestShmem->attach_count++;
+
+ if (attached_or_initialized)
+ elog(ERROR, "attach or initialize already called in this process");
+ attached_or_initialized = true;
+}
+
+void
+_PG_init(void)
+{
+ elog(LOG, "test_shmem module's _PG_init called");
+ RegisterShmemCallbacks(&TestShmemCallbacks);
+}
+
+PG_FUNCTION_INFO_V1(get_test_shmem_attach_count);
+Datum
+get_test_shmem_attach_count(PG_FUNCTION_ARGS)
+{
+ if (!attached_or_initialized)
+ elog(ERROR, "shmem area not attached or initialized in this process");
+ if (!TestShmem->initialized)
+ elog(ERROR, "shmem area not yet initialized");
+ PG_RETURN_INT32(TestShmem->attach_count);
+}
diff --git a/src/test/modules/test_shmem/test_shmem.control b/src/test/modules/test_shmem/test_shmem.control
new file mode 100644
index 00000000000..f2f26f4537a
--- /dev/null
+++ b/src/test/modules/test_shmem/test_shmem.control
@@ -0,0 +1,3 @@
+comment = 'Test code for shmem allocations'
+default_version = '1.0'
+module_pathname = '$libdir/test_shmem'
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index d527db824a1..349cf4e9f12 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3148,6 +3148,7 @@ TestDSMRegistryHashEntry
TestDSMRegistryStruct
TestDecodingData
TestDecodingTxnData
+TestShmemData
TestSpec
TestValueType
TextFreq
--
2.47.3
[text/x-patch] v12-0004-Convert-pg_stat_statements-to-use-the-new-shmem-.patch (11.3K, 5-v12-0004-Convert-pg_stat_statements-to-use-the-new-shmem-.patch)
From 341ff483d5fa37b3a92b3b08f114e87fef19d988 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sun, 5 Apr 2026 19:52:20 +0300
Subject: [PATCH v12 04/13] Convert pg_stat_statements to use the new shmem
allocation functions
As part of this, embed the LWLock it needs in the shared memory struct
itself, so that we don't need to use RequestNamedLWLockTranche()
anymore. LWLockNewTrancheId+LWLockInitialize is more convenient to use
in extensions.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
.../pg_stat_statements/pg_stat_statements.c | 173 ++++++++----------
1 file changed, 77 insertions(+), 96 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 5494d41dca1..f078b4fe71b 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -249,7 +249,7 @@ typedef struct pgssEntry
*/
typedef struct pgssSharedState
{
- LWLock *lock; /* protects hashtable search/modification */
+ LWLockPadded lock; /* protects hashtable search/modification */
double cur_median_usage; /* current median usage in hashtable */
Size mean_query_len; /* current mean entry text length */
slock_t mutex; /* protects following fields only: */
@@ -259,14 +259,24 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+/* Links to shared memory state */
+static pgssSharedState *pgss;
+static HTAB *pgss_hash;
+
+static void pgss_shmem_request(void *arg);
+static void pgss_shmem_init(void *arg);
+
+static const ShmemCallbacks pgss_shmem_callbacks = {
+ .request_fn = pgss_shmem_request,
+ .init_fn = pgss_shmem_init,
+};
+
/*---- Local variables ----*/
/* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
static int nesting_level = 0;
/* Saved hook values */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
static planner_hook_type prev_planner_hook = NULL;
static ExecutorStart_hook_type prev_ExecutorStart = NULL;
@@ -275,10 +285,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static ProcessUtility_hook_type prev_ProcessUtility = NULL;
-/* Links to shared memory state */
-static pgssSharedState *pgss = NULL;
-static HTAB *pgss_hash = NULL;
-
/*---- GUC variables ----*/
typedef enum
@@ -331,8 +337,6 @@ PG_FUNCTION_INFO_V1(pg_stat_statements_1_13);
PG_FUNCTION_INFO_V1(pg_stat_statements);
PG_FUNCTION_INFO_V1(pg_stat_statements_info);
-static void pgss_shmem_request(void);
-static void pgss_shmem_startup(void);
static void pgss_shmem_shutdown(int code, Datum arg);
static void pgss_post_parse_analyze(ParseState *pstate, Query *query,
JumbleState *jstate);
@@ -366,7 +370,6 @@ static void pgss_store(const char *query, int64 queryId,
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
bool showtext);
-static Size pgss_memsize(void);
static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
int encoding, bool sticky);
static void entry_dealloc(void);
@@ -471,13 +474,14 @@ _PG_init(void)
MarkGUCPrefixReserved("pg_stat_statements");
+ /*
+ * Register our shared memory needs.
+ */
+ RegisterShmemCallbacks(&pgss_shmem_callbacks);
+
/*
* Install hooks.
*/
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = pgss_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = pgss_shmem_startup;
prev_post_parse_analyze_hook = post_parse_analyze_hook;
post_parse_analyze_hook = pgss_post_parse_analyze;
prev_planner_hook = planner_hook;
@@ -495,30 +499,42 @@ _PG_init(void)
}
/*
- * shmem_request hook: request additional shared resources. We'll allocate or
- * attach to the shared resources in pgss_shmem_startup().
+ * shmem request callback: Request shared memory resources.
+ *
+ * This is called at postmaster startup. Note that the shared memory isn't
+ * allocated here yet, this merely registers our needs.
+ *
+ * In EXEC_BACKEND mode, this is also called in each backend, to re-attach to
+ * the shared memory area that was already initialized.
*/
static void
-pgss_shmem_request(void)
+pgss_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- RequestAddinShmemSpace(pgss_memsize());
- RequestNamedLWLockTranche("pg_stat_statements", 1);
+ ShmemRequestHash(.name = "pg_stat_statements hash",
+ .nelems = pgss_max,
+ .hash_info.keysize = sizeof(pgssHashKey),
+ .hash_info.entrysize = sizeof(pgssEntry),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ .ptr = &pgss_hash,
+ );
+ ShmemRequestStruct(.name = "pg_stat_statements",
+ .size = sizeof(pgssSharedState),
+ .ptr = (void **) &pgss,
+ );
}
/*
- * shmem_startup hook: allocate or attach to shared memory,
- * then load any pre-existing statistics from file.
- * Also create and load the query-texts file, which is expected to exist
- * (even if empty) while the module is enabled.
+ * shmem init callback: Initialize our shared memory data structures at
+ * postmaster startup.
+ *
+ * Load any pre-existing statistics from file. Also create and load the
+ * query-texts file, which is expected to exist (even if empty) while the
+ * module is enabled.
*/
static void
-pgss_shmem_startup(void)
+pgss_shmem_init(void *arg)
{
- bool found;
- HASHCTL info;
+ int tranche_id;
FILE *file = NULL;
FILE *qfile = NULL;
uint32 header;
@@ -528,59 +544,38 @@ pgss_shmem_startup(void)
int buffer_size;
char *buffer = NULL;
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* reset in case this is a restart within the postmaster */
- pgss = NULL;
- pgss_hash = NULL;
-
/*
- * Create or attach to the shared memory state, including hash table
+ * We already checked that we're loaded from shared_preload_libraries in
+ * _PG_init(), so we should not get here after postmaster startup.
*/
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- pgss = ShmemInitStruct("pg_stat_statements",
- sizeof(pgssSharedState),
- &found);
-
- if (!found)
- {
- /* First time through ... */
- pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
- pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
- SpinLockInit(&pgss->mutex);
- pgss->extent = 0;
- pgss->n_writers = 0;
- pgss->gc_count = 0;
- pgss->stats.dealloc = 0;
- pgss->stats.stats_reset = GetCurrentTimestamp();
- }
-
- info.keysize = sizeof(pgssHashKey);
- info.entrysize = sizeof(pgssEntry);
- pgss_hash = ShmemInitHash("pg_stat_statements hash",
- pgss_max,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- LWLockRelease(AddinShmemInitLock);
+ Assert(!IsUnderPostmaster);
/*
- * If we're in the postmaster (or a standalone backend...), set up a shmem
- * exit hook to dump the statistics to disk.
+ * Initialize the shmem area with no statistics.
*/
- if (!IsUnderPostmaster)
- on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
+ tranche_id = LWLockNewTrancheId("pg_stat_statements");
+ LWLockInitialize(&pgss->lock.lock, tranche_id);
+ pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
+ pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ SpinLockInit(&pgss->mutex);
+ pgss->extent = 0;
+ pgss->n_writers = 0;
+ pgss->gc_count = 0;
+ pgss->stats.dealloc = 0;
+ pgss->stats.stats_reset = GetCurrentTimestamp();
+
+ /* The hash table must've also been initialized by now */
+ Assert(pgss_hash != NULL);
/*
- * Done if some other process already completed our initialization.
+ * Set up a shmem exit hook to dump the statistics to disk on postmaster
+ * (or standalone backend) exit.
*/
- if (found)
- return;
+ on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
/*
+ * Load any pre-existing statistics from file.
+ *
* Note: we don't bother with locks here, because there should be no other
* processes running when this code is reached.
*/
@@ -1339,7 +1334,7 @@ pgss_store(const char *query, int64 queryId,
key.toplevel = (nesting_level == 0);
/* Lookup the hash table entry with shared lock. */
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
@@ -1360,11 +1355,11 @@ pgss_store(const char *query, int64 queryId,
*/
if (jstate)
{
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
norm_query = generate_normalized_query(jstate, query,
query_location,
&query_len);
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
}
/* Append new query text to file with only shared lock held */
@@ -1379,8 +1374,8 @@ pgss_store(const char *query, int64 queryId,
do_gc = need_gc_qtexts();
/* Need exclusive lock to make a new hashtable entry - promote */
- LWLockRelease(pgss->lock);
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockRelease(&pgss->lock.lock);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
/*
* A garbage collection may have occurred while we weren't holding the
@@ -1519,7 +1514,7 @@ pgss_store(const char *query, int64 queryId,
}
done:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
/* We postpone this clean-up until we're out of the lock */
if (norm_query)
@@ -1808,7 +1803,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
* we need to partition the hash table to limit the time spent holding any
* one lock.
*/
- LWLockAcquire(pgss->lock, LW_SHARED);
+ LWLockAcquire(&pgss->lock.lock, LW_SHARED);
if (showtext)
{
@@ -2046,7 +2041,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
}
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
if (qbuffer)
pfree(qbuffer);
@@ -2086,20 +2081,6 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
-/*
- * Estimate shared memory space needed.
- */
-static Size
-pgss_memsize(void)
-{
- Size size;
-
- size = MAXALIGN(sizeof(pgssSharedState));
- size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
-
- return size;
-}
-
/*
* Allocate a new hashtable entry.
* caller must hold an exclusive lock on pgss->lock
@@ -2730,7 +2711,7 @@ entry_reset(Oid userid, Oid dbid, int64 queryid, bool minmax_only)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("pg_stat_statements must be loaded via \"shared_preload_libraries\"")));
- LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
+ LWLockAcquire(&pgss->lock.lock, LW_EXCLUSIVE);
num_entries = hash_get_num_entries(pgss_hash);
stats_reset = GetCurrentTimestamp();
@@ -2824,7 +2805,7 @@ done:
record_gc_qtexts();
release_lock:
- LWLockRelease(pgss->lock);
+ LWLockRelease(&pgss->lock.lock);
return stats_reset;
}
--
2.47.3
[text/x-patch] v12-0005-Introduce-registry-of-built-in-subsystems.patch (8.3K, 6-v12-0005-Introduce-registry-of-built-in-subsystems.patch)
From 32531e697b1c653b9af9a4b98b0923894318ba0c Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sun, 5 Apr 2026 19:52:28 +0300
Subject: [PATCH v12 05/13] Introduce registry of built-in subsystems
To add a new built-in subsystem, add it to subsystemlist.h. That
hooks up its shmem callbacks so that they get called at the right
times during postmaster startup. For now this is unused, but will
replace the current SubsystemShmemSize() and SubsystemShmemInit()
calls in the next commits.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
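Note for readers (not part of the patch): the list header is used in the
usual X-macro way, expanded once to produce extern declarations and once to
produce registration calls. Below is a minimal single-file sketch of that
pattern. All Demo*/demo_* names are hypothetical stand-ins, and I use a list
macro instead of a re-included header only to keep the sketch in one file.

```c
#include <assert.h>

/* Hypothetical stand-in for ShmemCallbacks: a name and a request hook. */
typedef struct DemoCallbacks
{
	const char *name;
	void		(*request_fn) (void);
} DemoCallbacks;

/* Stand-in for subsystemlist.h: one DEMO_SUBSYSTEM() entry per subsystem. */
#define DEMO_SUBSYSTEM_LIST \
	DEMO_SUBSYSTEM(DemoLockCallbacks) \
	DEMO_SUBSYSTEM(DemoBufferCallbacks)

/* Expansion 1: extern declarations, as subsystems.h does. */
#define DEMO_SUBSYSTEM(callbacks) extern const DemoCallbacks callbacks;
DEMO_SUBSYSTEM_LIST
#undef DEMO_SUBSYSTEM

static int	demo_requests_seen = 0;

static void
demo_lock_request(void)
{
	demo_requests_seen++;
}

static void
demo_buffer_request(void)
{
	demo_requests_seen++;
}

/* Each subsystem defines its own callbacks struct somewhere. */
const DemoCallbacks DemoLockCallbacks = {"lock", demo_lock_request};
const DemoCallbacks DemoBufferCallbacks = {"buffer", demo_buffer_request};

#define DEMO_MAX_REGISTERED 8
static const DemoCallbacks *demo_registered[DEMO_MAX_REGISTERED];
static int	demo_num_registered = 0;

static void
demo_register(const DemoCallbacks *callbacks)
{
	assert(demo_num_registered < DEMO_MAX_REGISTERED);
	demo_registered[demo_num_registered++] = callbacks;
}

/* Expansion 2: one registration call per entry, in list order. */
static void
demo_register_builtin(void)
{
#define DEMO_SUBSYSTEM(callbacks) demo_register(&(callbacks));
	DEMO_SUBSYSTEM_LIST
#undef DEMO_SUBSYSTEM
}

/* Run every registered request callback; returns how many ran so far. */
static int
demo_call_request_callbacks(void)
{
	for (int i = 0; i < demo_num_registered; i++)
		demo_registered[i]->request_fn();
	return demo_requests_seen;
}
```

Since the callbacks are registered in list order, the order of entries in the
list decides the order in which the subsystems are asked for their needs.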
src/backend/bootstrap/bootstrap.c | 2 ++
src/backend/postmaster/launch_backend.c | 2 ++
src/backend/postmaster/postmaster.c | 5 +++++
src/backend/storage/ipc/ipci.c | 21 +++++++++++++++++
src/backend/storage/ipc/shmem.c | 6 +++--
src/backend/tcop/postgres.c | 3 +++
src/include/storage/ipc.h | 1 +
src/include/storage/subsystemlist.h | 23 +++++++++++++++++++
src/include/storage/subsystems.h | 30 +++++++++++++++++++++++++
src/tools/pginclude/headerscheck | 1 +
10 files changed, 92 insertions(+), 2 deletions(-)
create mode 100644 src/include/storage/subsystemlist.h
create mode 100644 src/include/storage/subsystems.h
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 3766d8231ac..63378ab3d8c 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -363,6 +363,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
SetProcessingMode(BootstrapProcessing);
IgnoreSystemIndexes = true;
+ RegisterBuiltinShmemCallbacks();
+
InitializeMaxBackends();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 0973010b7dc..ed0f4f2d234 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -664,6 +664,8 @@ SubPostmasterMain(int argc, char *argv[])
*/
LocalProcessControlFile(false);
+ RegisterBuiltinShmemCallbacks();
+
/*
* Reload any libraries that were preloaded by the postmaster. Since we
* exec'd this process, those libraries didn't come along with us; but we
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 7a8ee19bdaf..a2de96a9a8e 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -922,6 +922,11 @@ PostmasterMain(int argc, char *argv[])
*/
ApplyLauncherRegister();
+ /*
+ * Register the shared memory needs of all core subsystems.
+ */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 24422a80ab3..e4a6a52f12d 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -52,6 +52,7 @@
#include "storage/procsignal.h"
#include "storage/shmem_internal.h"
#include "storage/sinvaladt.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
#include "utils/wait_event.h"
@@ -252,6 +253,26 @@ CreateSharedMemoryAndSemaphores(void)
shmem_startup_hook();
}
+/*
+ * Early initialization of various subsystems, giving them a chance to
+ * register their shared memory needs before the shared memory segment is
+ * allocated.
+ */
+void
+RegisterBuiltinShmemCallbacks(void)
+{
+ /*
+ * Call RegisterShmemCallbacks(...) on each subsystem listed in
+ * subsystemlist.h
+ */
+#define PG_SHMEM_SUBSYSTEM(subsystem_callbacks) \
+ RegisterShmemCallbacks(&(subsystem_callbacks));
+
+#include "storage/subsystemlist.h"
+
+#undef PG_SHMEM_SUBSYSTEM
+}
+
/*
* Initialize various subsystems, setting up their data structures in
* shared memory.
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 5c7caf59360..601c618d86c 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -78,8 +78,10 @@
* });
* }
*
- * Register the callbacks by calling RegisterShmemCallbacks(&MyShmemCallbacks)
- * in the extension's _PG_init() function.
+ * In builtin PostgreSQL code, add the callbacks to the list in
+ * src/include/storage/subsystemlist.h. In an add-in module, you can register
+ * the callbacks by calling RegisterShmemCallbacks(&MyShmemCallbacks) in the
+ * extension's _PG_init() function.
*
* Lifecycle
* ---------
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 93851269e43..6a9ff3ad225 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4138,6 +4138,9 @@ PostgresSingleUserMain(int argc, char *argv[],
/* read control file (error checking and contains config ) */
LocalProcessControlFile(false);
+ /* Register the shared memory needs of all core subsystems. */
+ RegisterBuiltinShmemCallbacks();
+
/*
* process any libraries that should be preloaded at postmaster start
*/
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index da32787ab51..b205b00e7a1 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
/* ipci.c */
extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
+extern void RegisterBuiltinShmemCallbacks(void);
extern Size CalculateShmemSize(void);
extern void CreateSharedMemoryAndSemaphores(void);
#ifdef EXEC_BACKEND
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
new file mode 100644
index 00000000000..ed43c90bcc3
--- /dev/null
+++ b/src/include/storage/subsystemlist.h
@@ -0,0 +1,23 @@
+/*---------------------------------------------------------------------------
+ * subsystemlist.h
+ *
+ * List of initialization callbacks of built-in subsystems. This is kept in
+ * its own source file for possible use by automatic tools.
+ * PG_SHMEM_SUBSYSTEM is defined in the callers depending on how the list is
+ * used.
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystemlist.h
+ *---------------------------------------------------------------------------
+ */
+
+/* there is deliberately not an #ifndef SUBSYSTEMLIST_H here */
+
+/*
+ * Note: there are some inter-dependencies between these, so the order of some
+ * of these matters.
+ */
+
+/* TODO: empty for now */
diff --git a/src/include/storage/subsystems.h b/src/include/storage/subsystems.h
new file mode 100644
index 00000000000..38b735bec67
--- /dev/null
+++ b/src/include/storage/subsystems.h
@@ -0,0 +1,30 @@
+/*-------------------------------------------------------------------------
+ *
+ * subsystems.h
+ * Provide extern declarations for all the built-in subsystem callbacks
+ *
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/storage/subsystems.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SUBSYSTEMS_H
+#define SUBSYSTEMS_H
+
+#include "storage/shmem.h"
+
+/*
+ * Extern declarations of all the built-in subsystem callbacks
+ *
+ * The actual list is in subsystemlist.h, so that the same list can be used
+ * for other purposes.
+ */
+#define PG_SHMEM_SUBSYSTEM(callbacks) \
+ extern const ShmemCallbacks callbacks;
+#include "storage/subsystemlist.h"
+#undef PG_SHMEM_SUBSYSTEM
+
+#endif /* SUBSYSTEMS_H */
diff --git a/src/tools/pginclude/headerscheck b/src/tools/pginclude/headerscheck
index 14c466cc237..24f7416185e 100755
--- a/src/tools/pginclude/headerscheck
+++ b/src/tools/pginclude/headerscheck
@@ -131,6 +131,7 @@ do
test "$f" = src/include/postmaster/proctypelist.h && continue
test "$f" = src/include/regex/regerrs.h && continue
test "$f" = src/include/storage/lwlocklist.h && continue
+ test "$f" = src/include/storage/subsystemlist.h && continue
test "$f" = src/include/tcop/cmdtaglist.h && continue
test "$f" = src/interfaces/ecpg/preproc/c_kwlist.h && continue
test "$f" = src/interfaces/ecpg/preproc/ecpg_kwlist.h && continue
--
2.47.3
[text/x-patch] v12-0006-Convert-lwlock.c-to-use-the-new-shmem-allocation.patch (10.2K, 7-v12-0006-Convert-lwlock.c-to-use-the-new-shmem-allocation.patch)
From 239f445b05e1a014e533f51da571d6b2cbdc3279 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sun, 5 Apr 2026 19:52:48 +0300
Subject: [PATCH v12 06/13] Convert lwlock.c to use the new shmem allocation
functions
It seems like a good candidate to convert first because it needs to
be initialized before any other subsystem, but other than that it's
nothing special.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
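Note for readers (not part of the patch): the reason for moving
process_shmem_requests() ahead of ShmemCallRequestCallbacks() can be shown
with a toy model. The sketch below assumes a fixed number of built-in locks
plus whatever legacy hooks have asked for. The demo_* names and the constant
are hypothetical, and the real sizing in lwlock.c is more involved.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NUM_FIXED_LOCKS 4	/* hypothetical count of built-in locks */

static int	demo_named_locks = 0;	/* locks asked for by legacy hooks */

/* Stand-in for a legacy hook calling RequestNamedLWLockTranche(name, 1). */
static void
demo_legacy_request_hook(void)
{
	demo_named_locks += 1;
}

/* Stand-in for the lwlock request callback: sizes the array when called. */
static size_t
demo_lwlock_request(void)
{
	assert(demo_named_locks >= 0);
	return (size_t) (DEMO_NUM_FIXED_LOCKS + demo_named_locks);
}

/* Order used by the patch: legacy hooks first, then request callbacks. */
static size_t
demo_hooks_then_callbacks(void)
{
	demo_named_locks = 0;
	demo_legacy_request_hook();
	return demo_lwlock_request();
}

/* Reversed order: the array is sized before the hook has asked for a lock. */
static size_t
demo_callbacks_then_hooks(void)
{
	size_t		nlocks;

	demo_named_locks = 0;
	nlocks = demo_lwlock_request();
	demo_legacy_request_hook();
	return nlocks;
}
```

In the reversed order the array comes out one slot short, which is the
problem the reordering in PostmasterMain() avoids.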
src/backend/postmaster/postmaster.c | 16 +++--
src/backend/storage/ipc/ipci.c | 13 ----
src/backend/storage/lmgr/lwlock.c | 100 +++++++++++++---------------
src/backend/tcop/postgres.c | 16 +++--
src/include/storage/lwlock.h | 2 -
src/include/storage/subsystemlist.h | 9 ++-
6 files changed, 75 insertions(+), 81 deletions(-)
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index a2de96a9a8e..6f13e8f40a0 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -956,18 +956,22 @@ PostmasterMain(int argc, char *argv[])
*/
InitializeFastPathLocks();
- /*
- * Ask all subsystems, including preloaded libraries, to register their
- * shared memory needs.
- */
- ShmemCallRequestCallbacks();
-
/*
* Also call any legacy shmem request hooks that might've been installed
* by preloaded libraries.
+ *
+ * Note: this must be done before ShmemCallRequestCallbacks(), because the
+ * hooks may request LWLocks with RequestNamedLWLockTranche(), which in
+ * turn affects the size of the LWLock array calculated in lwlock.c.
*/
process_shmem_requests();
+ /*
+ * Ask all subsystems, including preloaded libraries, to register their
+ * shared memory needs.
+ */
+ ShmemCallRequestCallbacks();
+
/*
* Now that loadable modules have had their chance to request additional
* shared memory, determine the value of any runtime-computed GUCs that
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index e4a6a52f12d..de65a9ef33c 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -121,7 +121,6 @@ CalculateShmemSize(void)
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
- size = add_size(size, LWLockShmemSize());
size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
size = add_size(size, SharedInvalShmemSize());
@@ -179,11 +178,6 @@ AttachSharedMemoryStructs(void)
*/
InitializeFastPathLocks();
- /*
- * Attach to LWLocks first. They are needed by most other subsystems.
- */
- LWLockShmemInit();
-
/* Establish pointers to all shared memory areas in this backend */
ShmemAttachRequested();
CreateOrAttachShmemStructs();
@@ -230,13 +224,6 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
- /*
- * Initialize LWLocks first, in case any of the shmem init function use
- * LWLocks. (Nothing else can be running during startup, so they don't
- * need to do any locking yet, but we nevertheless allow it.)
- */
- LWLockShmemInit();
-
/* Initialize all shmem areas */
ShmemInitRequested();
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 98138cb09d1..b1ad396ba79 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -84,6 +84,7 @@
#include "storage/proclist.h"
#include "storage/procnumber.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -189,9 +190,6 @@ typedef struct LWLockTrancheShmemData
int num_user_defined; /* 'user_defined' entries in use */
slock_t lock; /* protects the above */
-
- /* Size of MainLWLockArray */
- int num_main_array_locks;
} LWLockTrancheShmemData;
static LWLockTrancheShmemData *LWLockTranches;
@@ -212,7 +210,18 @@ typedef struct NamedLWLockTrancheRequest
static List *NamedLWLockTrancheRequests = NIL;
-static void InitializeLWLocks(int numLocks);
+/* Size of MainLWLockArray. Only valid in postmaster. */
+static int num_main_array_locks;
+
+static void LWLockShmemRequest(void *arg);
+static void LWLockShmemInit(void *arg);
+
+const ShmemCallbacks LWLockCallbacks = {
+ .request_fn = LWLockShmemRequest,
+ .init_fn = LWLockShmemInit,
+};
+
+
static inline void LWLockReportWaitStart(LWLock *lock);
static inline void LWLockReportWaitEnd(void);
static const char *GetLWTrancheName(uint16 trancheId);
@@ -401,68 +410,53 @@ NumLWLocksForNamedTranches(void)
}
/*
- * Compute shmem space needed for user-defined tranches and the main LWLock
- * array.
+ * Request shmem space for user-defined tranches and the main LWLock array.
*/
-Size
-LWLockShmemSize(void)
+static void
+LWLockShmemRequest(void *arg)
{
- Size size;
- int numLocks;
+ size_t size;
/* Space for user-defined tranches */
- size = sizeof(LWLockTrancheShmemData);
+ ShmemRequestStruct(.name = "LWLock tranches",
+ .size = sizeof(LWLockTrancheShmemData),
+ .ptr = (void **) &LWLockTranches,
+ );
/* Space for the LWLock array */
- numLocks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
- size = add_size(size, mul_size(numLocks, sizeof(LWLockPadded)));
-
- return size;
-}
-
-/*
- * Allocate shmem space for user-defined tranches and the main LWLock array,
- * and initialize it.
- */
-void
-LWLockShmemInit(void)
-{
- int numLocks;
- bool found;
-
- LWLockTranches = (LWLockTrancheShmemData *)
- ShmemInitStruct("LWLock tranches", sizeof(LWLockTrancheShmemData), &found);
- if (!found)
+ if (!IsUnderPostmaster)
{
- /* Calculate total number of locks needed in the main array */
- LWLockTranches->num_main_array_locks =
- NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
-
- /* Initialize the dynamic-allocation counter for tranches */
- LWLockTranches->num_user_defined = 0;
-
- SpinLockInit(&LWLockTranches->lock);
+ num_main_array_locks = NUM_FIXED_LWLOCKS + NumLWLocksForNamedTranches();
+ size = num_main_array_locks * sizeof(LWLockPadded);
}
+ else
+ size = SHMEM_ATTACH_UNKNOWN_SIZE;
- /* Allocate and initialize the main array */
- numLocks = LWLockTranches->num_main_array_locks;
- MainLWLockArray = (LWLockPadded *)
- ShmemInitStruct("Main LWLock array", numLocks * sizeof(LWLockPadded), &found);
- if (!found)
- {
- /* Initialize all LWLocks */
- InitializeLWLocks(numLocks);
- }
+ ShmemRequestStruct(.name = "Main LWLock array",
+ .size = size,
+ .ptr = (void **) &MainLWLockArray,
+ );
}
/*
- * Initialize LWLocks for built-in tranches and those requested with
- * RequestNamedLWLockTranche().
+ * Initialize shmem space for user-defined tranches and the main LWLock array.
*/
static void
-InitializeLWLocks(int numLocks)
+LWLockShmemInit(void *arg)
{
- int pos = 0;
+ int pos;
+
+ /* Initialize the dynamic-allocation counter for tranches */
+ LWLockTranches->num_user_defined = 0;
+
+ SpinLockInit(&LWLockTranches->lock);
+
+ /*
+ * Initialize all LWLocks in the main array. This includes all
+ * LWLocks for built-in tranches and those requested with
+ * RequestNamedLWLockTranche().
+ */
+ pos = 0;
/* Initialize all individual LWLocks in main array */
for (int id = 0; id < NUM_INDIVIDUAL_LWLOCKS; id++)
@@ -501,8 +495,8 @@ InitializeLWLocks(int numLocks)
LWLockInitialize(&MainLWLockArray[pos++].lock, LWTRANCHE_FIRST_USER_DEFINED + idx);
}
- /* Cross-check that we agree on the total size with the caller */
- Assert(pos == numLocks);
+ /* Cross-check that we agree on the total size with LWLockShmemRequest() */
+ Assert(pos == num_main_array_locks);
}
/*
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 6a9ff3ad225..95496654714 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4158,18 +4158,22 @@ PostgresSingleUserMain(int argc, char *argv[],
/* Initialize size of fast-path lock cache. */
InitializeFastPathLocks();
- /*
- * Before computing the total size needed, give all subsystems, including
- * add-ins, a chance to chance to adjust their requested shmem sizes.
- */
- ShmemCallRequestCallbacks();
-
/*
* Also call any legacy shmem request hooks that might'be been installed
* by preloaded libraries.
+ *
+ * Note: this must be done before ShmemCallRequestCallbacks(), because the
+ * hooks may request LWLocks with RequestNamedLWLockTranche(), which in
+ * turn affects the size of the LWLock array calculated in lwlock.c.
*/
process_shmem_requests();
+ /*
+ * Before computing the total size needed, give all subsystems, including
+ * add-ins, a chance to adjust their requested shmem sizes.
+ */
+ ShmemCallRequestCallbacks();
+
/*
* Now that loadable modules have had their chance to request additional
* shared memory, determine the value of any runtime-computed GUCs that
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index 61f0dbe749a..efa5b427e9f 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -126,8 +126,6 @@ extern bool LWLockHeldByMeInMode(LWLock *lock, LWLockMode mode);
extern bool LWLockWaitForVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 oldval, uint64 *newval);
extern void LWLockUpdateVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val);
-extern Size LWLockShmemSize(void);
-extern void LWLockShmemInit(void);
extern void InitLWLockAccess(void);
extern const char *GetLWLockIdentifier(uint32 classId, uint16 eventId);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index ed43c90bcc3..f0cf01f5a85 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -20,4 +20,11 @@
* of these matter.
*/
-/* TODO: empty for now */
+/*
+ * LWLocks first, in case any of the other shmem init functions use LWLocks.
+ * (Nothing else can be running during startup, so they don't need to do any
+ * locking yet, but we nevertheless allow it.)
+ */
+PG_SHMEM_SUBSYSTEM(LWLockCallbacks)
+
+/* TODO: nothing else for now */
--
2.47.3
[text/x-patch] v12-0007-Use-the-new-shmem-allocation-function-in-a-few-c.patch (46.3K, 8-v12-0007-Use-the-new-shmem-allocation-function-in-a-few-c.patch)
From 5578f2a37b3fe399bc60488c1df9f6264626abe8 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sun, 5 Apr 2026 19:53:11 +0300
Subject: [PATCH v12 07/13] Use the new shmem allocation function in a few core
subsystems
These subsystems have some complicating properties, making them
slightly harder to convert than most:
- The initialization callbacks of some of these subsystems have
dependencies, i.e. they need to be initialized in the right order.
- The ProcGlobal pointer still needs to be inherited by the
BackendParameters mechanism on EXEC_BACKEND builds, because
ProcGlobal is required by InitProcess() to get a PGPROC entry, and
the PGPROC entry is required to use LWLocks, and usually attaching
to shared memory areas requires the use of LWLocks.
- Similarly, the ProcSignal pointer still needs to be handled by
BackendParameters, because query cancellation connections access it
without calling InitProcess().
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
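Note on the first bullet of the commit message (init-order dependencies): the following is a small sketch, under assumed names, of why callback order matters and what a separate attach callback is for. The demo_* identifiers are hypothetical, and static variables stand in for shared memory. It assumes callbacks run in list order, as the comment in subsystemlist.h suggests. In the patch, ProcArrayShmemInit() and ProcArrayShmemAttach() both cache ProcGlobal->allProcs in a process-local variable; the sketch mirrors that shape only.

```c
#include <assert.h>
#include <stddef.h>

/* All demo_* names are hypothetical stand-ins, not the PostgreSQL API. */

typedef struct DemoCallbacks
{
    void        (*init_fn) (void);   /* runs once, when shared memory is created */
    void        (*attach_fn) (void); /* runs in each process that attaches later */
} DemoCallbacks;

/* Simulated shared memory: survives across simulated processes. */
static int  demo_shared_procs[4];
static struct { int *all_procs; int init_count; } demo_proc_global;
static struct { int num_procs; int init_count; } demo_proc_array;

/* Process-local cache of a pointer stored in shared memory. */
static int *demo_local_all_procs;

static void
demo_proc_global_init(void)
{
    demo_proc_global.all_procs = demo_shared_procs;
    demo_proc_global.init_count++;
}

static void
demo_proc_array_init(void)
{
    /* Depends on demo_proc_global_init() having run first. */
    assert(demo_proc_global.all_procs != NULL);
    demo_proc_array.num_procs = 0;
    demo_proc_array.init_count++;
    demo_local_all_procs = demo_proc_global.all_procs;
}

static void
demo_proc_array_attach(void)
{
    /* Shared state is already set up; only refresh the local cache. */
    demo_local_all_procs = demo_proc_global.all_procs;
}

static const DemoCallbacks demo_proc_global_callbacks = {
    .init_fn = demo_proc_global_init,
};

static const DemoCallbacks demo_proc_array_callbacks = {
    .init_fn = demo_proc_array_init,
    .attach_fn = demo_proc_array_attach,
};

/* Order matters: the first entry must be initialized before the second. */
static const DemoCallbacks *const demo_subsystems[] = {
    &demo_proc_global_callbacks,
    &demo_proc_array_callbacks,
};

#define DEMO_NUM_SUBSYSTEMS (sizeof(demo_subsystems) / sizeof(demo_subsystems[0]))

/* Creation path: run every init callback in list order. */
void
demo_create(void)
{
    for (size_t i = 0; i < DEMO_NUM_SUBSYSTEMS; i++)
        if (demo_subsystems[i]->init_fn != NULL)
            demo_subsystems[i]->init_fn();
}

/* Attach path: simulate a fresh process whose local pointers start out unset. */
void
demo_attach(void)
{
    demo_local_all_procs = NULL;
    for (size_t i = 0; i < DEMO_NUM_SUBSYSTEMS; i++)
        if (demo_subsystems[i]->attach_fn != NULL)
            demo_subsystems[i]->attach_fn();
}

int
demo_local_pointer_ok(void)
{
    return demo_local_all_procs == demo_shared_procs;
}

int
demo_total_init_calls(void)
{
    return demo_proc_global.init_count + demo_proc_array.init_count;
}
```

Swapping the two entries in demo_subsystems[] trips the assertion in demo_proc_array_init(), which is the kind of ordering constraint the commit message refers to. The attach path touches no shared state, so the init counters stay at one each however many processes attach.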
src/backend/access/transam/twophase.c | 2 +-
src/backend/access/transam/varsup.c | 35 ++---
src/backend/port/posix_sema.c | 22 ++-
src/backend/port/sysv_sema.c | 21 ++-
src/backend/port/win32_sema.c | 11 +-
src/backend/storage/ipc/dsm.c | 64 +++++----
src/backend/storage/ipc/dsm_registry.c | 36 ++---
src/backend/storage/ipc/ipci.c | 28 ----
src/backend/storage/ipc/latch.c | 8 +-
src/backend/storage/ipc/pmsignal.c | 51 ++++---
src/backend/storage/ipc/procarray.c | 110 +++++++-------
src/backend/storage/ipc/procsignal.c | 64 ++++-----
src/backend/storage/ipc/sinvaladt.c | 38 ++---
src/backend/storage/lmgr/proc.c | 191 +++++++++++++------------
src/backend/utils/hash/dynahash.c | 3 +-
src/include/access/transam.h | 2 -
src/include/storage/dsm.h | 3 -
src/include/storage/dsm_registry.h | 2 -
src/include/storage/pg_sema.h | 6 +-
src/include/storage/pmsignal.h | 2 -
src/include/storage/proc.h | 2 -
src/include/storage/procarray.h | 2 -
src/include/storage/procsignal.h | 3 -
src/include/storage/sinvaladt.h | 2 -
src/include/storage/subsystemlist.h | 17 ++-
25 files changed, 344 insertions(+), 381 deletions(-)
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index d468c9774b3..ab1cbd67bac 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -282,7 +282,7 @@ TwoPhaseShmemInit(void)
gxacts[i].next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = &gxacts[i];
- /* associate it with a PGPROC assigned by InitProcGlobal */
+ /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
}
}
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 1441a051773..dc5e32d86f3 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -23,6 +23,7 @@
#include "postmaster/autovacuum.h"
#include "storage/pmsignal.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
@@ -30,35 +31,25 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
+static void VarsupShmemRequest(void *arg);
+
/* pointer to variables struct in shared memory */
TransamVariablesData *TransamVariables = NULL;
+const ShmemCallbacks VarsupShmemCallbacks = {
+ .request_fn = VarsupShmemRequest,
+};
/*
- * Initialization of shared memory for TransamVariables.
+ * Request shared memory for TransamVariables.
*/
-Size
-VarsupShmemSize(void)
-{
- return sizeof(TransamVariablesData);
-}
-
-void
-VarsupShmemInit(void)
+static void
+VarsupShmemRequest(void *arg)
{
- bool found;
-
- /* Initialize our shared state struct */
- TransamVariables = ShmemInitStruct("TransamVariables",
- sizeof(TransamVariablesData),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- memset(TransamVariables, 0, sizeof(TransamVariablesData));
- }
- else
- Assert(found);
+ ShmemRequestStruct(.name = "TransamVariables",
+ .size = sizeof(TransamVariablesData),
+ .ptr = (void **) &TransamVariables,
+ );
}
/*
diff --git a/src/backend/port/posix_sema.c b/src/backend/port/posix_sema.c
index 40205b7d400..53e4a7a5c38 100644
--- a/src/backend/port/posix_sema.c
+++ b/src/backend/port/posix_sema.c
@@ -159,22 +159,24 @@ PosixSemaphoreKill(sem_t *sem)
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
#ifdef USE_NAMED_POSIX_SEMAPHORES
/* No shared memory needed in this case */
- return 0;
#else
/* Need a PGSemaphoreData per semaphore */
- return mul_size(maxSemas, sizeof(PGSemaphoreData));
+ ShmemRequestStruct(.name = "Semaphores",
+ .size = mul_size(maxSemas, sizeof(PGSemaphoreData)),
+ .ptr = (void **) &sharedSemas,
+ );
#endif
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* This is called during postmaster start or shared memory reinitialization.
* It should do whatever is needed to be able to support up to maxSemas
@@ -193,10 +195,9 @@ PGSemaphoreShmemSize(int maxSemas)
* we don't have to expose the counters to other processes.)
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
struct stat statbuf;
- bool found;
/*
* We use the data directory's inode number to seed the search for free
@@ -214,11 +215,6 @@ PGReserveSemaphores(int maxSemas)
mySemPointers = (sem_t **) malloc(maxSemas * sizeof(sem_t *));
if (mySemPointers == NULL)
elog(PANIC, "out of memory");
-#else
-
- sharedSemas = (PGSemaphore)
- ShmemInitStruct("Semaphores", PGSemaphoreShmemSize(maxSemas), &found);
- Assert(!found);
#endif
numSems = 0;
diff --git a/src/backend/port/sysv_sema.c b/src/backend/port/sysv_sema.c
index 4b2bf84072f..98d99515043 100644
--- a/src/backend/port/sysv_sema.c
+++ b/src/backend/port/sysv_sema.c
@@ -301,16 +301,20 @@ IpcSemaphoreCreate(int numSems)
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
- return mul_size(maxSemas, sizeof(PGSemaphoreData));
+ /* Need a PGSemaphoreData per semaphore */
+ ShmemRequestStruct(.name = "Semaphores",
+ .size = mul_size(maxSemas, sizeof(PGSemaphoreData)),
+ .ptr = (void **) &sharedSemas,
+ );
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* This is called during postmaster start or shared memory reinitialization.
* It should do whatever is needed to be able to support up to maxSemas
@@ -327,10 +331,9 @@ PGSemaphoreShmemSize(int maxSemas)
* have clobbered.)
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
struct stat statbuf;
- bool found;
/*
* We use the data directory's inode number to seed the search for free
@@ -344,10 +347,6 @@ PGReserveSemaphores(int maxSemas)
errmsg("could not stat data directory \"%s\": %m",
DataDir)));
- sharedSemas = (PGSemaphore)
- ShmemInitStruct("Semaphores", PGSemaphoreShmemSize(maxSemas), &found);
- Assert(!found);
-
numSharedSemas = 0;
maxSharedSemas = maxSemas;
diff --git a/src/backend/port/win32_sema.c b/src/backend/port/win32_sema.c
index ba97c9b2d64..a3202554769 100644
--- a/src/backend/port/win32_sema.c
+++ b/src/backend/port/win32_sema.c
@@ -25,17 +25,16 @@ static void ReleaseSemaphores(int code, Datum arg);
/*
- * Report amount of shared memory needed for semaphores
+ * Request shared memory needed for semaphores
*/
-Size
-PGSemaphoreShmemSize(int maxSemas)
+void
+PGSemaphoreShmemRequest(int maxSemas)
{
/* No shared memory needed on Windows */
- return 0;
}
/*
- * PGReserveSemaphores --- initialize semaphore support
+ * PGSemaphoreInit --- initialize semaphore support
*
* In the Win32 implementation, we acquire semaphores on-demand; the
* maxSemas parameter is just used to size the array that keeps track of
@@ -44,7 +43,7 @@ PGSemaphoreShmemSize(int maxSemas)
* process exits.
*/
void
-PGReserveSemaphores(int maxSemas)
+PGSemaphoreInit(int maxSemas)
{
mySemSet = (HANDLE *) malloc(maxSemas * sizeof(HANDLE));
if (mySemSet == NULL)
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 6a5b16392f7..8b69df4ff26 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -43,6 +43,7 @@
#include "storage/lwlock.h"
#include "storage/pg_shmem.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/freepage.h"
#include "utils/memutils.h"
#include "utils/resowner.h"
@@ -109,6 +110,15 @@ static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
static void *dsm_main_space_begin = NULL;
+static size_t dsm_main_space_size;
+
+static void dsm_main_space_request(void *arg);
+static void dsm_main_space_init(void *arg);
+
+const ShmemCallbacks dsm_shmem_callbacks = {
+ .request_fn = dsm_main_space_request,
+ .init_fn = dsm_main_space_init,
+};
/*
* List of dynamic shared memory segments used by this backend.
@@ -464,42 +474,40 @@ dsm_set_control_handle(dsm_handle h)
#endif
/*
- * Reserve some space in the main shared memory segment for DSM segments.
+ * Reserve space in the main shared memory segment for DSM segments.
*/
-size_t
-dsm_estimate_size(void)
+static void
+dsm_main_space_request(void *arg)
{
- return 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+ dsm_main_space_size = 1024 * 1024 * (size_t) min_dynamic_shared_memory;
+
+ if (dsm_main_space_size == 0)
+ return;
+
+ ShmemRequestStruct(.name = "Preallocated DSM",
+ .size = dsm_main_space_size,
+ .ptr = &dsm_main_space_begin,
+ );
}
-/*
- * Initialize space in the main shared memory segment for DSM segments.
- */
-void
-dsm_shmem_init(void)
+static void
+dsm_main_space_init(void *arg)
{
- size_t size = dsm_estimate_size();
- bool found;
+ FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
+ size_t first_page = 0;
+ size_t pages;
- if (size == 0)
+ if (dsm_main_space_size == 0)
return;
- dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, &found);
- if (!found)
- {
- FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
- size_t first_page = 0;
- size_t pages;
-
- /* Reserve space for the FreePageManager. */
- while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
- ++first_page;
-
- /* Initialize it and give it all the rest of the space. */
- FreePageManagerInitialize(fpm, dsm_main_space_begin);
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ /* Reserve space for the FreePageManager. */
+ while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
+ ++first_page;
+
+ /* Initialize it and give it all the rest of the space. */
+ FreePageManagerInitialize(fpm, dsm_main_space_begin);
+ pages = (dsm_main_space_size / FPM_PAGE_SIZE) - first_page;
+ FreePageManagerPut(fpm, first_page, pages);
}
/*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 9bfcd616827..2b56977659b 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -45,6 +45,7 @@
#include "storage/dsm_registry.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/tuplestore.h"
@@ -57,6 +58,14 @@ typedef struct DSMRegistryCtxStruct
static DSMRegistryCtxStruct *DSMRegistryCtx;
+static void DSMRegistryShmemRequest(void *arg);
+static void DSMRegistryShmemInit(void *arg);
+
+const ShmemCallbacks DSMRegistryShmemCallbacks = {
+ .request_fn = DSMRegistryShmemRequest,
+ .init_fn = DSMRegistryShmemInit,
+};
+
typedef struct NamedDSMState
{
dsm_handle handle;
@@ -114,27 +123,20 @@ static const dshash_parameters dsh_params = {
static dsa_area *dsm_registry_dsa;
static dshash_table *dsm_registry_table;
-Size
-DSMRegistryShmemSize(void)
+static void
+DSMRegistryShmemRequest(void *arg)
{
- return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+ ShmemRequestStruct(.name = "DSM Registry Data",
+ .size = sizeof(DSMRegistryCtxStruct),
+ .ptr = (void **) &DSMRegistryCtx,
+ );
}
-void
-DSMRegistryShmemInit(void)
+static void
+DSMRegistryShmemInit(void *arg)
{
- bool found;
-
- DSMRegistryCtx = (DSMRegistryCtxStruct *)
- ShmemInitStruct("DSM Registry Data",
- DSMRegistryShmemSize(),
- &found);
-
- if (!found)
- {
- DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
- DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
- }
+ DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+ DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index de65a9ef33c..4f707158303 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -20,7 +20,6 @@
#include "access/nbtree.h"
#include "access/subtrans.h"
#include "access/syncscan.h"
-#include "access/transam.h"
#include "access/twophase.h"
#include "access/xlogprefetcher.h"
#include "access/xlogrecovery.h"
@@ -42,16 +41,11 @@
#include "storage/aio_subsys.h"
#include "storage/bufmgr.h"
#include "storage/dsm.h"
-#include "storage/dsm_registry.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
-#include "storage/pmsignal.h"
#include "storage/predicate.h"
#include "storage/proc.h"
-#include "storage/procarray.h"
-#include "storage/procsignal.h"
#include "storage/shmem_internal.h"
-#include "storage/sinvaladt.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/injection_point.h"
@@ -105,14 +99,10 @@ CalculateShmemSize(void)
size = add_size(size, ShmemGetRequestedSize());
/* legacy subsystems */
- size = add_size(size, dsm_estimate_size());
- size = add_size(size, DSMRegistryShmemSize());
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
- size = add_size(size, ProcGlobalShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, VarsupShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
size = add_size(size, CLOGShmemSize());
@@ -121,11 +111,7 @@ CalculateShmemSize(void)
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
- size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, SharedInvalShmemSize());
- size = add_size(size, PMSignalShmemSize());
- size = add_size(size, ProcSignalShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
size = add_size(size, ReplicationSlotsShmemSize());
@@ -278,13 +264,9 @@ RegisterBuiltinShmemCallbacks(void)
static void
CreateOrAttachShmemStructs(void)
{
- dsm_shmem_init();
- DSMRegistryShmemInit();
-
/*
* Set up xlog, clog, and buffers
*/
- VarsupShmemInit();
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
@@ -307,23 +289,13 @@ CreateOrAttachShmemStructs(void)
/*
* Set up process table
*/
- if (!IsUnderPostmaster)
- InitProcGlobal();
- ProcArrayShmemInit();
BackendStatusShmemInit();
TwoPhaseShmemInit();
BackgroundWorkerShmemInit();
- /*
- * Set up shared-inval messaging
- */
- SharedInvalShmemInit();
-
/*
* Set up interprocess signaling mechanisms
*/
- PMSignalShmemInit();
- ProcSignalShmemInit();
CheckpointerShmemInit();
AutoVacuumShmemInit();
ReplicationSlotsShmemInit();
diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 8537e9fef2d..7d4f4cf32bb 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -80,10 +80,10 @@ InitLatch(Latch *latch)
* current process.
*
* InitSharedLatch needs to be called in postmaster before forking child
- * processes, usually right after allocating the shared memory block
- * containing the latch with ShmemInitStruct. (The Unix implementation
- * doesn't actually require that, but the Windows one does.) Because of
- * this restriction, we have no concurrency issues to worry about here.
+ * processes, usually right after initializing the shared memory block
+ * containing the latch. (The Unix implementation doesn't actually require
+ * that, but the Windows one does.) Because of this restriction, we have no
+ * concurrency issues to worry about here.
*
* Note that other handles created in this module are never marked as
* inheritable. Thus we do not need to worry about cleaning up child
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 4618820b337..bdad5fdd043 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -27,6 +27,7 @@
#include "storage/ipc.h"
#include "storage/pmsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
@@ -83,6 +84,14 @@ struct PMSignalData
/* PMSignalState pointer is valid in both postmaster and child processes */
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+static void PMSignalShmemRequest(void *);
+static void PMSignalShmemInit(void *);
+
+const ShmemCallbacks PMSignalShmemCallbacks = {
+ .request_fn = PMSignalShmemRequest,
+ .init_fn = PMSignalShmemInit,
+};
+
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
* postmaster. Postmaster keeps a local copy so that it doesn't need to
@@ -123,39 +132,29 @@ postmaster_death_handler(SIGNAL_ARGS)
static void MarkPostmasterChildInactive(int code, Datum arg);
/*
- * PMSignalShmemSize
- * Compute space needed for pmsignal.c's shared memory
+ * PMSignalShmemRequest - Register pmsignal.c's shared memory needs
*/
-Size
-PMSignalShmemSize(void)
+static void
+PMSignalShmemRequest(void *arg)
{
- Size size;
+ size_t size;
- size = offsetof(PMSignalData, PMChildFlags);
- size = add_size(size, mul_size(MaxLivePostmasterChildren(),
- sizeof(sig_atomic_t)));
+ num_child_flags = MaxLivePostmasterChildren();
- return size;
+ size = add_size(offsetof(PMSignalData, PMChildFlags),
+ mul_size(num_child_flags, sizeof(sig_atomic_t)));
+ ShmemRequestStruct(.name = "PMSignalState",
+ .size = size,
+ .ptr = (void **) &PMSignalState,
+ );
}
-/*
- * PMSignalShmemInit - initialize during shared-memory creation
- */
-void
-PMSignalShmemInit(void)
+static void
+PMSignalShmemInit(void *arg)
{
- bool found;
-
- PMSignalState = (PMSignalData *)
- ShmemInitStruct("PMSignalState", PMSignalShmemSize(), &found);
-
- if (!found)
- {
- /* initialize all flags to zeroes */
- MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemSize());
- num_child_flags = MaxLivePostmasterChildren();
- PMSignalState->num_child_flags = num_child_flags;
- }
+ Assert(PMSignalState);
+ Assert(num_child_flags > 0);
+ PMSignalState->num_child_flags = num_child_flags;
}
/*
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index cc207cb56e3..f540bb6b23f 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -61,6 +61,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
@@ -103,6 +104,18 @@ typedef struct ProcArrayStruct
int pgprocnos[FLEXIBLE_ARRAY_MEMBER];
} ProcArrayStruct;
+static void ProcArrayShmemRequest(void *arg);
+static void ProcArrayShmemInit(void *arg);
+static void ProcArrayShmemAttach(void *arg);
+
+static ProcArrayStruct *procArray;
+
+const struct ShmemCallbacks ProcArrayShmemCallbacks = {
+ .request_fn = ProcArrayShmemRequest,
+ .init_fn = ProcArrayShmemInit,
+ .attach_fn = ProcArrayShmemAttach,
+};
+
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -269,9 +282,6 @@ typedef enum KAXCompressReason
KAX_STARTUP_PROCESS_IDLE, /* startup process is about to sleep */
} KAXCompressReason;
-
-static ProcArrayStruct *procArray;
-
static PGPROC *allProcs;
/*
@@ -282,8 +292,11 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
+
static TransactionId *KnownAssignedXids;
+
static bool *KnownAssignedXidsValid;
+
static TransactionId latestObservedXid = InvalidTransactionId;
/*
@@ -374,19 +387,13 @@ static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
static void GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons);
/*
- * Report shared-memory space needed by ProcArrayShmemInit
+ * Register the shared PGPROC array during postmaster startup.
*/
-Size
-ProcArrayShmemSize(void)
+static void
+ProcArrayShmemRequest(void *arg)
{
- Size size;
-
- /* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, pgprocnos);
- size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
-
/*
* During Hot Standby processing we have a data structure called
* KnownAssignedXids, created in shared memory. Local data structures are
@@ -405,64 +412,49 @@ ProcArrayShmemSize(void)
if (EnableHotStandby)
{
- size = add_size(size,
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS));
- size = add_size(size,
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS));
+ ShmemRequestStruct(.name = "KnownAssignedXids",
+ .size = mul_size(sizeof(TransactionId), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXids,
+ );
+
+ ShmemRequestStruct(.name = "KnownAssignedXidsValid",
+ .size = mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
+ .ptr = (void **) &KnownAssignedXidsValid,
+ );
}
- return size;
+ /* Register the ProcArray shared structure */
+ ShmemRequestStruct(.name = "Proc Array",
+ .size = add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int), PROCARRAY_MAXPROCS)),
+ .ptr = (void **) &procArray,
+ );
}
/*
* Initialize the shared PGPROC array during postmaster startup.
*/
-void
-ProcArrayShmemInit(void)
+static void
+ProcArrayShmemInit(void *arg)
{
- bool found;
-
- /* Create or attach to the ProcArray shared structure */
- procArray = (ProcArrayStruct *)
- ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, pgprocnos),
- mul_size(sizeof(int),
- PROCARRAY_MAXPROCS)),
- &found);
-
- if (!found)
- {
- /*
- * We're the first - initialize.
- */
- procArray->numProcs = 0;
- procArray->maxProcs = PROCARRAY_MAXPROCS;
- procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
- procArray->numKnownAssignedXids = 0;
- procArray->tailKnownAssignedXids = 0;
- procArray->headKnownAssignedXids = 0;
- procArray->lastOverflowedXid = InvalidTransactionId;
- procArray->replication_slot_xmin = InvalidTransactionId;
- procArray->replication_slot_catalog_xmin = InvalidTransactionId;
- TransamVariables->xactCompletionCount = 1;
- }
+ procArray->numProcs = 0;
+ procArray->maxProcs = PROCARRAY_MAXPROCS;
+ procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
+ procArray->numKnownAssignedXids = 0;
+ procArray->tailKnownAssignedXids = 0;
+ procArray->headKnownAssignedXids = 0;
+ procArray->lastOverflowedXid = InvalidTransactionId;
+ procArray->replication_slot_xmin = InvalidTransactionId;
+ procArray->replication_slot_catalog_xmin = InvalidTransactionId;
+ TransamVariables->xactCompletionCount = 1;
allProcs = ProcGlobal->allProcs;
+}
- /* Create or attach to the KnownAssignedXids arrays too, if needed */
- if (EnableHotStandby)
- {
- KnownAssignedXids = (TransactionId *)
- ShmemInitStruct("KnownAssignedXids",
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- KnownAssignedXidsValid = (bool *)
- ShmemInitStruct("KnownAssignedXidsValid",
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- }
+static void
+ProcArrayShmemAttach(void *arg)
+{
+ allProcs = ProcGlobal->allProcs;
}
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index f1ab3aa3fe0..adebf0e7898 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -33,6 +33,7 @@
#include "storage/shmem.h"
#include "storage/sinval.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -106,7 +107,16 @@ struct ProcSignalHeader
#define BARRIER_CLEAR_BIT(flags, type) \
((flags) &= ~(((uint32) 1) << (uint32) (type)))
+static void ProcSignalShmemRequest(void *arg);
+static void ProcSignalShmemInit(void *arg);
+
+const ShmemCallbacks ProcSignalShmemCallbacks = {
+ .request_fn = ProcSignalShmemRequest,
+ .init_fn = ProcSignalShmemInit,
+};
+
NON_EXEC_STATIC ProcSignalHeader *ProcSignal = NULL;
+
static ProcSignalSlot *MyProcSignalSlot = NULL;
static bool CheckProcSignal(ProcSignalReason reason);
@@ -114,51 +124,39 @@ static void CleanupProcSignalState(int status, Datum arg);
static void ResetProcSignalBarrierBits(uint32 flags);
/*
- * ProcSignalShmemSize
- * Compute space needed for ProcSignal's shared memory
+ * ProcSignalShmemRequest
+ * Register ProcSignal's shared memory needs at postmaster startup
*/
-Size
-ProcSignalShmemSize(void)
+static void
+ProcSignalShmemRequest(void *arg)
{
Size size;
size = mul_size(NumProcSignalSlots, sizeof(ProcSignalSlot));
size = add_size(size, offsetof(ProcSignalHeader, psh_slot));
- return size;
+
+ ShmemRequestStruct(.name = "ProcSignal",
+ .size = size,
+ .ptr = (void **) &ProcSignal,
+ );
}
-/*
- * ProcSignalShmemInit
- * Allocate and initialize ProcSignal's shared memory
- */
-void
-ProcSignalShmemInit(void)
+static void
+ProcSignalShmemInit(void *arg)
{
- Size size = ProcSignalShmemSize();
- bool found;
+ pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
- ProcSignal = (ProcSignalHeader *)
- ShmemInitStruct("ProcSignal", size, &found);
-
- /* If we're first, initialize. */
- if (!found)
+ for (int i = 0; i < NumProcSignalSlots; ++i)
{
- int i;
-
- pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
+ ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
- for (i = 0; i < NumProcSignalSlots; ++i)
- {
- ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
-
- SpinLockInit(&slot->pss_mutex);
- pg_atomic_init_u32(&slot->pss_pid, 0);
- slot->pss_cancel_key_len = 0;
- MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
- pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
- pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
- ConditionVariableInit(&slot->pss_barrierCV);
- }
+ SpinLockInit(&slot->pss_mutex);
+ pg_atomic_init_u32(&slot->pss_pid, 0);
+ slot->pss_cancel_key_len = 0;
+ MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
+ pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
+ pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
+ ConditionVariableInit(&slot->pss_barrierCV);
}
}
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index a7a7cc4f0a9..37a21ffaf1a 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -25,6 +25,7 @@
#include "storage/shmem.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
/*
* Conceptually, the shared cache invalidation messages are stored in an
@@ -205,6 +206,14 @@ typedef struct SISeg
static SISeg *shmInvalBuffer; /* pointer to the shared inval buffer */
+static void SharedInvalShmemRequest(void *arg);
+static void SharedInvalShmemInit(void *arg);
+
+const ShmemCallbacks SharedInvalShmemCallbacks = {
+ .request_fn = SharedInvalShmemRequest,
+ .init_fn = SharedInvalShmemInit,
+};
+
static LocalTransactionId nextLocalTransactionId;
@@ -212,10 +221,11 @@ static void CleanupInvalidationState(int status, Datum arg);
/*
- * SharedInvalShmemSize --- return shared-memory space needed
+ * SharedInvalShmemRequest
+ * Register shared memory needs for the SI message buffer
*/
-Size
-SharedInvalShmemSize(void)
+static void
+SharedInvalShmemRequest(void *arg)
{
Size size;
@@ -223,26 +233,18 @@ SharedInvalShmemSize(void)
size = add_size(size, mul_size(sizeof(ProcState), NumProcStateSlots)); /* procState */
size = add_size(size, mul_size(sizeof(int), NumProcStateSlots)); /* pgprocnos */
- return size;
+ ShmemRequestStruct(.name = "shmInvalBuffer",
+ .size = size,
+ .ptr = (void **) &shmInvalBuffer,
+ );
}
-/*
- * SharedInvalShmemInit
- * Create and initialize the SI message buffer
- */
-void
-SharedInvalShmemInit(void)
+static void
+SharedInvalShmemInit(void *arg)
{
int i;
- bool found;
-
- /* Allocate space in shared memory */
- shmInvalBuffer = (SISeg *)
- ShmemInitStruct("shmInvalBuffer", SharedInvalShmemSize(), &found);
- if (found)
- return;
- /* Clear message counters, save size of procState array, init spinlock */
+ /* Clear message counters, init spinlock */
shmInvalBuffer->minMsgNum = 0;
shmInvalBuffer->maxMsgNum = 0;
shmInvalBuffer->nextThreshold = CLEANUP_MIN;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 9b880a6af65..a05c55b534e 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -52,6 +52,7 @@
#include "storage/procsignal.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
@@ -70,9 +71,23 @@ PGPROC *MyProc = NULL;
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
+static void *AllProcsShmemPtr;
+static void *FastPathLockArrayShmemPtr;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
PGPROC *PreparedXactProcs = NULL;
+static void ProcGlobalShmemRequest(void *arg);
+static void ProcGlobalShmemInit(void *arg);
+
+const ShmemCallbacks ProcGlobalShmemCallbacks = {
+ .request_fn = ProcGlobalShmemRequest,
+ .init_fn = ProcGlobalShmemInit,
+};
+
+static uint32 TotalProcs;
+static size_t ProcGlobalAllProcsShmemSize;
+static size_t FastPathLockArrayShmemSize;
+
/* Is a deadlock check pending? */
static volatile sig_atomic_t got_deadlock_timeout;
@@ -83,32 +98,12 @@ static DeadLockState CheckDeadLock(void);
/*
- * Report shared-memory space needed by PGPROC.
+ * Calculate shared-memory space needed by Fast-Path locks.
*/
static Size
-PGProcShmemSize(void)
+CalculateFastPathLockShmemSize(void)
{
Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
-
- size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
-
- return size;
-}
-
-/*
- * Report shared-memory space needed by Fast-Path locks.
- */
-static Size
-FastPathLockShmemSize(void)
-{
- Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
Size fpLockBitsSize,
fpRelIdSize;
@@ -128,26 +123,7 @@ FastPathLockShmemSize(void)
}
/*
- * Report shared-memory space needed by InitProcGlobal.
- */
-Size
-ProcGlobalShmemSize(void)
-{
- Size size = 0;
-
- /* ProcGlobal */
- size = add_size(size, sizeof(PROC_HDR));
- size = add_size(size, sizeof(slock_t));
-
- size = add_size(size, PGSemaphoreShmemSize(ProcGlobalSemas()));
- size = add_size(size, PGProcShmemSize());
- size = add_size(size, FastPathLockShmemSize());
-
- return size;
-}
-
-/*
- * Report number of semaphores needed by InitProcGlobal.
+ * Report number of semaphores needed by ProcGlobalShmemInit.
*/
int
ProcGlobalSemas(void)
@@ -160,7 +136,67 @@ ProcGlobalSemas(void)
}
/*
- * InitProcGlobal -
+ * ProcGlobalShmemRequest
+ * Register shared memory needs.
+ *
+ * This is called during postmaster or standalone backend startup, and also
+ * during backend startup in EXEC_BACKEND mode.
+ */
+static void
+ProcGlobalShmemRequest(void *arg)
+{
+ Size size;
+
+ /*
+ * Reserve all the PGPROC structures we'll need. There are six separate
+ * consumers: (1) normal backends, (2) autovacuum workers and special
+ * workers, (3) background workers, (4) walsenders, (5) auxiliary
+ * processes, and (6) prepared transactions. (For largely-historical
+ * reasons, we combine autovacuum and special workers into one category
+ * with a single freelist.) Each PGPROC structure is dedicated to exactly
+ * one of these purposes, and they do not move between groups.
+ */
+ TotalProcs =
+ add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
+
+ size = 0;
+ size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
+ ProcGlobalAllProcsShmemSize = size;
+ ShmemRequestStruct(.name = "PGPROC structures",
+ .size = ProcGlobalAllProcsShmemSize,
+ .ptr = &AllProcsShmemPtr,
+ );
+
+ if (!IsUnderPostmaster)
+ size = FastPathLockArrayShmemSize = CalculateFastPathLockShmemSize();
+ else
+ size = SHMEM_ATTACH_UNKNOWN_SIZE;
+ ShmemRequestStruct(.name = "Fast-Path Lock Array",
+ .size = size,
+ .ptr = &FastPathLockArrayShmemPtr,
+ );
+
+ /*
+ * ProcGlobal is registered here in .ptr as usual, but it needs to be
+ * propagated specially in EXEC_BACKEND mode, because ProcGlobal needs to
+ * be accessed early at backend startup, before ShmemAttachRequested() has
+ * been called.
+ */
+ ShmemRequestStruct(.name = "Proc Header",
+ .size = sizeof(PROC_HDR),
+ .ptr = (void **) &ProcGlobal,
+ );
+
+ /* Let the semaphore implementation register its shared memory needs */
+ PGSemaphoreShmemRequest(ProcGlobalSemas());
+}
+
+
+/*
+ * ProcGlobalShmemInit -
* Initialize the global process table during postmaster or standalone
* backend startup.
*
@@ -179,36 +215,23 @@ ProcGlobalSemas(void)
* Another reason for creating semaphores here is that the semaphore
* implementation typically requires us to create semaphores in the
* postmaster, not in backends.
- *
- * Note: this is NOT called by individual backends under a postmaster,
- * not even in the EXEC_BACKEND case. The ProcGlobal and AuxiliaryProcs
- * pointers must be propagated specially for EXEC_BACKEND operation.
*/
-void
-InitProcGlobal(void)
+static void
+ProcGlobalShmemInit(void *arg)
{
+ char *ptr;
+ size_t requestSize;
PGPROC *procs;
int i,
j;
- bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
/* Used for setup of per-backend fast-path slots. */
char *fpPtr,
*fpEndPtr PG_USED_FOR_ASSERTS_ONLY;
Size fpLockBitsSize,
fpRelIdSize;
- Size requestSize;
- char *ptr;
- /* Create the ProcGlobal shared structure */
- ProcGlobal = (PROC_HDR *)
- ShmemInitStruct("Proc Header", sizeof(PROC_HDR), &found);
- Assert(!found);
-
- /*
- * Initialize the data structures.
- */
+ Assert(ProcGlobal);
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
SpinLockInit(&ProcGlobal->freeProcsLock);
dlist_init(&ProcGlobal->freeProcs);
@@ -221,23 +244,11 @@ InitProcGlobal(void)
pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PROC_NUMBER);
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
- /*
- * Create and initialize all the PGPROC structures we'll need. There are
- * six separate consumers: (1) normal backends, (2) autovacuum workers and
- * special workers, (3) background workers, (4) walsenders, (5) auxiliary
- * processes, and (6) prepared transactions. (For largely-historical
- * reasons, we combine autovacuum and special workers into one category
- * with a single freelist.) Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
- */
- requestSize = PGProcShmemSize();
-
- ptr = ShmemInitStruct("PGPROC structures",
- requestSize,
- &found);
-
+ ptr = AllProcsShmemPtr;
+ requestSize = ProcGlobalAllProcsShmemSize;
MemSet(ptr, 0, requestSize);
+ /* Carve out the allProcs array from the shared memory area */
procs = (PGPROC *) ptr;
ptr = ptr + TotalProcs * sizeof(PGPROC);
@@ -246,7 +257,7 @@ InitProcGlobal(void)
ProcGlobal->allProcCount = MaxBackends + NUM_AUXILIARY_PROCS;
/*
- * Allocate arrays mirroring PGPROC fields in a dense manner. See
+ * Carve out arrays mirroring PGPROC fields in a dense manner. See
* PROC_HDR.
*
* XXX: It might make sense to increase padding for these arrays, given
@@ -261,30 +272,26 @@ InitProcGlobal(void)
ProcGlobal->statusFlags = (uint8 *) ptr;
ptr = ptr + (TotalProcs * sizeof(*ProcGlobal->statusFlags));
- /* make sure wer didn't overflow */
+ /* make sure we didn't overflow */
Assert((ptr > (char *) procs) && (ptr <= (char *) procs + requestSize));
/*
- * Allocate arrays for fast-path locks. Those are variable-length, so
+ * Initialize arrays for fast-path locks. Those are variable-length, so
* can't be included in PGPROC directly. We allocate a separate piece of
* shared memory and then divide that between backends.
*/
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- requestSize = FastPathLockShmemSize();
-
- fpPtr = ShmemInitStruct("Fast-Path Lock Array",
- requestSize,
- &found);
-
- MemSet(fpPtr, 0, requestSize);
+ fpPtr = FastPathLockArrayShmemPtr;
+ requestSize = FastPathLockArrayShmemSize;
+ memset(fpPtr, 0, requestSize);
/* For asserts checking we did not overflow. */
fpEndPtr = fpPtr + requestSize;
- /* Reserve space for semaphores. */
- PGReserveSemaphores(ProcGlobalSemas());
+ /* Initialize semaphores */
+ PGSemaphoreInit(ProcGlobalSemas());
for (i = 0; i < TotalProcs; i++)
{
@@ -405,7 +412,7 @@ InitProcess(void)
/*
* Decide which list should supply our PGPROC. This logic must match the
- * way the freelists were constructed in InitProcGlobal().
+ * way the freelists were constructed in ProcGlobalShmemInit().
*/
if (AmAutoVacuumWorkerProcess() || AmSpecialWorkerProcess())
procgloballist = &ProcGlobal->autovacFreeProcs;
@@ -460,7 +467,7 @@ InitProcess(void)
/*
* Initialize all fields of MyProc, except for those previously
- * initialized by InitProcGlobal.
+ * initialized by ProcGlobalShmemInit.
*/
dlist_node_init(&MyProc->freeProcsLink);
MyProc->waitStatus = PROC_WAIT_STATUS_OK;
@@ -593,7 +600,7 @@ InitProcessPhase2(void)
* This is called by bgwriter and similar processes so that they will have a
* MyProc value that's real enough to let them wait for LWLocks. The PGPROC
* and sema that are assigned are one of the extra ones created during
- * InitProcGlobal.
+ * ProcGlobalShmemInit.
*
* Auxiliary processes are presently not expected to wait for real (lockmgr)
* locks, so we need not set up the deadlock checker. They are never added
@@ -662,7 +669,7 @@ InitAuxiliaryProcess(void)
/*
* Initialize all fields of MyProc, except for those previously
- * initialized by InitProcGlobal.
+ * initialized by ProcGlobalShmemInit.
*/
dlist_node_init(&MyProc->freeProcsLink);
MyProc->waitStatus = PROC_WAIT_STATUS_OK;
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index cb95ad4ef2a..20610f96e7b 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -338,7 +338,8 @@ string_compare(const char *key1, const char *key2, Size keysize)
* under info->hcxt rather than under TopMemoryContext; the default
* behavior is only suitable for session-lifespan hash tables.
* Other flags bits are special-purpose and seldom used, except for those
- * associated with shared-memory hash tables, for which see ShmemInitHash().
+ * associated with shared-memory hash tables, for which see
+ * ShmemRequestHash().
*
* Fields in *info are read only when the associated flags bit is set.
* It is not necessary to initialize other fields of *info.
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6fa91bfcdc0..55a4ab26b34 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -345,8 +345,6 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
-extern Size VarsupShmemSize(void);
-extern void VarsupShmemInit(void);
extern FullTransactionId GetNewTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
diff --git a/src/include/storage/dsm.h b/src/include/storage/dsm.h
index 407657df3ff..1bde71b4406 100644
--- a/src/include/storage/dsm.h
+++ b/src/include/storage/dsm.h
@@ -26,9 +26,6 @@ extern void dsm_postmaster_startup(PGShmemHeader *);
extern void dsm_backend_shutdown(void);
extern void dsm_detach_all(void);
-extern size_t dsm_estimate_size(void);
-extern void dsm_shmem_init(void);
-
#ifdef EXEC_BACKEND
extern void dsm_set_control_handle(dsm_handle h);
#endif
diff --git a/src/include/storage/dsm_registry.h b/src/include/storage/dsm_registry.h
index 506fae2c9ca..a2269c89f01 100644
--- a/src/include/storage/dsm_registry.h
+++ b/src/include/storage/dsm_registry.h
@@ -22,7 +22,5 @@ extern dsa_area *GetNamedDSA(const char *name, bool *found);
extern dshash_table *GetNamedDSHash(const char *name,
const dshash_parameters *params,
bool *found);
-extern Size DSMRegistryShmemSize(void);
-extern void DSMRegistryShmemInit(void);
#endif /* DSM_REGISTRY_H */
diff --git a/src/include/storage/pg_sema.h b/src/include/storage/pg_sema.h
index 66facc6907a..fe50ee505ba 100644
--- a/src/include/storage/pg_sema.h
+++ b/src/include/storage/pg_sema.h
@@ -37,11 +37,11 @@ typedef HANDLE PGSemaphore;
#endif
-/* Report amount of shared memory needed */
-extern Size PGSemaphoreShmemSize(int maxSemas);
+/* Request shared memory needed for semaphores */
+extern void PGSemaphoreShmemRequest(int maxSemas);
/* Module initialization (called during postmaster start or shmem reinit) */
-extern void PGReserveSemaphores(int maxSemas);
+extern void PGSemaphoreInit(int maxSemas);
/* Allocate a PGSemaphore structure with initial count 1 */
extern PGSemaphore PGSemaphoreCreate(void);
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index 206fb78f8a5..001e6eea61c 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -66,8 +66,6 @@ extern PGDLLIMPORT volatile PMSignalData *PMSignalState;
/*
* prototypes for functions in pmsignal.c
*/
-extern Size PMSignalShmemSize(void);
-extern void PMSignalShmemInit(void);
extern void SendPostmasterSignal(PMSignalReason reason);
extern bool CheckPostmasterSignal(PMSignalReason reason);
extern void SetQuitSignalReason(QuitSignalReason reason);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 22822fc68d7..3e1d1fad5f9 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -552,8 +552,6 @@ extern PGDLLIMPORT PGPROC *AuxiliaryProcs;
* Function Prototypes
*/
extern int ProcGlobalSemas(void);
-extern Size ProcGlobalShmemSize(void);
-extern void InitProcGlobal(void);
extern void InitProcess(void);
extern void InitProcessPhase2(void);
extern void InitAuxiliaryProcess(void);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index abdf021e66e..d718a5b542f 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -19,8 +19,6 @@
#include "utils/snapshot.h"
-extern Size ProcArrayShmemSize(void);
-extern void ProcArrayShmemInit(void);
extern void ProcArrayAdd(PGPROC *proc);
extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index cc4f26aa33d..7f855971b5a 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -67,9 +67,6 @@ typedef enum
/*
* prototypes for functions in procsignal.c
*/
-extern Size ProcSignalShmemSize(void);
-extern void ProcSignalShmemInit(void);
-
extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
ProcNumber procNumber);
diff --git a/src/include/storage/sinvaladt.h b/src/include/storage/sinvaladt.h
index 122dbcdf19f..208ea9d051e 100644
--- a/src/include/storage/sinvaladt.h
+++ b/src/include/storage/sinvaladt.h
@@ -27,8 +27,6 @@
/*
* prototypes for functions in sinvaladt.c
*/
-extern Size SharedInvalShmemSize(void);
-extern void SharedInvalShmemInit(void);
extern void SharedInvalBackendInit(bool sendOnly);
extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index f0cf01f5a85..d62c29f1361 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -27,4 +27,19 @@
*/
PG_SHMEM_SUBSYSTEM(LWLockCallbacks)
-/* TODO: nothing else for now */
+PG_SHMEM_SUBSYSTEM(dsm_shmem_callbacks)
+PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
+
+/* xlog, clog, and buffers */
+PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+
+/* process table */
+PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+
+/* shared-inval messaging */
+PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
+
+/* interprocess signaling mechanisms */
+PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
--
2.47.3
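
Reader's note on the patch above: every converted subsystem follows the same two-phase shape. A request callback declares a name, a size and the address of the pointer to fill in, and a separate init callback runs once the memory exists. The sketch below imitates that shape in plain C under loud assumptions: all `Demo*`/`demo_*` names are hypothetical, `calloc()` stands in for the shared memory segment, and the 16-byte alignment is an arbitrary choice. It is not the ShmemRequestStruct() implementation from this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a callbacks struct with request/init hooks. */
typedef struct DemoCallbacks
{
	void		(*request_fn) (void *arg);	/* declare name, size, pointer */
	void		(*init_fn) (void *arg);		/* initialize the allocated area */
} DemoCallbacks;

/* One recorded request: what to allocate and where to store the address. */
typedef struct DemoRequest
{
	const char *name;
	size_t		size;
	void	  **ptr;
} DemoRequest;

#define DEMO_MAX_REQUESTS 8
#define DEMO_ALIGNMENT 16		/* assumed alignment, for illustration only */
#define DEMO_ALIGN(sz) (((sz) + DEMO_ALIGNMENT - 1) & ~((size_t) DEMO_ALIGNMENT - 1))

static DemoRequest demo_requests[DEMO_MAX_REQUESTS];
static int	demo_nrequests = 0;

/* Phase 1 helper: only records the request, allocates nothing yet. */
static void
demo_request_struct(const char *name, size_t size, void **ptr)
{
	assert(demo_nrequests < DEMO_MAX_REQUESTS);
	demo_requests[demo_nrequests++] = (DemoRequest) {name, size, ptr};
}

/* An example subsystem with one structure. */
typedef struct DemoCounter
{
	int			nslots;
	int			used;
} DemoCounter;

static DemoCounter *demo_counter = NULL;

static void
demo_counter_request(void *arg)
{
	(void) arg;
	demo_request_struct("Demo Counter", sizeof(DemoCounter),
						(void **) &demo_counter);
}

static void
demo_counter_init(void *arg)
{
	(void) arg;
	/* The pointer was filled in before this callback runs. */
	demo_counter->nslots = 16;
	demo_counter->used = 0;
}

static const DemoCallbacks demo_counter_callbacks = {
	.request_fn = demo_counter_request,
	.init_fn = demo_counter_init,
};

static const DemoCallbacks *demo_subsystems[] = {&demo_counter_callbacks};

/* Run both phases once; returns the number of bytes reserved. */
static size_t
demo_startup(void)
{
	size_t		nsub = sizeof(demo_subsystems) / sizeof(demo_subsystems[0]);
	size_t		total = 0;
	char	   *next;

	/* Phase 1: collect every subsystem's requests. */
	for (size_t i = 0; i < nsub; i++)
		demo_subsystems[i]->request_fn(NULL);

	/* Allocate one zeroed area big enough for all requests. */
	for (int i = 0; i < demo_nrequests; i++)
		total += DEMO_ALIGN(demo_requests[i].size);
	next = calloc(1, total);
	assert(next != NULL);

	/* Hand out aligned slices and publish them through the pointers. */
	for (int i = 0; i < demo_nrequests; i++)
	{
		*demo_requests[i].ptr = next;
		next += DEMO_ALIGN(demo_requests[i].size);
	}

	/* Phase 2: let each subsystem initialize its memory. */
	for (size_t i = 0; i < nsub; i++)
		if (demo_subsystems[i]->init_fn)
			demo_subsystems[i]->init_fn(NULL);

	return total;
}
```

Calling `demo_startup()` once leaves `demo_counter` pointing at initialized memory. Because sizing and initialization are separate callbacks, the init functions in the patch no longer need the `found` flag.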
[text/x-patch] v12-0008-Refactor-shmem-initialization-code-in-predicate..patch (10.7K, 9-v12-0008-Refactor-shmem-initialization-code-in-predicate..patch)
From e6a8406bad9ea061e69f8642d6c20a1fff85c897 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sun, 5 Apr 2026 20:06:08 +0300
Subject: [PATCH v12 08/13] Refactor shmem initialization code in predicate.c
This prepares predicate.c for conversion to the new shmem allocation
functions and keeps the next commit, which converts
PredicateLockShmemInit(), smaller. It inlines SerialInit() into its
caller and moves all the initialization steps in
PredicateLockShmemInit() after the ShmemInit{Struct|Hash}() calls.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
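
Reader's note, not part of the commit: the control flow this refactoring arrives at can be sketched as "do every allocate-or-attach call first, return early when only attaching, otherwise initialize everything in one place". The following is a minimal illustration of that ordering. Every `demo_*` name is hypothetical, and a static pointer plus `calloc()` stands in for the shared segment, so this is not the predicate.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct DemoPool
{
	int			navailable;
} DemoPool;

typedef struct DemoControl
{
	int			head_page;
} DemoControl;

/* Simulated "shared" storage that outlives a single init call. */
static void *demo_pool_slot = NULL;
static void *demo_control_slot = NULL;

/* Process-local pointers, set on every call (create or attach). */
static DemoPool *demo_pool = NULL;
static DemoControl *demo_control = NULL;

static int	demo_init_runs = 0;

/* Hypothetical get-or-create helper; *found reports a pre-existing area. */
static void *
demo_get_or_create(void **slot, size_t size, bool *found)
{
	*found = (*slot != NULL);
	if (!*found)
	{
		*slot = calloc(1, size);
		assert(*slot != NULL);
	}
	return *slot;
}

static void
demo_shmem_init(bool attaching)
{
	bool		found;

	/* Step 1: all allocate-or-attach calls, with no initialization mixed in. */
	demo_pool = demo_get_or_create(&demo_pool_slot, sizeof(DemoPool), &found);
	assert(found == attaching);
	demo_control = demo_get_or_create(&demo_control_slot,
									  sizeof(DemoControl), &found);
	assert(found == attaching);

	/* Step 2: when attaching to existing memory, there is nothing more to do. */
	if (attaching)
		return;

	/* Step 3: first-time initialization, grouped in one block. */
	demo_pool->navailable = 4;
	demo_control->head_page = -1;
	demo_init_runs++;
}
```

Grouping the initialization at the end is what lets a later commit lift it into a separate init callback with little churn.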
src/backend/storage/lmgr/predicate.c | 217 ++++++++++++---------------
1 file changed, 98 insertions(+), 119 deletions(-)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index e003fa5b107..b509fbb2759 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -444,7 +444,6 @@ static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
static int serial_errdetail_for_io_error(const void *opaque_data);
-static void SerialInit(void);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
@@ -809,48 +808,6 @@ SerialPagePrecedesLogicallyUnitTests(void)
}
#endif
-/*
- * Initialize for the tracking of old serializable committed xids.
- */
-static void
-SerialInit(void)
-{
- bool found;
-
- /*
- * Set up SLRU management of the pg_serial data.
- */
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
-#ifdef USE_ASSERT_CHECKING
- SerialPagePrecedesLogicallyUnitTests();
-#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
-
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
-
- Assert(found == IsUnderPostmaster);
- if (!found)
- {
- /*
- * Set control information to reflect empty SLRU.
- */
- LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
- serialControl->headPage = -1;
- serialControl->headXid = InvalidTransactionId;
- serialControl->tailXid = InvalidTransactionId;
- LWLockRelease(SerialControlLock);
- }
-}
-
/*
* GUC check_hook for serializable_buffers
*/
@@ -1187,19 +1144,6 @@ PredicateLockShmemInit(void)
HASH_ELEM | HASH_BLOBS |
HASH_PARTITION | HASH_FIXED_SIZE);
- /*
- * Reserve a dummy entry in the hash table; we use it to make sure there's
- * always one entry available when we need to split or combine a page,
- * because running out of space there could mean aborting a
- * non-serializable transaction.
- */
- if (!IsUnderPostmaster)
- {
- (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
- HASH_ENTER, &found);
- Assert(!found);
- }
-
/* Pre-calculate the hash and partition lock of the scratch entry */
ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
@@ -1243,49 +1187,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, both the header and the element */
- memset(PredXact, 0, requestSize);
-
- dlist_init(&PredXact->availableList);
- dlist_init(&PredXact->activeList);
- PredXact->SxactGlobalXmin = InvalidTransactionId;
- PredXact->SxactGlobalXminCount = 0;
- PredXact->WritableSxactCount = 0;
- PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
- PredXact->CanPartialClearThrough = 0;
- PredXact->HavePartialClearedThrough = 0;
- PredXact->element
- = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_serializable_xacts; i++)
- {
- LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
- LWTRANCHE_PER_XACT_PREDICATE_LIST);
- dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
- }
- PredXact->OldCommittedSxact = CreatePredXact();
- SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
- PredXact->OldCommittedSxact->prepareSeqNo = 0;
- PredXact->OldCommittedSxact->commitSeqNo = 0;
- PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
- dlist_init(&PredXact->OldCommittedSxact->outConflicts);
- dlist_init(&PredXact->OldCommittedSxact->inConflicts);
- dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
- dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
- dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
- PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
- PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
- PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
- PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
- PredXact->OldCommittedSxact->pid = 0;
- PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
- }
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
/*
* Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
@@ -1321,23 +1222,6 @@ PredicateLockShmemInit(void)
requestSize,
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- {
- int i;
-
- /* clean everything, including the elements */
- memset(RWConflictPool, 0, requestSize);
-
- dlist_init(&RWConflictPool->availableList);
- RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
- RWConflictPoolHeaderDataSize);
- /* Add all elements to available list, clean. */
- for (i = 0; i < max_rw_conflicts; i++)
- {
- dlist_push_tail(&RWConflictPool->availableList,
- &RWConflictPool->element[i].outLink);
- }
- }
/*
* Create or attach to the header for the list of finished serializable
@@ -1348,14 +1232,109 @@ PredicateLockShmemInit(void)
sizeof(dlist_head),
&found);
Assert(found == IsUnderPostmaster);
- if (!found)
- dlist_init(FinishedSerializableTransactions);
/*
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialInit();
+ SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
+ SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
+ SimpleLruInit(SerialSlruCtl, "serializable",
+ serializable_buffers, 0, "pg_serial",
+ LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
+ SYNC_HANDLER_NONE, false);
+#ifdef USE_ASSERT_CHECKING
+ SerialPagePrecedesLogicallyUnitTests();
+#endif
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
+
+ /*
+ * Create or attach to the SerialControl structure.
+ */
+ serialControl = (SerialControl)
+ ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
+ Assert(found == IsUnderPostmaster);
+
+ /*
+ * If we just attached to existing shared memory (EXEC_BACKEND), we're all
+ * done. Otherwise, during postmaster startup, proceed to initialize all
+ * the shared memory areas that we allocated.
+ */
+ if (IsUnderPostmaster)
+ {
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+ return;
+ }
+
+ /*
+ * Reserve a dummy entry in the hash table; we use it to make sure there's
+ * always one entry available when we need to split or combine a page,
+ * because running out of space there could mean aborting a
+ * non-serializable transaction.
+ */
+ (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
+ HASH_ENTER, &found);
+ Assert(!found);
+
+ /* Initialize PredXact list */
+ dlist_init(&PredXact->availableList);
+ dlist_init(&PredXact->activeList);
+ PredXact->SxactGlobalXmin = InvalidTransactionId;
+ PredXact->SxactGlobalXminCount = 0;
+ PredXact->WritableSxactCount = 0;
+ PredXact->LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1;
+ PredXact->CanPartialClearThrough = 0;
+ PredXact->HavePartialClearedThrough = 0;
+ PredXact->element
+ = (SERIALIZABLEXACT *) ((char *) PredXact + PredXactListDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_serializable_xacts; i++)
+ {
+ LWLockInitialize(&PredXact->element[i].perXactPredicateListLock,
+ LWTRANCHE_PER_XACT_PREDICATE_LIST);
+ dlist_push_tail(&PredXact->availableList, &PredXact->element[i].xactLink);
+ }
+ PredXact->OldCommittedSxact = CreatePredXact();
+ SetInvalidVirtualTransactionId(PredXact->OldCommittedSxact->vxid);
+ PredXact->OldCommittedSxact->prepareSeqNo = 0;
+ PredXact->OldCommittedSxact->commitSeqNo = 0;
+ PredXact->OldCommittedSxact->SeqNo.lastCommitBeforeSnapshot = 0;
+ dlist_init(&PredXact->OldCommittedSxact->outConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->inConflicts);
+ dlist_init(&PredXact->OldCommittedSxact->predicateLocks);
+ dlist_node_init(&PredXact->OldCommittedSxact->finishedLink);
+ dlist_init(&PredXact->OldCommittedSxact->possibleUnsafeConflicts);
+ PredXact->OldCommittedSxact->topXid = InvalidTransactionId;
+ PredXact->OldCommittedSxact->finishedBefore = InvalidTransactionId;
+ PredXact->OldCommittedSxact->xmin = InvalidTransactionId;
+ PredXact->OldCommittedSxact->flags = SXACT_FLAG_COMMITTED;
+ PredXact->OldCommittedSxact->pid = 0;
+ PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
+
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
+
+ /* Initialize the rw-conflict pool */
+ dlist_init(&RWConflictPool->availableList);
+ RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
+ RWConflictPoolHeaderDataSize);
+ /* Add all elements to available list, clean. */
+ for (int i = 0; i < max_rw_conflicts; i++)
+ {
+ dlist_push_tail(&RWConflictPool->availableList,
+ &RWConflictPool->element[i].outLink);
+ }
+
+ /* Initialize the list of finished serializable transactions */
+ dlist_init(FinishedSerializableTransactions);
+
+ /* Initialize SerialControl to reflect empty SLRU. */
+ LWLockAcquire(SerialControlLock, LW_EXCLUSIVE);
+ serialControl->headPage = -1;
+ serialControl->headXid = InvalidTransactionId;
+ serialControl->tailXid = InvalidTransactionId;
+ LWLockRelease(SerialControlLock);
}
/*
--
2.47.3
[text/x-patch] v12-0009-Convert-SLRUs-to-use-the-new-shmem-allocation-fu.patch (85.1K, 10-v12-0009-Convert-SLRUs-to-use-the-new-shmem-allocation-fu.patch)
From cf9a7c705e0a7b4dec5eb906ddbfddc64781ccc8 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sun, 5 Apr 2026 19:55:00 +0300
Subject: [PATCH v12 09/13] Convert SLRUs to use the new shmem allocation
functions
I replaced the old SimpleLruInit() function without a backwards
compatibility wrapper, because few extensions define their own SLRUs.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
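A note for readers on the calling convention used throughout this patch: calls such as SimpleLruRequest(.desc = ..., .name = ..., ) look like named arguments, and the patch pairs them with a function form, SimpleLruRequestWithOpts(const SlruOpts *options). The macro definition itself is not quoted in this part of the mail, so the sketch below only assumes that it wraps its arguments in a C99 compound literal. All Demo* names are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct; the fields only loosely mirror SlruOpts. */
typedef struct DemoSlruOpts
{
	const char *name;				/* user-visible name, required */
	const char *dir;				/* directory for segment files */
	int			nslots;				/* number of page slots, required */
	int			nlsns;				/* LSN groups per page, 0 if omitted */
	int			long_segment_names;	/* 0 (false) if omitted */
} DemoSlruOpts;

/* Last request seen, kept only so the sketch has observable state. */
DemoSlruOpts demo_last_request;
int			demo_request_count = 0;

/* Function form: takes a pointer to a fully built options struct. */
void
DemoSlruRequestWithOpts(const DemoSlruOpts *options)
{
	assert(options->name != NULL);
	assert(options->nslots > 0);

	/* Copy the options: the caller's compound literal is short-lived. */
	memcpy(&demo_last_request, options, sizeof(DemoSlruOpts));
	demo_request_count++;
}

/*
 * Macro form (assumed shape): the arguments become designated initializers
 * of a compound literal.  Fields that are not mentioned are zero-initialized,
 * and a trailing comma after the last field is accepted.
 */
#define DemoSlruRequest(...) \
	DemoSlruRequestWithOpts(&(DemoSlruOpts){__VA_ARGS__})

/* Example caller, shaped like the request functions in this patch. */
void
demo_register_clog(void)
{
	DemoSlruRequest(.name = "transaction",
					.dir = "pg_xact",
					.nslots = 128,
		);
}
```

With this shape, a caller that leaves out .nlsns or .long_segment_names gets zero for them, which would explain why the converted callers only spell out the fields they care about.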
src/backend/access/transam/clog.c | 53 ++--
src/backend/access/transam/commit_ts.c | 85 +++---
src/backend/access/transam/multixact.c | 136 +++++----
src/backend/access/transam/slru.c | 366 ++++++++++++-----------
src/backend/access/transam/subtrans.c | 57 ++--
src/backend/commands/async.c | 115 ++++---
src/backend/storage/ipc/ipci.c | 16 -
src/backend/storage/ipc/shmem.c | 7 +
src/backend/storage/lmgr/predicate.c | 269 +++++++----------
src/backend/utils/activity/pgstat_slru.c | 1 +
src/include/access/clog.h | 2 -
src/include/access/commit_ts.h | 2 -
src/include/access/multixact.h | 2 -
src/include/access/slru.h | 112 ++++---
src/include/access/subtrans.h | 2 -
src/include/commands/async.h | 3 -
src/include/storage/predicate.h | 5 -
src/include/storage/shmem_internal.h | 1 +
src/include/storage/subsystemlist.h | 10 +
src/test/modules/test_slru/test_slru.c | 106 +++----
src/tools/pgindent/typedefs.list | 4 +-
21 files changed, 687 insertions(+), 667 deletions(-)
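The diffs below replace each subsystem's Size/Init function pair with a ShmemCallbacks struct holding request_fn, init_fn and, for MultiXact, attach_fn. The framework that drives those callbacks is not part of this mail, so the following is only a single-process sketch of how such a split might be driven, assuming that request runs before allocation, init runs once on fresh memory, and attach only restores process-local pointers. Every Demo* and demo_* name is hypothetical, and calloc() stands in for the shared memory segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct DemoCallbacks
{
	void		(*request_fn) (void *arg);	/* declare sizes and pointers */
	void		(*init_fn) (void *arg);		/* initialize fresh memory once */
	void		(*attach_fn) (void *arg);	/* optional, may be NULL */
} DemoCallbacks;

/* One pending request: how many bytes, and where to store the address. */
typedef struct DemoRequest
{
	size_t		size;
	void	  **ptr;
} DemoRequest;

#define DEMO_MAX_REQUESTS 8
DemoRequest demo_requests[DEMO_MAX_REQUESTS];
int			demo_nrequests = 0;

/* Called from a request_fn; nothing is allocated yet at this point. */
void
demo_request_struct(size_t size, void **ptr)
{
	assert(demo_nrequests < DEMO_MAX_REQUESTS);
	demo_requests[demo_nrequests].size = size;
	demo_requests[demo_nrequests].ptr = ptr;
	demo_nrequests++;
}

/* Startup: collect requests, allocate zeroed memory, then initialize. */
void
demo_startup(const DemoCallbacks *cb)
{
	cb->request_fn(NULL);
	for (int i = 0; i < demo_nrequests; i++)
	{
		*demo_requests[i].ptr = calloc(1, demo_requests[i].size);
		assert(*demo_requests[i].ptr != NULL);
	}
	cb->init_fn(NULL);
}

/* Attach: the memory is already initialized, only local state is rebuilt. */
void
demo_attach(const DemoCallbacks *cb)
{
	if (cb->attach_fn != NULL)
		cb->attach_fn(NULL);
}

/* A toy subsystem with one shared struct and one derived local pointer. */
typedef struct DemoShared
{
	int			nitems;
	int			items[4];
} DemoShared;

DemoShared *demo_shared;		/* filled in by the framework */
int		   *demo_items;			/* process-local pointer into demo_shared */

void
demo_sub_request(void *arg)
{
	(void) arg;
	demo_request_struct(sizeof(DemoShared), (void **) &demo_shared);
}

void
demo_sub_init(void *arg)
{
	(void) arg;
	demo_shared->nitems = 4;
	demo_items = demo_shared->items;
}

void
demo_sub_attach(void *arg)
{
	(void) arg;
	demo_items = demo_shared->items;	/* no re-initialization here */
}

const DemoCallbacks demo_sub_callbacks = {
	.request_fn = demo_sub_request,
	.init_fn = demo_sub_init,
	.attach_fn = demo_sub_attach,
};
```

The attach callback here plays the role that MultiXactShmemAttach() has in the patch: it recomputes pointers derived from the shared struct without touching the shared contents.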
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index c654e0929b3..75012d4b8f0 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -43,6 +43,7 @@
#include "pg_trace.h"
#include "pgstat.h"
#include "storage/proc.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/wait_event.h"
@@ -106,13 +107,21 @@ TransactionIdToPage(TransactionId xid)
/*
* Link to shared-memory data structures for CLOG control
*/
-static SlruCtlData XactCtlData;
+static void CLOGShmemRequest(void *arg);
+static void CLOGShmemInit(void *arg);
+static bool CLOGPagePrecedes(int64 page1, int64 page2);
+static int clog_errdetail_for_io_error(const void *opaque_data);
-#define XactCtl (&XactCtlData)
+const ShmemCallbacks CLOGShmemCallbacks = {
+ .request_fn = CLOGShmemRequest,
+ .init_fn = CLOGShmemInit,
+};
+
+static SlruDesc XactSlruDesc;
+
+#define XactCtl (&XactSlruDesc)
-static bool CLOGPagePrecedes(int64 page1, int64 page2);
-static int clog_errdetail_for_io_error(const void *opaque_data);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXact,
Oid oldestXactDb);
static void TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
@@ -775,16 +784,10 @@ CLOGShmemBuffers(void)
}
/*
- * Initialization of shared memory for CLOG
+ * Register shared memory for CLOG
*/
-Size
-CLOGShmemSize(void)
-{
- return SimpleLruShmemSize(CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE);
-}
-
-void
-CLOGShmemInit(void)
+static void
+CLOGShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (transaction_buffers == 0)
@@ -806,12 +809,26 @@ CLOGShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(transaction_buffers != 0);
+ SimpleLruRequest(.desc = &XactSlruDesc,
+ .name = "transaction",
+ .Dir = "pg_xact",
+ .long_segment_names = false,
+
+ .nslots = CLOGShmemBuffers(),
+ .nlsns = CLOG_LSNS_PER_PAGE,
+
+ .sync_handler = SYNC_HANDLER_CLOG,
+ .PagePrecedes = CLOGPagePrecedes,
+ .errdetail_for_io_error = clog_errdetail_for_io_error,
- XactCtl->PagePrecedes = CLOGPagePrecedes;
- XactCtl->errdetail_for_io_error = clog_errdetail_for_io_error;
- SimpleLruInit(XactCtl, "transaction", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
- "pg_xact", LWTRANCHE_XACT_BUFFER,
- LWTRANCHE_XACT_SLRU, SYNC_HANDLER_CLOG, false);
+ .buffer_tranche_id = LWTRANCHE_XACT_BUFFER,
+ .bank_tranche_id = LWTRANCHE_XACT_SLRU,
+ );
+}
+
+static void
+CLOGShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(XactCtl, CLOG_XACTS_PER_PAGE);
}
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 36219dd13cc..2625cbf93bf 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -30,6 +30,7 @@
#include "funcapi.h"
#include "miscadmin.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/timestamp.h"
@@ -80,9 +81,19 @@ TransactionIdToCTsPage(TransactionId xid)
/*
* Link to shared-memory data structures for CommitTs control
*/
-static SlruCtlData CommitTsCtlData;
+static void CommitTsShmemRequest(void *arg);
+static void CommitTsShmemInit(void *arg);
+static bool CommitTsPagePrecedes(int64 page1, int64 page2);
+static int commit_ts_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks CommitTsShmemCallbacks = {
+ .request_fn = CommitTsShmemRequest,
+ .init_fn = CommitTsShmemInit,
+};
+
+static SlruDesc CommitTsSlruDesc;
-#define CommitTsCtl (&CommitTsCtlData)
+#define CommitTsCtl (&CommitTsSlruDesc)
/*
* We keep a cache of the last value set in shared memory.
@@ -104,6 +115,7 @@ typedef struct CommitTimestampShared
static CommitTimestampShared *commitTsShared;
+static void CommitTsShmemInit(void *arg);
/* GUC variable */
bool track_commit_timestamp;
@@ -114,8 +126,6 @@ static void SetXidCommitTsInPage(TransactionId xid, int nsubxids,
static void TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
ReplOriginId nodeid, int slotno);
static void error_commit_ts_disabled(void);
-static bool CommitTsPagePrecedes(int64 page1, int64 page2);
-static int commit_ts_errdetail_for_io_error(const void *opaque_data);
static void ActivateCommitTs(void);
static void DeactivateCommitTs(void);
static void WriteTruncateXlogRec(int64 pageno, TransactionId oldestXid);
@@ -512,24 +522,12 @@ CommitTsShmemBuffers(void)
}
/*
- * Shared memory sizing for CommitTs
+ * Register CommitTs shared memory needs at system startup (postmaster start
+ * or standalone backend)
*/
-Size
-CommitTsShmemSize(void)
-{
- return SimpleLruShmemSize(CommitTsShmemBuffers(), 0) +
- sizeof(CommitTimestampShared);
-}
-
-/*
- * Initialize CommitTs at system startup (postmaster start or standalone
- * backend)
- */
-void
-CommitTsShmemInit(void)
+static void
+CommitTsShmemRequest(void *arg)
{
- bool found;
-
/* If auto-tuning is requested, now is the time to do it */
if (commit_timestamp_buffers == 0)
{
@@ -550,31 +548,36 @@ CommitTsShmemInit(void)
PGC_S_OVERRIDE);
}
Assert(commit_timestamp_buffers != 0);
+ SimpleLruRequest(.desc = &CommitTsSlruDesc,
+ .name = "commit_timestamp",
+ .Dir = "pg_commit_ts",
+ .long_segment_names = false,
- CommitTsCtl->PagePrecedes = CommitTsPagePrecedes;
- CommitTsCtl->errdetail_for_io_error = commit_ts_errdetail_for_io_error;
- SimpleLruInit(CommitTsCtl, "commit_timestamp", CommitTsShmemBuffers(), 0,
- "pg_commit_ts", LWTRANCHE_COMMITTS_BUFFER,
- LWTRANCHE_COMMITTS_SLRU,
- SYNC_HANDLER_COMMIT_TS,
- false);
- SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
+ .nslots = CommitTsShmemBuffers(),
- commitTsShared = ShmemInitStruct("CommitTs shared",
- sizeof(CommitTimestampShared),
- &found);
+ .PagePrecedes = CommitTsPagePrecedes,
+ .errdetail_for_io_error = commit_ts_errdetail_for_io_error,
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+ .sync_handler = SYNC_HANDLER_COMMIT_TS,
+ .buffer_tranche_id = LWTRANCHE_COMMITTS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_COMMITTS_SLRU,
+ );
- commitTsShared->xidLastCommit = InvalidTransactionId;
- TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
- commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
- commitTsShared->commitTsActive = false;
- }
- else
- Assert(found);
+ ShmemRequestStruct(.name = "CommitTs shared",
+ .size = sizeof(CommitTimestampShared),
+ .ptr = (void **) &commitTsShared,
+ );
+}
+
+static void
+CommitTsShmemInit(void *arg)
+{
+ commitTsShared->xidLastCommit = InvalidTransactionId;
+ TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
+ commitTsShared->dataLastCommit.nodeid = InvalidReplOriginId;
+ commitTsShared->commitTsActive = false;
+
+ SlruPagePrecedesUnitTests(CommitTsCtl, COMMIT_TS_XACTS_PER_PAGE);
}
/*
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 9f8d542c098..cb78ba0842d 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -83,6 +83,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
#include "utils/lsyscache.h"
@@ -113,11 +114,16 @@ PreviousMultiXactId(MultiXactId multi)
/*
* Links to shared-memory data structures for MultiXact control
*/
-static SlruCtlData MultiXactOffsetCtlData;
-static SlruCtlData MultiXactMemberCtlData;
+static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
+static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
+static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
+static int MultiXactMemberIoErrorDetail(const void *opaque_data);
+
+static SlruDesc MultiXactOffsetSlruDesc;
+static SlruDesc MultiXactMemberSlruDesc;
-#define MultiXactOffsetCtl (&MultiXactOffsetCtlData)
-#define MultiXactMemberCtl (&MultiXactMemberCtlData)
+#define MultiXactOffsetCtl (&MultiXactOffsetSlruDesc)
+#define MultiXactMemberCtl (&MultiXactMemberSlruDesc)
/*
* MultiXact state shared across all backends. All this state is protected
@@ -220,6 +226,15 @@ static MultiXactStateData *MultiXactState;
static MultiXactId *OldestMemberMXactId;
static MultiXactId *OldestVisibleMXactId;
+static void MultiXactShmemRequest(void *arg);
+static void MultiXactShmemInit(void *arg);
+static void MultiXactShmemAttach(void *arg);
+
+const ShmemCallbacks MultiXactShmemCallbacks = {
+ .request_fn = MultiXactShmemRequest,
+ .init_fn = MultiXactShmemInit,
+ .attach_fn = MultiXactShmemAttach,
+};
static inline MultiXactId *
MyOldestMemberMXactIdSlot(void)
@@ -321,10 +336,6 @@ typedef struct MultiXactMemberSlruReadContext
MultiXactOffset offset;
} MultiXactMemberSlruReadContext;
-static bool MultiXactOffsetPagePrecedes(int64 page1, int64 page2);
-static bool MultiXactMemberPagePrecedes(int64 page1, int64 page2);
-static int MultiXactOffsetIoErrorDetail(const void *opaque_data);
-static int MultiXactMemberIoErrorDetail(const void *opaque_data);
static void ExtendMultiXactOffset(MultiXactId multi);
static void ExtendMultiXactMember(MultiXactOffset offset, int nmembers);
static void SetOldestOffset(void);
@@ -1747,83 +1758,80 @@ multixact_twophase_postabort(FullTransactionId fxid, uint16 info,
multixact_twophase_postcommit(fxid, info, recdata, len);
}
+
/*
- * Initialization of shared memory for MultiXact.
- *
- * MultiXactSharedStateShmemSize() calculates the size of the MultiXactState
- * struct, and the two per-backend MultiXactId arrays. They are carved out of
- * the same allocation. MultiXactShmemSize() additionally includes the memory
- * needed for the two SLRU areas.
+ * Register shared memory needs for MultiXact.
*/
-static Size
-MultiXactSharedStateShmemSize(void)
+static void
+MultiXactShmemRequest(void *arg)
{
Size size;
+ /*
+ * Calculate the size of the MultiXactState struct, and the two
+ * per-backend MultiXactId arrays. They are carved out of the same
+ * allocation.
+ */
size = offsetof(MultiXactStateData, perBackendXactIds);
size = add_size(size,
mul_size(sizeof(MultiXactId), NumMemberSlots));
size = add_size(size,
mul_size(sizeof(MultiXactId), NumVisibleSlots));
- return size;
-}
+ ShmemRequestStruct(.name = "Shared MultiXact State",
+ .size = size,
+ .ptr = (void **) &MultiXactState,
+ );
-Size
-MultiXactShmemSize(void)
-{
- Size size;
+ SimpleLruRequest(.desc = &MultiXactOffsetSlruDesc,
+ .name = "multixact_offset",
+ .Dir = "pg_multixact/offsets",
+ .long_segment_names = false,
- size = MultiXactSharedStateShmemSize();
- size = add_size(size, SimpleLruShmemSize(multixact_offset_buffers, 0));
- size = add_size(size, SimpleLruShmemSize(multixact_member_buffers, 0));
+ .nslots = multixact_offset_buffers,
- return size;
-}
+ .sync_handler = SYNC_HANDLER_MULTIXACT_OFFSET,
+ .PagePrecedes = MultiXactOffsetPagePrecedes,
+ .errdetail_for_io_error = MultiXactOffsetIoErrorDetail,
-void
-MultiXactShmemInit(void)
-{
- bool found;
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTOFFSET_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTOFFSET_SLRU,
+ );
- debug_elog2(DEBUG2, "Shared Memory Init for MultiXact");
+ SimpleLruRequest(.desc = &MultiXactMemberSlruDesc,
+ .name = "multixact_member",
+ .Dir = "pg_multixact/members",
+ .long_segment_names = true,
- MultiXactOffsetCtl->PagePrecedes = MultiXactOffsetPagePrecedes;
- MultiXactMemberCtl->PagePrecedes = MultiXactMemberPagePrecedes;
- MultiXactOffsetCtl->errdetail_for_io_error = MultiXactOffsetIoErrorDetail;
- MultiXactMemberCtl->errdetail_for_io_error = MultiXactMemberIoErrorDetail;
+ .nslots = multixact_member_buffers,
- SimpleLruInit(MultiXactOffsetCtl,
- "multixact_offset", multixact_offset_buffers, 0,
- "pg_multixact/offsets", LWTRANCHE_MULTIXACTOFFSET_BUFFER,
- LWTRANCHE_MULTIXACTOFFSET_SLRU,
- SYNC_HANDLER_MULTIXACT_OFFSET,
- false);
- SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
- SimpleLruInit(MultiXactMemberCtl,
- "multixact_member", multixact_member_buffers, 0,
- "pg_multixact/members", LWTRANCHE_MULTIXACTMEMBER_BUFFER,
- LWTRANCHE_MULTIXACTMEMBER_SLRU,
- SYNC_HANDLER_MULTIXACT_MEMBER,
- true);
- /* doesn't call SimpleLruTruncate() or meet criteria for unit tests */
-
- /* Initialize our shared state struct */
- MultiXactState = ShmemInitStruct("Shared MultiXact State",
- MultiXactSharedStateShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
+ .sync_handler = SYNC_HANDLER_MULTIXACT_MEMBER,
+ .PagePrecedes = MultiXactMemberPagePrecedes,
+ .errdetail_for_io_error = MultiXactMemberIoErrorDetail,
- /* Make sure we zero out the per-backend state */
- MemSet(MultiXactState, 0, MultiXactSharedStateShmemSize());
- }
- else
- Assert(found);
+ .buffer_tranche_id = LWTRANCHE_MULTIXACTMEMBER_BUFFER,
+ .bank_tranche_id = LWTRANCHE_MULTIXACTMEMBER_SLRU,
+ );
+}
+
+static void
+MultiXactShmemInit(void *arg)
+{
+ SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
/*
- * Set up array pointers.
+ * members SLRU doesn't call SimpleLruTruncate() or meet criteria for unit
+ * tests
*/
+
+ /* Set up array pointers */
+ OldestMemberMXactId = MultiXactState->perBackendXactIds;
+ OldestVisibleMXactId = OldestMemberMXactId + NumMemberSlots;
+}
+
+static void
+MultiXactShmemAttach(void *arg)
+{
+ /* Set up array pointers */
OldestMemberMXactId = MultiXactState->perBackendXactIds;
OldestVisibleMXactId = OldestMemberMXactId + NumMemberSlots;
}
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index a2bb8fa8033..47dd52d6749 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -70,7 +70,9 @@
#include "pgstat.h"
#include "storage/fd.h"
#include "storage/shmem.h"
+#include "storage/shmem_internal.h"
#include "utils/guc.h"
+#include "utils/memutils.h"
#include "utils/wait_event.h"
/*
@@ -89,9 +91,9 @@
* dir/123456 for [2^20, 2^24-1]
*/
static inline int
-SlruFileName(SlruCtl ctl, char *path, int64 segno)
+SlruFileName(SlruDesc *ctl, char *path, int64 segno)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
{
/*
* We could use 16 characters here but the disadvantage would be that
@@ -101,7 +103,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* that in the future we can't decrease SLRU_PAGES_PER_SEGMENT easily.
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFFFFFFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->Dir, segno);
+ return snprintf(path, MAXPGPATH, "%s/%015" PRIX64, ctl->options.Dir, segno);
}
else
{
@@ -110,7 +112,7 @@ SlruFileName(SlruCtl ctl, char *path, int64 segno)
* integers are allowed. See SlruCorrectSegmentFilenameLength()
*/
Assert(segno >= 0 && segno <= INT64CONST(0xFFFFFF));
- return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir,
+ return snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->options.Dir,
(unsigned int) segno);
}
}
@@ -176,19 +178,19 @@ static SlruErrorCause slru_errcause;
static int slru_errno;
-static void SimpleLruZeroLSNs(SlruCtl ctl, int slotno);
-static void SimpleLruWaitIO(SlruCtl ctl, int slotno);
-static void SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata);
-static bool SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno);
-static bool SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno,
+static void SimpleLruZeroLSNs(SlruDesc *ctl, int slotno);
+static void SimpleLruWaitIO(SlruDesc *ctl, int slotno);
+static void SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata);
+static bool SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno);
+static bool SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno,
SlruWriteAll fdata);
-static void SlruReportIOError(SlruCtl ctl, int64 pageno,
+static void SlruReportIOError(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-static int SlruSelectLRUPage(SlruCtl ctl, int64 pageno);
+static int SlruSelectLRUPage(SlruDesc *ctl, int64 pageno);
-static bool SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename,
+static bool SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-static void SlruInternalDeleteSegment(SlruCtl ctl, int64 segno);
+static void SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno);
static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
@@ -196,7 +198,7 @@ static inline void SlruRecentlyUsed(SlruShared shared, int slotno);
* Initialization of shared memory
*/
-Size
+static Size
SimpleLruShmemSize(int nslots, int nlsns)
{
int nbanks = nslots / SLRU_BANK_SIZE;
@@ -238,120 +240,135 @@ SimpleLruAutotuneBuffers(int divisor, int max)
}
/*
- * Initialize, or attach to, a simple LRU cache in shared memory.
- *
- * ctl: address of local (unshared) control structure.
- * name: name of SLRU. (This is user-visible, pick with care!)
- * nslots: number of page slots to use.
- * nlsns: number of LSN groups per page (set to zero if not relevant).
- * subdir: PGDATA-relative subdirectory that will contain the files.
- * buffer_tranche_id: tranche ID to use for the SLRU's per-buffer LWLocks.
- * bank_tranche_id: tranche ID to use for the bank LWLocks.
- * sync_handler: which set of functions to use to handle sync requests
- * long_segment_names: use short or long segment names
+ * Register a simple LRU cache in shared memory.
*/
void
-SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id, int bank_tranche_id,
- SyncRequestHandler sync_handler, bool long_segment_names)
+SimpleLruRequestWithOpts(const SlruOpts *options)
{
+ SlruOpts *options_copy;
+
+ Assert(options->name != NULL);
+ Assert(options->nslots > 0);
+ Assert(options->PagePrecedes != NULL);
+ Assert(options->errdetail_for_io_error != NULL);
+
+ options_copy = MemoryContextAlloc(TopMemoryContext,
+ sizeof(SlruOpts));
+ memcpy(options_copy, options, sizeof(SlruOpts));
+
+ options_copy->base.name = options->name;
+ options_copy->base.size = SimpleLruShmemSize(options_copy->nslots, options_copy->nlsns);
+
+ ShmemRequestInternal(&options_copy->base, SHMEM_KIND_SLRU);
+}
+
+/* Initialize locks and shared memory area */
+void
+shmem_slru_init(void *location, ShmemStructOpts *base_options)
+{
+ SlruOpts *options = (SlruOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) options->desc;
+ char namebuf[NAMEDATALEN];
SlruShared shared;
- bool found;
+ int nslots = options->nslots;
int nbanks = nslots / SLRU_BANK_SIZE;
+ int nlsns = options->nlsns;
+ char *ptr;
+ Size offset;
+
+ shared = (SlruShared) location;
+ desc->shared = shared;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruOpts));
+
+ /* assign new tranche IDs, if not given */
+ if (desc->options.buffer_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s buffer", desc->options.name);
+ desc->options.buffer_tranche_id = LWLockNewTrancheId(namebuf);
+ }
+ if (desc->options.bank_tranche_id == 0)
+ {
+ snprintf(namebuf, sizeof(namebuf), "%s bank", desc->options.name);
+ desc->options.bank_tranche_id = LWLockNewTrancheId(namebuf);
+ }
Assert(nslots <= SLRU_MAX_ALLOWED_BUFFERS);
- Assert(ctl->PagePrecedes != NULL);
- Assert(ctl->errdetail_for_io_error != NULL);
+ memset(shared, 0, sizeof(SlruSharedData));
- shared = (SlruShared) ShmemInitStruct(name,
- SimpleLruShmemSize(nslots, nlsns),
- &found);
+ shared->num_slots = nslots;
+ shared->lsn_groups_per_page = nlsns;
- if (!IsUnderPostmaster)
- {
- /* Initialize locks and shared memory area */
- char *ptr;
- Size offset;
-
- Assert(!found);
-
- memset(shared, 0, sizeof(SlruSharedData));
-
- shared->num_slots = nslots;
- shared->lsn_groups_per_page = nlsns;
-
- pg_atomic_init_u64(&shared->latest_page_number, 0);
-
- shared->slru_stats_idx = pgstat_get_slru_index(name);
-
- ptr = (char *) shared;
- offset = MAXALIGN(sizeof(SlruSharedData));
- shared->page_buffer = (char **) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(char *));
- shared->page_status = (SlruPageStatus *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
- shared->page_dirty = (bool *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(bool));
- shared->page_number = (int64 *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int64));
- shared->page_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(int));
-
- /* Initialize LWLocks */
- shared->buffer_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nslots * sizeof(LWLockPadded));
- shared->bank_locks = (LWLockPadded *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
- shared->bank_cur_lru_count = (int *) (ptr + offset);
- offset += MAXALIGN(nbanks * sizeof(int));
-
- if (nlsns > 0)
- {
- shared->group_lsn = (XLogRecPtr *) (ptr + offset);
- offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
- }
+ pg_atomic_init_u64(&shared->latest_page_number, 0);
- ptr += BUFFERALIGN(offset);
- for (int slotno = 0; slotno < nslots; slotno++)
- {
- LWLockInitialize(&shared->buffer_locks[slotno].lock,
- buffer_tranche_id);
+ shared->slru_stats_idx = pgstat_get_slru_index(desc->options.name);
- shared->page_buffer[slotno] = ptr;
- shared->page_status[slotno] = SLRU_PAGE_EMPTY;
- shared->page_dirty[slotno] = false;
- shared->page_lru_count[slotno] = 0;
- ptr += BLCKSZ;
- }
+ ptr = (char *) shared;
+ offset = MAXALIGN(sizeof(SlruSharedData));
+ shared->page_buffer = (char **) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(char *));
+ shared->page_status = (SlruPageStatus *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(SlruPageStatus));
+ shared->page_dirty = (bool *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(bool));
+ shared->page_number = (int64 *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int64));
+ shared->page_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(int));
- /* Initialize the slot banks. */
- for (int bankno = 0; bankno < nbanks; bankno++)
- {
- LWLockInitialize(&shared->bank_locks[bankno].lock, bank_tranche_id);
- shared->bank_cur_lru_count[bankno] = 0;
- }
+ /* Initialize LWLocks */
+ shared->buffer_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nslots * sizeof(LWLockPadded));
+ shared->bank_locks = (LWLockPadded *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(LWLockPadded));
+ shared->bank_cur_lru_count = (int *) (ptr + offset);
+ offset += MAXALIGN(nbanks * sizeof(int));
- /* Should fit to estimated shmem size */
- Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+ if (nlsns > 0)
+ {
+ shared->group_lsn = (XLogRecPtr *) (ptr + offset);
+ offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
}
- else
+
+ ptr += BUFFERALIGN(offset);
+ for (int slotno = 0; slotno < nslots; slotno++)
{
- Assert(found);
- Assert(shared->num_slots == nslots);
+ LWLockInitialize(&shared->buffer_locks[slotno].lock,
+ desc->options.buffer_tranche_id);
+
+ shared->page_buffer[slotno] = ptr;
+ shared->page_status[slotno] = SLRU_PAGE_EMPTY;
+ shared->page_dirty[slotno] = false;
+ shared->page_lru_count[slotno] = 0;
+ ptr += BLCKSZ;
}
- /*
- * Initialize the unshared control struct, including directory path. We
- * assume caller set PagePrecedes.
- */
- ctl->shared = shared;
- ctl->sync_handler = sync_handler;
- ctl->long_segment_names = long_segment_names;
- ctl->nbanks = nbanks;
- strlcpy(ctl->Dir, subdir, sizeof(ctl->Dir));
+ /* Initialize the slot banks. */
+ for (int bankno = 0; bankno < nbanks; bankno++)
+ {
+ LWLockInitialize(&shared->bank_locks[bankno].lock, desc->options.bank_tranche_id);
+ shared->bank_cur_lru_count[bankno] = 0;
+ }
+
+ /* Should fit to estimated shmem size */
+ Assert(ptr - (char *) shared <= SimpleLruShmemSize(nslots, nlsns));
+}
+
+void
+shmem_slru_attach(void *location, ShmemStructOpts *base_options)
+{
+ SlruOpts *options = (SlruOpts *) base_options;
+ SlruDesc *desc = (SlruDesc *) options->desc;
+ int nslots = options->nslots;
+ int nbanks = nslots / SLRU_BANK_SIZE;
+
+ desc->shared = (SlruShared) location;
+ desc->nbanks = nbanks;
+ memcpy(&desc->options, options, sizeof(SlruOpts));
}
+
/*
* Helper function for GUC check_hook to check whether slru buffers are in
* multiples of SLRU_BANK_SIZE.
@@ -377,7 +394,7 @@ check_slru_buffers(const char *name, int *newval)
* Bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
int slotno;
@@ -430,7 +447,7 @@ SimpleLruZeroPage(SlruCtl ctl, int64 pageno)
* This assumes that InvalidXLogRecPtr is bitwise-all-0.
*/
static void
-SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
+SimpleLruZeroLSNs(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
@@ -446,7 +463,7 @@ SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
* SLRU bank lock is acquired and released here.
*/
void
-SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
+SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno)
{
int slotno;
LWLock *lock;
@@ -472,7 +489,7 @@ SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SimpleLruWaitIO(SlruCtl ctl, int slotno)
+SimpleLruWaitIO(SlruDesc *ctl, int slotno)
{
SlruShared shared = ctl->shared;
int bankno = SlotGetBankNumber(slotno);
@@ -530,7 +547,7 @@ SimpleLruWaitIO(SlruCtl ctl, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
int
-SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data)
{
SlruShared shared = ctl->shared;
@@ -634,7 +651,7 @@ SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
* It is unspecified whether the lock will be shared or exclusive.
*/
int
-SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
SlruShared shared = ctl->shared;
LWLock *banklock = SimpleLruGetBankLock(ctl, pageno);
@@ -681,7 +698,7 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno, const void *opaque_data)
* Bank lock must be held at entry, and will be held at exit.
*/
static void
-SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
+SlruInternalWritePage(SlruDesc *ctl, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 pageno = shared->page_number[slotno];
@@ -761,7 +778,7 @@ SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata)
* fdata is always passed a NULL here.
*/
void
-SimpleLruWritePage(SlruCtl ctl, int slotno)
+SimpleLruWritePage(SlruDesc *ctl, int slotno)
{
Assert(ctl->shared->page_status[slotno] != SLRU_PAGE_EMPTY);
@@ -775,7 +792,7 @@ SimpleLruWritePage(SlruCtl ctl, int slotno)
* large enough to contain the given page.
*/
bool
-SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
+SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -833,7 +850,7 @@ SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno)
* read/write operations. We could cache one virtual file pointer ...
*/
static bool
-SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
+SlruPhysicalReadPage(SlruDesc *ctl, int64 pageno, int slotno)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -905,7 +922,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int64 pageno, int slotno)
* SimpleLruWriteAll.
*/
static bool
-SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
+SlruPhysicalWritePage(SlruDesc *ctl, int64 pageno, int slotno, SlruWriteAll fdata)
{
SlruShared shared = ctl->shared;
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
@@ -1037,11 +1054,11 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
pgstat_report_wait_end();
/* Queue up a sync request for the checkpointer. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
if (!RegisterSyncRequest(&tag, SYNC_REQUEST, false))
{
/* No space to enqueue sync request. Do it synchronously. */
@@ -1077,7 +1094,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int64 pageno, int slotno, SlruWriteAll fdata)
* SlruPhysicalWritePage. Call this after cleaning up shared-memory state.
*/
static void
-SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
+SlruReportIOError(SlruDesc *ctl, int64 pageno, const void *opaque_data)
{
int64 segno = pageno / SLRU_PAGES_PER_SEGMENT;
int rpageno = pageno % SLRU_PAGES_PER_SEGMENT;
@@ -1092,14 +1109,14 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open file \"%s\": %m", path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_SEEK_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not seek in file \"%s\" to offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_READ_FAILED:
if (errno)
@@ -1107,12 +1124,12 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("could not read from file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("could not read from file \"%s\" at offset %d: read too few bytes",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_WRITE_FAILED:
if (errno)
@@ -1120,26 +1137,26 @@ SlruReportIOError(SlruCtl ctl, int64 pageno, const void *opaque_data)
(errcode_for_file_access(),
errmsg("could not write to file \"%s\" at offset %d: %m",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
else
ereport(ERROR,
(errmsg("could not write to file \"%s\" at offset %d: wrote too few bytes",
path, offset),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_FSYNC_FAILED:
ereport(data_sync_elevel(ERROR),
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
case SLRU_CLOSE_FAILED:
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not close file \"%s\": %m",
path),
- opaque_data ? ctl->errdetail_for_io_error(opaque_data) : 0));
+ opaque_data ? ctl->options.errdetail_for_io_error(opaque_data) : 0));
break;
default:
/* can't get here, we trust */
@@ -1199,7 +1216,7 @@ SlruRecentlyUsed(SlruShared shared, int slotno)
* The correct bank lock must be held at entry, and will be held at exit.
*/
static int
-SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
+SlruSelectLRUPage(SlruDesc *ctl, int64 pageno)
{
SlruShared shared = ctl->shared;
@@ -1291,8 +1308,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_valid_delta ||
(this_delta == best_valid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_valid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_valid_page_number)))
{
bestvalidslot = slotno;
best_valid_delta = this_delta;
@@ -1303,8 +1320,8 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
{
if (this_delta > best_invalid_delta ||
(this_delta == best_invalid_delta &&
- ctl->PagePrecedes(this_page_number,
- best_invalid_page_number)))
+ ctl->options.PagePrecedes(this_page_number,
+ best_invalid_page_number)))
{
bestinvalidslot = slotno;
best_invalid_delta = this_delta;
@@ -1352,7 +1369,7 @@ SlruSelectLRUPage(SlruCtl ctl, int64 pageno)
* entries are on disk.
*/
void
-SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
+SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied)
{
SlruShared shared = ctl->shared;
SlruWriteAllData fdata;
@@ -1422,8 +1439,8 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
SlruReportIOError(ctl, pageno, NULL);
/* Ensure that directory entries for new files are on disk. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
- fsync_fname(ctl->Dir, true);
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
+ fsync_fname(ctl->options.Dir, true);
}
/*
@@ -1438,7 +1455,7 @@ SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
* after it has accrued freshly-written data.
*/
void
-SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage)
+SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage)
{
SlruShared shared = ctl->shared;
int prevbank;
@@ -1460,12 +1477,12 @@ restart:
* bugs elsewhere in SLRU handling, so we don't care if we read a slightly
* outdated value; therefore we don't add a memory barrier.
*/
- if (ctl->PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
- cutoffPage))
+ if (ctl->options.PagePrecedes(pg_atomic_read_u64(&shared->latest_page_number),
+ cutoffPage))
{
ereport(LOG,
(errmsg("could not truncate directory \"%s\": apparent wraparound",
- ctl->Dir)));
+ ctl->options.Dir)));
return;
}
@@ -1488,7 +1505,7 @@ restart:
if (shared->page_status[slotno] == SLRU_PAGE_EMPTY)
continue;
- if (!ctl->PagePrecedes(shared->page_number[slotno], cutoffPage))
+ if (!ctl->options.PagePrecedes(shared->page_number[slotno], cutoffPage))
continue;
/*
@@ -1533,16 +1550,16 @@ restart:
* they either can't yet contain anything, or have already been cleaned out.
*/
static void
-SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
+SlruInternalDeleteSegment(SlruDesc *ctl, int64 segno)
{
char path[MAXPGPATH];
/* Forget any fsync requests queued for this segment. */
- if (ctl->sync_handler != SYNC_HANDLER_NONE)
+ if (ctl->options.sync_handler != SYNC_HANDLER_NONE)
{
FileTag tag;
- INIT_SLRUFILETAG(tag, ctl->sync_handler, segno);
+ INIT_SLRUFILETAG(tag, ctl->options.sync_handler, segno);
RegisterSyncRequest(&tag, SYNC_FORGET_REQUEST, true);
}
@@ -1556,7 +1573,7 @@ SlruInternalDeleteSegment(SlruCtl ctl, int64 segno)
* Delete an individual SLRU segment, identified by the segment number.
*/
void
-SlruDeleteSegment(SlruCtl ctl, int64 segno)
+SlruDeleteSegment(SlruDesc *ctl, int64 segno)
{
SlruShared shared = ctl->shared;
int prevbank = SlotGetBankNumber(0);
@@ -1633,19 +1650,19 @@ restart:
* first>=cutoff && last>=cutoff: no; every page of this segment is too young
*/
static bool
-SlruMayDeleteSegment(SlruCtl ctl, int64 segpage, int64 cutoffPage)
+SlruMayDeleteSegment(SlruDesc *ctl, int64 segpage, int64 cutoffPage)
{
int64 seg_last_page = segpage + SLRU_PAGES_PER_SEGMENT - 1;
Assert(segpage % SLRU_PAGES_PER_SEGMENT == 0);
- return (ctl->PagePrecedes(segpage, cutoffPage) &&
- ctl->PagePrecedes(seg_last_page, cutoffPage));
+ return (ctl->options.PagePrecedes(segpage, cutoffPage) &&
+ ctl->options.PagePrecedes(seg_last_page, cutoffPage));
}
#ifdef USE_ASSERT_CHECKING
static void
-SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
+SlruPagePrecedesTestOffset(SlruDesc *ctl, int per_page, uint32 offset)
{
TransactionId lhs,
rhs;
@@ -1654,6 +1671,9 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
TransactionId newestXact,
oldestXact;
+ /* This must be called after the Slru has been initialized */
+ Assert(ctl->options.PagePrecedes);
+
/*
* Compare an XID pair having undefined order (see RFC 1982), a pair at
* "opposite ends" of the XID space. TransactionIdPrecedes() treats each
@@ -1670,19 +1690,19 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
Assert(!TransactionIdPrecedes(rhs, lhs + 1));
Assert(!TransactionIdFollowsOrEquals(lhs, rhs));
Assert(!TransactionIdFollowsOrEquals(rhs, lhs));
- Assert(!ctl->PagePrecedes(lhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes(lhs / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, lhs / per_page));
- Assert(!ctl->PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
- Assert(ctl->PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes(lhs / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, lhs / per_page));
+ Assert(!ctl->options.PagePrecedes((lhs - per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 3 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 2 * per_page) / per_page));
+ Assert(ctl->options.PagePrecedes(rhs / per_page, (lhs - 1 * per_page) / per_page)
|| (1U << 31) % per_page != 0); /* See CommitTsPagePrecedes() */
- Assert(ctl->PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
+ Assert(ctl->options.PagePrecedes((lhs + 1 * per_page) / per_page, rhs / per_page)
|| (1U << 31) % per_page != 0);
- Assert(ctl->PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
- Assert(ctl->PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
- Assert(!ctl->PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 2 * per_page) / per_page, rhs / per_page));
+ Assert(ctl->options.PagePrecedes((lhs + 3 * per_page) / per_page, rhs / per_page));
+ Assert(!ctl->options.PagePrecedes(rhs / per_page, (lhs + per_page) / per_page));
/*
* GetNewTransactionId() has assigned the last XID it can safely use, and
@@ -1727,7 +1747,7 @@ SlruPagePrecedesTestOffset(SlruCtl ctl, int per_page, uint32 offset)
* do not apply to them.)
*/
void
-SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
+SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page)
{
/* Test first, middle and last entries of a page. */
SlruPagePrecedesTestOffset(ctl, per_page, 0);
@@ -1742,7 +1762,7 @@ SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page)
* one containing the page passed as "data".
*/
bool
-SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1758,7 +1778,7 @@ SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes segments prior to the one passed in as "data".
*/
static bool
-SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
+SlruScanDirCbDeleteCutoff(SlruDesc *ctl, char *filename, int64 segpage,
void *data)
{
int64 cutoffPage = *(int64 *) data;
@@ -1774,7 +1794,7 @@ SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int64 segpage,
* This callback deletes all segments.
*/
bool
-SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
+SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
SlruInternalDeleteSegment(ctl, segpage / SLRU_PAGES_PER_SEGMENT);
@@ -1788,9 +1808,9 @@ SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage, void *data)
* SLRU segment.
*/
static inline bool
-SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
+SlruCorrectSegmentFilenameLength(SlruDesc *ctl, size_t len)
{
- if (ctl->long_segment_names)
+ if (ctl->options.long_segment_names)
return (len == 15); /* see SlruFileName() */
else
@@ -1821,7 +1841,7 @@ SlruCorrectSegmentFilenameLength(SlruCtl ctl, size_t len)
* Note that no locking is applied.
*/
bool
-SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
+SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data)
{
bool retval = false;
DIR *cldir;
@@ -1829,8 +1849,8 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
int64 segno;
int64 segpage;
- cldir = AllocateDir(ctl->Dir);
- while ((clde = ReadDir(cldir, ctl->Dir)) != NULL)
+ cldir = AllocateDir(ctl->options.Dir);
+ while ((clde = ReadDir(cldir, ctl->options.Dir)) != NULL)
{
size_t len;
@@ -1843,7 +1863,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
segpage = segno * SLRU_PAGES_PER_SEGMENT;
elog(DEBUG2, "SlruScanDirectory invoking callback on %s/%s",
- ctl->Dir, clde->d_name);
+ ctl->options.Dir, clde->d_name);
retval = callback(ctl, clde->d_name, segpage, data);
if (retval)
break;
@@ -1861,7 +1881,7 @@ SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data)
* performs the fsync.
*/
int
-SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path)
+SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path)
{
int fd;
int save_errno;
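The slru.c changes above move the per-SLRU callbacks and settings behind an `options` member of the descriptor, so call sites read `ctl->options.PagePrecedes(...)` instead of `ctl->PagePrecedes(...)`. The following is a minimal sketch of that shape, not the patch's actual definitions: `DemoSlruOpts`, `DemoSlruDesc`, `demo_page_precedes` and `DEMO_PAGES_PER_SEGMENT` are hypothetical names, and the comparator uses plain numeric ordering where a real one may be wraparound-aware. It mirrors the rule in SlruMayDeleteSegment() that a segment may be deleted only when both its first and last page precede the cutoff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical value standing in for SLRU_PAGES_PER_SEGMENT. */
#define DEMO_PAGES_PER_SEGMENT 32

/* Hypothetical options struct: per-SLRU settings and callbacks. */
typedef struct DemoSlruOpts
{
    const char *dir;
    bool        (*page_precedes) (int64_t page1, int64_t page2);
} DemoSlruOpts;

/* Hypothetical descriptor that embeds the options. */
typedef struct DemoSlruDesc
{
    DemoSlruOpts options;
} DemoSlruDesc;

/* Plain numeric ordering; a real callback may be wraparound-aware. */
static bool
demo_page_precedes(int64_t page1, int64_t page2)
{
    return page1 < page2;
}

/* True only if both the first and the last page precede the cutoff. */
static bool
demo_may_delete_segment(const DemoSlruDesc *desc, int64_t segpage,
                        int64_t cutoff_page)
{
    int64_t     seg_last_page = segpage + DEMO_PAGES_PER_SEGMENT - 1;

    /* The caller must pass the first page of a segment. */
    assert(segpage % DEMO_PAGES_PER_SEGMENT == 0);

    return desc->options.page_precedes(segpage, cutoff_page) &&
        desc->options.page_precedes(seg_last_page, cutoff_page);
}
```

With a cutoff inside a segment, the first page precedes the cutoff but the last page does not, so the segment is kept.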
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index c6ce71fc703..b79e648b899 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -33,6 +33,7 @@
#include "access/transam.h"
#include "miscadmin.h"
#include "pg_trace.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/snapmgr.h"
@@ -66,16 +67,22 @@ TransactionIdToPage(TransactionId xid)
#define TransactionIdToEntry(xid) ((xid) % (TransactionId) SUBTRANS_XACTS_PER_PAGE)
+static void SUBTRANSShmemRequest(void *arg);
+static void SUBTRANSShmemInit(void *arg);
+static bool SubTransPagePrecedes(int64 page1, int64 page2);
+static int subtrans_errdetail_for_io_error(const void *opaque_data);
+
+const ShmemCallbacks SUBTRANSShmemCallbacks = {
+ .request_fn = SUBTRANSShmemRequest,
+ .init_fn = SUBTRANSShmemInit,
+};
+
/*
* Link to shared-memory data structures for SUBTRANS control
*/
-static SlruCtlData SubTransCtlData;
-
-#define SubTransCtl (&SubTransCtlData)
+static SlruDesc SubTransSlruDesc;
-
-static bool SubTransPagePrecedes(int64 page1, int64 page2);
-static int subtrans_errdetail_for_io_error(const void *opaque_data);
+#define SubTransCtl (&SubTransSlruDesc)
/*
@@ -207,17 +214,13 @@ SUBTRANSShmemBuffers(void)
return Min(Max(16, subtransaction_buffers), SLRU_MAX_ALLOWED_BUFFERS);
}
+
+
/*
- * Initialization of shared memory for SUBTRANS
+ * Register shared memory for SUBTRANS
*/
-Size
-SUBTRANSShmemSize(void)
-{
- return SimpleLruShmemSize(SUBTRANSShmemBuffers(), 0);
-}
-
-void
-SUBTRANSShmemInit(void)
+static void
+SUBTRANSShmemRequest(void *arg)
{
/* If auto-tuning is requested, now is the time to do it */
if (subtransaction_buffers == 0)
@@ -240,11 +243,25 @@ SUBTRANSShmemInit(void)
}
Assert(subtransaction_buffers != 0);
- SubTransCtl->PagePrecedes = SubTransPagePrecedes;
- SubTransCtl->errdetail_for_io_error = subtrans_errdetail_for_io_error;
- SimpleLruInit(SubTransCtl, "subtransaction", SUBTRANSShmemBuffers(), 0,
- "pg_subtrans", LWTRANCHE_SUBTRANS_BUFFER,
- LWTRANCHE_SUBTRANS_SLRU, SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(.desc = &SubTransSlruDesc,
+ .name = "subtransaction",
+ .Dir = "pg_subtrans",
+ .long_segment_names = false,
+
+ .nslots = SUBTRANSShmemBuffers(),
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SubTransPagePrecedes,
+ .errdetail_for_io_error = subtrans_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SUBTRANS_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SUBTRANS_SLRU,
+ );
+}
+
+static void
+SUBTRANSShmemInit(void *arg)
+{
SlruPagePrecedesUnitTests(SubTransCtl, SUBTRANS_XACTS_PER_PAGE);
}
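The `SimpleLruRequest(.desc = ..., .name = ..., )` call above reads like a function with named arguments. The macro's definition is not part of this excerpt; one common way to get that call syntax in C99 is a variadic macro that expands its arguments into the designated initializers of a compound literal. The sketch below shows that technique under that assumption only. `DemoRequestOpts`, `demo_request_with_opts` and `DemoRequest` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical options struct; fields not named by the caller are zero. */
typedef struct DemoRequestOpts
{
    const char *name;
    size_t      nslots;
    int         long_segment_names;
} DemoRequestOpts;

#define DEMO_MAX_REQUESTS 8

static DemoRequestOpts demo_requests[DEMO_MAX_REQUESTS];
static int  demo_nrequests = 0;

/*
 * Copy the options into a table.  The compound literal built by the macro
 * only lives until the end of the enclosing block, so it must be copied.
 * Returns the index of the new entry, or -1 if it cannot be recorded.
 */
static int
demo_request_with_opts(const DemoRequestOpts *opts)
{
    assert(opts != NULL);

    if (demo_nrequests >= DEMO_MAX_REQUESTS || opts->name == NULL)
        return -1;
    demo_requests[demo_nrequests] = *opts;
    return demo_nrequests++;
}

/* The arguments become designated initializers of a compound literal. */
#define DemoRequest(...) \
    demo_request_with_opts(&(DemoRequestOpts){__VA_ARGS__})
```

A trailing comma before the closing parenthesis, as used in the patch, is harmless with this approach: it ends up as a trailing comma inside the initializer list, which C allows.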
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index e91a62ff42a..db6a9a6561b 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -179,6 +179,7 @@
#include "storage/latch.h"
#include "storage/lmgr.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/dsa.h"
@@ -345,6 +346,15 @@ typedef struct AsyncQueueControl
static AsyncQueueControl *asyncQueueControl;
+static void AsyncShmemRequest(void *arg);
+static void AsyncShmemInit(void *arg);
+
+const ShmemCallbacks AsyncShmemCallbacks = {
+ .request_fn = AsyncShmemRequest,
+ .init_fn = AsyncShmemInit,
+};
+
+
#define QUEUE_HEAD (asyncQueueControl->head)
#define QUEUE_TAIL (asyncQueueControl->tail)
#define QUEUE_STOP_PAGE (asyncQueueControl->stopPage)
@@ -359,9 +369,13 @@ static AsyncQueueControl *asyncQueueControl;
/*
* The SLRU buffer area through which we access the notification queue
*/
-static SlruCtlData NotifyCtlData;
+static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
+static int asyncQueueErrdetailForIoError(const void *opaque_data);
+
+static SlruDesc NotifySlruDesc;
-#define NotifyCtl (&NotifyCtlData)
+
+#define NotifyCtl (&NotifySlruDesc)
#define QUEUE_PAGESIZE BLCKSZ
#define QUEUE_FULL_WARN_INTERVAL 5000 /* warn at most once every 5s */
@@ -570,9 +584,7 @@ bool Trace_notify = false;
int max_notify_queue_pages = 1048576;
/* local function prototypes */
-static int asyncQueueErrdetailForIoError(const void *opaque_data);
static inline int64 asyncQueuePageDiff(int64 p, int64 q);
-static inline bool asyncQueuePagePrecedes(int64 p, int64 q);
static inline void GlobalChannelKeyInit(GlobalChannelKey *key, Oid dboid,
const char *channel);
static dshash_hash globalChannelTableHash(const void *key, size_t size,
@@ -780,78 +792,63 @@ initPendingListenActions(void)
}
/*
- * Report space needed for our shared memory area
+ * Register our shared memory needs
*/
-Size
-AsyncShmemSize(void)
+static void
+AsyncShmemRequest(void *arg)
{
Size size;
- /* This had better match AsyncShmemInit */
size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
size = add_size(size, offsetof(AsyncQueueControl, backend));
- size = add_size(size, SimpleLruShmemSize(notify_buffers, 0));
+ ShmemRequestStruct(.name = "Async Queue Control",
+ .size = size,
+ .ptr = (void **) &asyncQueueControl,
+ );
- return size;
-}
+ SimpleLruRequest(.desc = &NotifySlruDesc,
+ .name = "notify",
+ .Dir = "pg_notify",
-/*
- * Initialize our shared memory area
- */
-void
-AsyncShmemInit(void)
-{
- bool found;
- Size size;
+ /* long segment names are used in order to avoid wraparound */
+ .long_segment_names = true,
- /*
- * Create or attach to the AsyncQueueControl structure.
- */
- size = mul_size(MaxBackends, sizeof(QueueBackendStatus));
- size = add_size(size, offsetof(AsyncQueueControl, backend));
+ .nslots = notify_buffers,
- asyncQueueControl = (AsyncQueueControl *)
- ShmemInitStruct("Async Queue Control", size, &found);
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = asyncQueuePagePrecedes,
+ .errdetail_for_io_error = asyncQueueErrdetailForIoError,
- if (!found)
+ .buffer_tranche_id = LWTRANCHE_NOTIFY_BUFFER,
+ .bank_tranche_id = LWTRANCHE_NOTIFY_SLRU,
+ );
+}
+
+static void
+AsyncShmemInit(void *arg)
+{
+ SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
+ SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
+ QUEUE_STOP_PAGE = 0;
+ QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
+ asyncQueueControl->lastQueueFillWarn = 0;
+ asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
+ asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
+ for (int i = 0; i < MaxBackends; i++)
{
- /* First time through, so initialize it */
- SET_QUEUE_POS(QUEUE_HEAD, 0, 0);
- SET_QUEUE_POS(QUEUE_TAIL, 0, 0);
- QUEUE_STOP_PAGE = 0;
- QUEUE_FIRST_LISTENER = INVALID_PROC_NUMBER;
- asyncQueueControl->lastQueueFillWarn = 0;
- asyncQueueControl->globalChannelTableDSA = DSA_HANDLE_INVALID;
- asyncQueueControl->globalChannelTableDSH = DSHASH_HANDLE_INVALID;
- for (int i = 0; i < MaxBackends; i++)
- {
- QUEUE_BACKEND_PID(i) = InvalidPid;
- QUEUE_BACKEND_DBOID(i) = InvalidOid;
- QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
- SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
- QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
- QUEUE_BACKEND_IS_ADVANCING(i) = false;
- }
+ QUEUE_BACKEND_PID(i) = InvalidPid;
+ QUEUE_BACKEND_DBOID(i) = InvalidOid;
+ QUEUE_NEXT_LISTENER(i) = INVALID_PROC_NUMBER;
+ SET_QUEUE_POS(QUEUE_BACKEND_POS(i), 0, 0);
+ QUEUE_BACKEND_WAKEUP_PENDING(i) = false;
+ QUEUE_BACKEND_IS_ADVANCING(i) = false;
}
/*
- * Set up SLRU management of the pg_notify data. Note that long segment
- * names are used in order to avoid wraparound.
+ * During start or reboot, clean out the pg_notify directory.
*/
- NotifyCtl->PagePrecedes = asyncQueuePagePrecedes;
- NotifyCtl->errdetail_for_io_error = asyncQueueErrdetailForIoError;
- SimpleLruInit(NotifyCtl, "notify", notify_buffers, 0,
- "pg_notify", LWTRANCHE_NOTIFY_BUFFER, LWTRANCHE_NOTIFY_SLRU,
- SYNC_HANDLER_NONE, true);
-
- if (!found)
- {
- /*
- * During start or reboot, clean out the pg_notify directory.
- */
- (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
- }
+ (void) SlruScanDirectory(NotifyCtl, SlruScanDirCbDeleteAll, NULL);
}
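The async.c changes above replace a size function plus a create-or-attach function with two callbacks: a request callback that only declares what is needed, and an init callback that fills in the memory once it exists. The following is a simplified, single-process sketch of that two-phase flow. Every name in it is hypothetical, `calloc` stands in for the shared memory segment, and the real mechanism in the patch is not shown in this excerpt and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct DemoCallbacks
{
    void        (*request_fn) (void *arg);
    void        (*init_fn) (void *arg);
} DemoCallbacks;

typedef struct DemoStructRequest
{
    size_t      size;
    void       *ptr_location;   /* address of the caller's pointer variable */
} DemoStructRequest;

#define DEMO_MAX_STRUCTS 8
#define DEMO_ALIGN(n) (((n) + 15) & ~(size_t) 15)

static DemoStructRequest demo_struct_requests[DEMO_MAX_STRUCTS];
static int  demo_nstructs = 0;

/* Phase 1 helper: record the need, allocate nothing yet. */
static void
demo_request_struct(size_t size, void *ptr_location)
{
    assert(demo_nstructs < DEMO_MAX_STRUCTS);
    demo_struct_requests[demo_nstructs].size = DEMO_ALIGN(size);
    demo_struct_requests[demo_nstructs].ptr_location = ptr_location;
    demo_nstructs++;
}

/* Run both phases; returns bytes allocated, or 0 on failure. */
static size_t
demo_setup(const DemoCallbacks *cb)
{
    size_t      total = 0;
    size_t      offset = 0;
    char       *base;

    demo_nstructs = 0;
    cb->request_fn(NULL);

    for (int i = 0; i < demo_nstructs; i++)
        total += demo_struct_requests[i].size;
    if (total == 0)
        return 0;

    /* The block is never freed in this sketch, like a shared segment. */
    base = calloc(1, total);
    if (base == NULL)
        return 0;

    for (int i = 0; i < demo_nstructs; i++)
    {
        void       *addr = base + offset;

        /* Assumes all object pointers share one representation. */
        memcpy(demo_struct_requests[i].ptr_location, &addr, sizeof(addr));
        offset += demo_struct_requests[i].size;
    }

    /* Phase 2: the pointers are valid now, so contents can be set up. */
    if (cb->init_fn != NULL)
        cb->init_fn(NULL);
    return total;
}

/* A hypothetical subsystem using the two callbacks. */
typedef struct DemoQueueControl
{
    int         head;
    int         tail;
    int         first_listener;
} DemoQueueControl;

static DemoQueueControl *demo_queue = NULL;

static void
demo_queue_request(void *arg)
{
    (void) arg;
    demo_request_struct(sizeof(DemoQueueControl), &demo_queue);
}

static void
demo_queue_init(void *arg)
{
    (void) arg;
    demo_queue->head = 0;
    demo_queue->tail = 0;
    demo_queue->first_listener = -1;
}

static const DemoCallbacks demo_queue_callbacks = {
    .request_fn = demo_queue_request,
    .init_fn = demo_queue_init,
};
```

Because sizes are recorded from the same calls that later receive the pointers, there is no separate size function that has to be kept in agreement with the initialization code by hand.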
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 4f707158303..7a8c69de802 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -101,16 +101,11 @@ CalculateShmemSize(void)
/* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
- size = add_size(size, PredicateLockShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, CLOGShmemSize());
- size = add_size(size, CommitTsShmemSize());
- size = add_size(size, SUBTRANSShmemSize());
size = add_size(size, TwoPhaseShmemSize());
size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, MultiXactShmemSize());
size = add_size(size, BackendStatusShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
@@ -123,7 +118,6 @@ CalculateShmemSize(void)
size = add_size(size, ApplyLauncherShmemSize());
size = add_size(size, BTreeShmemSize());
size = add_size(size, SyncScanShmemSize());
- size = add_size(size, AsyncShmemSize());
size = add_size(size, StatsShmemSize());
size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, InjectionPointShmemSize());
@@ -270,10 +264,6 @@ CreateOrAttachShmemStructs(void)
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
- CLOGShmemInit();
- CommitTsShmemInit();
- SUBTRANSShmemInit();
- MultiXactShmemInit();
BufferManagerShmemInit();
/*
@@ -281,11 +271,6 @@ CreateOrAttachShmemStructs(void)
*/
LockManagerShmemInit();
- /*
- * Set up predicate lock manager
- */
- PredicateLockShmemInit();
-
/*
* Set up process table
*/
@@ -313,7 +298,6 @@ CreateOrAttachShmemStructs(void)
*/
BTreeShmemInit();
SyncScanShmemInit();
- AsyncShmemInit();
StatsShmemInit();
WaitEventCustomShmemInit();
InjectionPointShmemInit();
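The size computations touched in this patch (for example `mul_size(MaxBackends, sizeof(QueueBackendStatus))` followed by `add_size(...)`) rely on overflow-checked arithmetic. As a rough illustration of the idea, here is a self-contained sketch with hypothetical helper names. Unlike PostgreSQL's own `add_size()` and `mul_size()`, which report an error on overflow, these helpers return a flag; that is a choice made for this sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Store a + b in *result; return false if the sum would overflow. */
static bool
demo_add_size(size_t a, size_t b, size_t *result)
{
    if (a > SIZE_MAX - b)
        return false;
    *result = a + b;
    return true;
}

/* Store a * b in *result; return false if the product would overflow. */
static bool
demo_mul_size(size_t a, size_t b, size_t *result)
{
    if (a != 0 && b > SIZE_MAX / a)
        return false;
    *result = a * b;
    return true;
}

/* Size of a header followed by nentries elements of entry_size bytes. */
static bool
demo_array_struct_size(size_t header, size_t nentries, size_t entry_size,
                       size_t *result)
{
    size_t      body;

    assert(result != NULL);

    return demo_mul_size(nentries, entry_size, &body) &&
        demo_add_size(header, body, result);
}
```

The header-plus-array helper corresponds to the pattern used for a struct that ends in a per-backend array.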
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 601c618d86c..67fad32be82 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -131,6 +131,7 @@
#include <unistd.h>
+#include "access/slru.h"
#include "common/int.h"
#include "fmgr.h"
#include "funcapi.h"
@@ -548,6 +549,9 @@ InitShmemIndexEntry(ShmemRequest *request)
case SHMEM_KIND_HASH:
shmem_hash_init(structPtr, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_init(structPtr, request->options);
+ break;
}
}
@@ -601,6 +605,9 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
case SHMEM_KIND_HASH:
shmem_hash_attach(index_entry->location, request->options);
break;
+ case SHMEM_KIND_SLRU:
+ shmem_slru_attach(index_entry->location, request->options);
+ break;
}
return true;
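The two shmem.c hunks above add an SLRU case to both the init path and the attach path. The distinction can be sketched as follows: init writes the shared contents and sets the local pointer, while attach only sets the local pointer to an area that already exists. This is an illustration under assumptions, with hypothetical types and functions; it is not the patch's `shmem_slru_init()` or `shmem_slru_attach()`.

```c
#include <assert.h>
#include <stddef.h>

typedef enum DemoKind
{
    DEMO_KIND_STRUCT,
    DEMO_KIND_SLRU
} DemoKind;

/* Lives in the shared area; written once by the creating process. */
typedef struct DemoSlruShared
{
    int         nslots;
} DemoSlruShared;

/* Process-local descriptor pointing into the shared area. */
typedef struct DemoSlruDesc
{
    DemoSlruShared *shared;
    int         nslots;         /* requested size, used only at creation */
} DemoSlruDesc;

typedef struct DemoRequest
{
    DemoKind    kind;
    void       *options;        /* DemoSlruDesc * for DEMO_KIND_SLRU */
} DemoRequest;

/* Creation: fill in the shared contents and point the descriptor at them. */
static void
demo_slru_init(void *location, DemoSlruDesc *desc)
{
    desc->shared = location;
    desc->shared->nslots = desc->nslots;
}

/* Attach: the contents already exist, so only set the local pointer. */
static void
demo_slru_attach(void *location, DemoSlruDesc *desc)
{
    desc->shared = location;
}

static void
demo_init_entry(void *location, const DemoRequest *request)
{
    assert(location != NULL);

    switch (request->kind)
    {
        case DEMO_KIND_STRUCT:
            break;              /* nothing kind-specific in this sketch */
        case DEMO_KIND_SLRU:
            demo_slru_init(location, request->options);
            break;
    }
}

static void
demo_attach_entry(void *location, const DemoRequest *request)
{
    assert(location != NULL);

    switch (request->kind)
    {
        case DEMO_KIND_STRUCT:
            break;
        case DEMO_KIND_SLRU:
            demo_slru_attach(location, request->options);
            break;
    }
}
```

A descriptor that attaches sees the slot count written by the creator, even though its own requested value is never applied.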
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index b509fbb2759..e063a701ef1 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -152,10 +152,6 @@
/*
* INTERFACE ROUTINES
*
- * housekeeping for setting up shared memory predicate lock structures
- * PredicateLockShmemInit(void)
- * PredicateLockShmemSize(void)
- *
* predicate lock reporting
* GetPredicateLockStatusData(void)
* PageIsPredicateLocked(Relation relation, BlockNumber blkno)
@@ -211,6 +207,8 @@
#include "storage/predicate_internals.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc_hooks.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -322,9 +320,12 @@
/*
* The SLRU buffer area through which we access the old xids.
*/
-static SlruCtlData SerialSlruCtlData;
+static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
+static int serial_errdetail_for_io_error(const void *opaque_data);
-#define SerialSlruCtl (&SerialSlruCtlData)
+static SlruDesc SerialSlruDesc;
+
+#define SerialSlruCtl (&SerialSlruDesc)
#define SERIAL_PAGESIZE BLCKSZ
#define SERIAL_ENTRYSIZE sizeof(SerCommitSeqNo)
@@ -384,6 +385,17 @@ int max_predicate_locks_per_page; /* in guc_tables.c */
*/
static PredXactList PredXact;
+static void PredicateLockShmemRequest(void *arg);
+static void PredicateLockShmemInit(void *arg);
+static void PredicateLockShmemAttach(void *arg);
+
+const ShmemCallbacks PredicateLockShmemCallbacks = {
+ .request_fn = PredicateLockShmemRequest,
+ .init_fn = PredicateLockShmemInit,
+ .attach_fn = PredicateLockShmemAttach,
+};
+
+
/*
* This provides a pool of RWConflict data elements to use in conflict lists
* between transactions.
@@ -431,6 +443,8 @@ static bool MyXactDidWrite = false;
*/
static SERIALIZABLEXACT *SavedSerializableXact = InvalidSerializableXact;
+static int64 max_serializable_xacts;
+
/* local functions */
static SERIALIZABLEXACT *CreatePredXact(void);
@@ -442,13 +456,12 @@ static void SetPossibleUnsafeConflict(SERIALIZABLEXACT *roXact, SERIALIZABLEXACT
static void ReleaseRWConflict(RWConflict conflict);
static void FlagSxactUnsafe(SERIALIZABLEXACT *sxact);
-static bool SerialPagePrecedesLogically(int64 page1, int64 page2);
-static int serial_errdetail_for_io_error(const void *opaque_data);
static void SerialAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo);
static SerCommitSeqNo SerialGetMinConflictCommitSeqNo(TransactionId xid);
static void SerialSetActiveSerXmin(TransactionId xid);
static uint32 predicatelock_hash(const void *key, Size keysize);
+
static void SummarizeOldestCommittedSxact(void);
static Snapshot GetSafeSnapshot(Snapshot origSnapshot);
static Snapshot GetSerializableTransactionSnapshotInt(Snapshot snapshot,
@@ -1100,71 +1113,53 @@ CheckPointPredicate(void)
/*------------------------------------------------------------------------*/
/*
- * PredicateLockShmemInit -- Initialize the predicate locking data structures.
- *
- * This is called from CreateSharedMemoryAndSemaphores(), which see for
- * more comments. In the normal postmaster case, the shared hash tables
- * are created here. Backends inherit the pointers
- * to the shared tables via fork(). In the EXEC_BACKEND case, each
- * backend re-executes this code to obtain pointers to the already existing
- * shared hash tables.
+ * PredicateLockShmemRequest -- Register the predicate locking data structures.
*/
-void
-PredicateLockShmemInit(void)
+static void
+PredicateLockShmemRequest(void *arg)
{
- HASHCTL info;
int64 max_predicate_lock_targets;
int64 max_predicate_locks;
- int64 max_serializable_xacts;
int64 max_rw_conflicts;
- Size requestSize;
- bool found;
-
-#ifndef EXEC_BACKEND
- Assert(!IsUnderPostmaster);
-#endif
/*
- * Compute size of predicate lock target hashtable. Note these
- * calculations must agree with PredicateLockShmemSize!
+ * Compute the table sizes and register the hash tables and structs used
+ * for predicate locking. This function only registers the areas; the
+ * contents of the registered areas are set up later, in
+ * PredicateLockShmemInit().
*/
max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
/*
- * Allocate hash table for PREDICATELOCKTARGET structs. This stores
+ * Register hash table for PREDICATELOCKTARGET structs. This stores
* per-predicate-lock-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTARGETTAG);
- info.entrysize = sizeof(PREDICATELOCKTARGET);
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
-
- PredicateLockTargetHash = ShmemInitHash("PREDICATELOCKTARGET hash",
- max_predicate_lock_targets,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_PARTITION | HASH_FIXED_SIZE);
-
- /* Pre-calculate the hash and partition lock of the scratch entry */
- ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
- ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
+ ShmemRequestHash(.name = "PREDICATELOCKTARGET hash",
+ .nelems = max_predicate_lock_targets,
+ .ptr = &PredicateLockTargetHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTARGETTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCKTARGET),
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Allocate hash table for PREDICATELOCK structs. This stores per
* xact-lock-of-a-target information.
*/
- info.keysize = sizeof(PREDICATELOCKTAG);
- info.entrysize = sizeof(PREDICATELOCK);
- info.hash = predicatelock_hash;
- info.num_partitions = NUM_PREDICATELOCK_PARTITIONS;
/* Assume an average of 2 xacts per target */
max_predicate_locks = max_predicate_lock_targets * 2;
- PredicateLockHash = ShmemInitHash("PREDICATELOCK hash",
- max_predicate_locks,
- &info,
- HASH_ELEM | HASH_FUNCTION |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ ShmemRequestHash(.name = "PREDICATELOCK hash",
+ .nelems = max_predicate_locks,
+ .ptr = &PredicateLockHash,
+ .hash_info.keysize = sizeof(PREDICATELOCKTAG),
+ .hash_info.entrysize = sizeof(PREDICATELOCK),
+ .hash_info.hash = predicatelock_hash,
+ .hash_info.num_partitions = NUM_PREDICATELOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
/*
* Compute size for serializable transaction hashtable. Note these
@@ -1177,29 +1172,27 @@ PredicateLockShmemInit(void)
max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
/*
- * Allocate a list to hold information on transactions participating in
+ * Register a list to hold information on transactions participating in
* predicate locking.
*/
- requestSize = add_size(PredXactListDataSize,
- (mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT))));
- PredXact = ShmemInitStruct("PredXactList",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(.name = "PredXactList",
+ .size = add_size(PredXactListDataSize,
+ (mul_size((Size) max_serializable_xacts,
+ sizeof(SERIALIZABLEXACT)))),
+ .ptr = (void **) &PredXact,
+ );
/*
- * Allocate hash table for SERIALIZABLEXID structs. This stores per-xid
+ * Register hash table for SERIALIZABLEXID structs. This stores per-xid
* information for serializable transactions which have accessed data.
*/
- info.keysize = sizeof(SERIALIZABLEXIDTAG);
- info.entrysize = sizeof(SERIALIZABLEXID);
-
- SerializableXidHash = ShmemInitHash("SERIALIZABLEXID hash",
- max_serializable_xacts,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_FIXED_SIZE);
+ ShmemRequestHash(.name = "SERIALIZABLEXID hash",
+ .nelems = max_serializable_xacts,
+ .ptr = &SerializableXidHash,
+ .hash_info.keysize = sizeof(SERIALIZABLEXIDTAG),
+ .hash_info.entrysize = sizeof(SERIALIZABLEXID),
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_FIXED_SIZE,
+ );
/*
* Allocate space for tracking rw-conflicts in lists attached to the
@@ -1214,58 +1207,50 @@ PredicateLockShmemInit(void)
*/
max_rw_conflicts = max_serializable_xacts * 5;
- requestSize = RWConflictPoolHeaderDataSize +
- mul_size((Size) max_rw_conflicts,
- RWConflictDataSize);
-
- RWConflictPool = ShmemInitStruct("RWConflictPool",
- requestSize,
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(.name = "RWConflictPool",
+ .size = RWConflictPoolHeaderDataSize + mul_size((Size) max_rw_conflicts,
+ RWConflictDataSize),
+ .ptr = (void **) &RWConflictPool,
+ );
- /*
- * Create or attach to the header for the list of finished serializable
- * transactions.
- */
- FinishedSerializableTransactions = (dlist_head *)
- ShmemInitStruct("FinishedSerializableTransactions",
- sizeof(dlist_head),
- &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(.name = "FinishedSerializableTransactions",
+ .size = sizeof(dlist_head),
+ .ptr = (void **) &FinishedSerializableTransactions,
+ );
/*
* Initialize the SLRU storage for old committed serializable
* transactions.
*/
- SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically;
- SerialSlruCtl->errdetail_for_io_error = serial_errdetail_for_io_error;
- SimpleLruInit(SerialSlruCtl, "serializable",
- serializable_buffers, 0, "pg_serial",
- LWTRANCHE_SERIAL_BUFFER, LWTRANCHE_SERIAL_SLRU,
- SYNC_HANDLER_NONE, false);
+ SimpleLruRequest(.desc = &SerialSlruDesc,
+ .name = "serializable",
+ .Dir = "pg_serial",
+ .long_segment_names = false,
+
+ .nslots = serializable_buffers,
+
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = SerialPagePrecedesLogically,
+ .errdetail_for_io_error = serial_errdetail_for_io_error,
+
+ .buffer_tranche_id = LWTRANCHE_SERIAL_BUFFER,
+ .bank_tranche_id = LWTRANCHE_SERIAL_SLRU,
+ );
#ifdef USE_ASSERT_CHECKING
SerialPagePrecedesLogicallyUnitTests();
#endif
- SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
- /*
- * Create or attach to the SerialControl structure.
- */
- serialControl = (SerialControl)
- ShmemInitStruct("SerialControlData", sizeof(SerialControlData), &found);
- Assert(found == IsUnderPostmaster);
+ ShmemRequestStruct(.name = "SerialControlData",
+ .size = sizeof(SerialControlData),
+ .ptr = (void **) &serialControl,
+ );
+}
- /*
- * If we just attached to existing shared memory (EXEC_BACKEND), we're all
- * done. Otherwise, during postmaster startup, proceed to initialize all
- * the shared memory areas that we allocated.
- */
- if (IsUnderPostmaster)
- {
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
- return;
- }
+static void
+PredicateLockShmemInit(void *arg)
+{
+ int max_rw_conflicts;
+ bool found;
/*
* Reserve a dummy entry in the hash table; we use it to make sure there's
@@ -1277,7 +1262,6 @@ PredicateLockShmemInit(void)
HASH_ENTER, &found);
Assert(!found);
- /* Initialize PredXact list */
dlist_init(&PredXact->availableList);
dlist_init(&PredXact->activeList);
PredXact->SxactGlobalXmin = InvalidTransactionId;
@@ -1312,13 +1296,13 @@ PredicateLockShmemInit(void)
PredXact->OldCommittedSxact->pid = 0;
PredXact->OldCommittedSxact->pgprocno = INVALID_PROC_NUMBER;
- /* This never changes, so let's keep a local copy. */
- OldCommittedSxact = PredXact->OldCommittedSxact;
-
/* Initialize the rw-conflict pool */
dlist_init(&RWConflictPool->availableList);
RWConflictPool->element = (RWConflict) ((char *) RWConflictPool +
RWConflictPoolHeaderDataSize);
+
+ max_rw_conflicts = max_serializable_xacts * 5;
+
/* Add all elements to available list, clean. */
for (int i = 0; i < max_rw_conflicts; i++)
{
@@ -1335,57 +1319,28 @@ PredicateLockShmemInit(void)
serialControl->headXid = InvalidTransactionId;
serialControl->tailXid = InvalidTransactionId;
LWLockRelease(SerialControlLock);
-}
-/*
- * Estimate shared-memory space used for predicate lock table
- */
-Size
-PredicateLockShmemSize(void)
-{
- Size size = 0;
- int64 max_predicate_lock_targets;
- int64 max_predicate_locks;
- int64 max_serializable_xacts;
- int64 max_rw_conflicts;
-
- /* predicate lock target hash table */
- max_predicate_lock_targets = NPREDICATELOCKTARGETENTS();
- size = add_size(size, hash_estimate_size(max_predicate_lock_targets,
- sizeof(PREDICATELOCKTARGET)));
-
- /* predicate lock hash table */
- max_predicate_locks = max_predicate_lock_targets * 2;
- size = add_size(size, hash_estimate_size(max_predicate_locks,
- sizeof(PREDICATELOCK)));
-
- /* transaction list */
- max_serializable_xacts = (MaxBackends + max_prepared_xacts) * 10;
- size = add_size(size, PredXactListDataSize);
- size = add_size(size, mul_size((Size) max_serializable_xacts,
- sizeof(SERIALIZABLEXACT)));
-
- /* transaction xid table */
- size = add_size(size, hash_estimate_size(max_serializable_xacts,
- sizeof(SERIALIZABLEXID)));
+ SlruPagePrecedesUnitTests(SerialSlruCtl, SERIAL_ENTRIESPERPAGE);
- /* rw-conflict pool */
- max_rw_conflicts = max_serializable_xacts * 5;
- size = add_size(size, RWConflictPoolHeaderDataSize);
- size = add_size(size, mul_size((Size) max_rw_conflicts,
- RWConflictDataSize));
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- /* Head for list of finished serializable transactions. */
- size = add_size(size, sizeof(dlist_head));
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
+}
- /* Shared memory structures for SLRU tracking of old committed xids. */
- size = add_size(size, sizeof(SerialControlData));
- size = add_size(size, SimpleLruShmemSize(serializable_buffers, 0));
+static void
+PredicateLockShmemAttach(void *arg)
+{
+ /* This never changes, so let's keep a local copy. */
+ OldCommittedSxact = PredXact->OldCommittedSxact;
- return size;
+ /* Pre-calculate the hash and partition lock of the scratch entry */
+ ScratchTargetTagHash = PredicateLockTargetTagHashCode(&ScratchTargetTag);
+ ScratchPartitionLock = PredicateLockHashPartitionLock(ScratchTargetTagHash);
}
-
/*
* Compute the hash code associated with a PREDICATELOCKTAG.
*
diff --git a/src/backend/utils/activity/pgstat_slru.c b/src/backend/utils/activity/pgstat_slru.c
index 2190f388eae..f4dfe8697d7 100644
--- a/src/backend/utils/activity/pgstat_slru.c
+++ b/src/backend/utils/activity/pgstat_slru.c
@@ -119,6 +119,7 @@ pgstat_get_slru_index(const char *name)
{
int i;
+ Assert(name);
for (i = 0; i < SLRU_NUM_ELEMENTS; i++)
{
if (strcmp(slru_names[i], name) == 0)
diff --git a/src/include/access/clog.h b/src/include/access/clog.h
index a1cfed5f43c..7894998c763 100644
--- a/src/include/access/clog.h
+++ b/src/include/access/clog.h
@@ -40,8 +40,6 @@ extern void TransactionIdSetTreeStatus(TransactionId xid, int nsubxids,
TransactionId *subxids, XidStatus status, XLogRecPtr lsn);
extern XidStatus TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn);
-extern Size CLOGShmemSize(void);
-extern void CLOGShmemInit(void);
extern void BootStrapCLOG(void);
extern void StartupCLOG(void);
extern void TrimCLOG(void);
diff --git a/src/include/access/commit_ts.h b/src/include/access/commit_ts.h
index 49ee21cd5d2..825ccda90ed 100644
--- a/src/include/access/commit_ts.h
+++ b/src/include/access/commit_ts.h
@@ -27,8 +27,6 @@ extern bool TransactionIdGetCommitTsData(TransactionId xid,
extern TransactionId GetLatestCommitTsData(TimestampTz *ts,
ReplOriginId *nodeid);
-extern Size CommitTsShmemSize(void);
-extern void CommitTsShmemInit(void);
extern void BootStrapCommitTs(void);
extern void StartupCommitTs(void);
extern void CommitTsParameterChange(bool newvalue, bool oldvalue);
diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h
index 2ae8b571dcc..6be5299ab68 100644
--- a/src/include/access/multixact.h
+++ b/src/include/access/multixact.h
@@ -121,8 +121,6 @@ extern void AtEOXact_MultiXact(void);
extern void AtPrepare_MultiXact(void);
extern void PostPrepare_MultiXact(FullTransactionId fxid);
-extern Size MultiXactShmemSize(void);
-extern void MultiXactShmemInit(void);
extern void BootStrapMultiXact(void);
extern void StartupMultiXact(void);
extern void TrimMultiXact(void);
diff --git a/src/include/access/slru.h b/src/include/access/slru.h
index f966d0d9fe7..36a7514d7a0 100644
--- a/src/include/access/slru.h
+++ b/src/include/access/slru.h
@@ -16,6 +16,7 @@
#include "access/transam.h"
#include "access/xlogdefs.h"
#include "storage/lwlock.h"
+#include "storage/shmem.h"
#include "storage/sync.h"
/*
@@ -106,23 +107,28 @@ typedef struct SlruSharedData
typedef SlruSharedData *SlruShared;
-/*
- * SlruCtlData is an unshared structure that points to the active information
- * in shared memory.
- */
-typedef struct SlruCtlData
+typedef struct SlruDesc SlruDesc;
+
+typedef struct SlruOpts
{
- SlruShared shared;
+ ShmemStructOpts base;
- /* Number of banks in this SLRU. */
- uint16 nbanks;
+ /*
+ * name of SLRU. (This is user-visible, pick with care!)
+ */
+ const char *name;
/*
- * If true, use long segment file names. Otherwise, use short file names.
- *
- * For details about the file name format, see SlruFileName().
+ * Pointer to a backend-private handle for the SLRU. It is initialized
+ * when the SLRU is initialized or attached to.
*/
- bool long_segment_names;
+ SlruDesc *desc;
+
+ /* number of page slots to use. */
+ int nslots;
+
+ /* number of LSN groups per page (set to zero if not relevant). */
+ int nlsns;
/*
* Which sync handler function to use when handing sync requests over to
@@ -130,6 +136,19 @@ typedef struct SlruCtlData
*/
SyncRequestHandler sync_handler;
+ /*
+ * PGDATA-relative subdirectory that will contain the files.
+ */
+ const char *Dir;
+
+ /*
+ * If true, use long segment file names. Otherwise, use short file names.
+ *
+ * For details about the file name format, see SlruFileName().
+ */
+ bool long_segment_names;
+
+
/*
* Decide whether a page is "older" for truncation and as a hint for
* evicting pages in LRU order. Return true if every entry of the first
@@ -153,13 +172,26 @@ typedef struct SlruCtlData
int (*errdetail_for_io_error) (const void *opaque_data);
/*
- * Dir is set during SimpleLruInit and does not change thereafter. Since
- * it's always the same, it doesn't need to be in shared memory.
+ * Tranche IDs to use for the SLRU's per-buffer and per-bank LWLocks. If
+ * these are left as zeros, new tranches will be assigned dynamically.
*/
- char Dir[64];
-} SlruCtlData;
+ int buffer_tranche_id;
+ int bank_tranche_id;
+} SlruOpts;
-typedef SlruCtlData *SlruCtl;
+/*
+ * SlruDesc is an unshared structure that points to the active information
+ * in shared memory.
+ */
+typedef struct SlruDesc
+{
+ SlruOpts options;
+
+ SlruShared shared;
+
+ /* Number of banks in this SLRU. */
+ uint16 nbanks;
+} SlruDesc;
/*
* Get the SLRU bank lock for given SlruCtl and the pageno.
@@ -168,48 +200,52 @@ typedef SlruCtlData *SlruCtl;
* respective bank.
*/
static inline LWLock *
-SimpleLruGetBankLock(SlruCtl ctl, int64 pageno)
+SimpleLruGetBankLock(SlruDesc *ctl, int64 pageno)
{
int bankno;
+ Assert(ctl->nbanks != 0);
bankno = pageno % ctl->nbanks;
return &(ctl->shared->bank_locks[bankno].lock);
}
-extern Size SimpleLruShmemSize(int nslots, int nlsns);
+extern void SimpleLruRequestWithOpts(const SlruOpts *options);
+
+#define SimpleLruRequest(...) \
+ SimpleLruRequestWithOpts(&(SlruOpts){__VA_ARGS__})
+
extern int SimpleLruAutotuneBuffers(int divisor, int max);
-extern void SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
- const char *subdir, int buffer_tranche_id,
- int bank_tranche_id, SyncRequestHandler sync_handler,
- bool long_segment_names);
-extern int SimpleLruZeroPage(SlruCtl ctl, int64 pageno);
-extern void SimpleLruZeroAndWritePage(SlruCtl ctl, int64 pageno);
-extern int SimpleLruReadPage(SlruCtl ctl, int64 pageno, bool write_ok,
+extern int SimpleLruZeroPage(SlruDesc *ctl, int64 pageno);
+extern void SimpleLruZeroAndWritePage(SlruDesc *ctl, int64 pageno);
+extern int SimpleLruReadPage(SlruDesc *ctl, int64 pageno, bool write_ok,
const void *opaque_data);
-extern int SimpleLruReadPage_ReadOnly(SlruCtl ctl, int64 pageno,
+extern int SimpleLruReadPage_ReadOnly(SlruDesc *ctl, int64 pageno,
const void *opaque_data);
-extern void SimpleLruWritePage(SlruCtl ctl, int slotno);
-extern void SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied);
+extern void SimpleLruWritePage(SlruDesc *ctl, int slotno);
+extern void SimpleLruWriteAll(SlruDesc *ctl, bool allow_redirtied);
#ifdef USE_ASSERT_CHECKING
-extern void SlruPagePrecedesUnitTests(SlruCtl ctl, int per_page);
+extern void SlruPagePrecedesUnitTests(SlruDesc *ctl, int per_page);
#else
#define SlruPagePrecedesUnitTests(ctl, per_page) do {} while (0)
#endif
-extern void SimpleLruTruncate(SlruCtl ctl, int64 cutoffPage);
-extern bool SimpleLruDoesPhysicalPageExist(SlruCtl ctl, int64 pageno);
+extern void SimpleLruTruncate(SlruDesc *ctl, int64 cutoffPage);
+extern bool SimpleLruDoesPhysicalPageExist(SlruDesc *ctl, int64 pageno);
-typedef bool (*SlruScanCallback) (SlruCtl ctl, char *filename, int64 segpage,
+typedef bool (*SlruScanCallback) (SlruDesc *ctl, char *filename, int64 segpage,
void *data);
-extern bool SlruScanDirectory(SlruCtl ctl, SlruScanCallback callback, void *data);
-extern void SlruDeleteSegment(SlruCtl ctl, int64 segno);
+extern bool SlruScanDirectory(SlruDesc *ctl, SlruScanCallback callback, void *data);
+extern void SlruDeleteSegment(SlruDesc *ctl, int64 segno);
-extern int SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path);
+extern int SlruSyncFileTag(SlruDesc *ctl, const FileTag *ftag, char *path);
/* SlruScanDirectory public callbacks */
-extern bool SlruScanDirCbReportPresence(SlruCtl ctl, char *filename,
+extern bool SlruScanDirCbReportPresence(SlruDesc *ctl, char *filename,
int64 segpage, void *data);
-extern bool SlruScanDirCbDeleteAll(SlruCtl ctl, char *filename, int64 segpage,
+extern bool SlruScanDirCbDeleteAll(SlruDesc *ctl, char *filename, int64 segpage,
void *data);
extern bool check_slru_buffers(const char *name, int *newval);
+extern void shmem_slru_init(void *location, ShmemStructOpts *options);
+extern void shmem_slru_attach(void *location, ShmemStructOpts *options);
+
#endif /* SLRU_H */
diff --git a/src/include/access/subtrans.h b/src/include/access/subtrans.h
index 11b7355dbdf..d986cd9e802 100644
--- a/src/include/access/subtrans.h
+++ b/src/include/access/subtrans.h
@@ -15,8 +15,6 @@ extern void SubTransSetParent(TransactionId xid, TransactionId parent);
extern TransactionId SubTransGetParent(TransactionId xid);
extern TransactionId SubTransGetTopmostTransaction(TransactionId xid);
-extern Size SUBTRANSShmemSize(void);
-extern void SUBTRANSShmemInit(void);
extern void BootStrapSUBTRANS(void);
extern void StartupSUBTRANS(TransactionId oldestActiveXID);
extern void CheckPointSUBTRANS(void);
diff --git a/src/include/commands/async.h b/src/include/commands/async.h
index 3baae7cb8dc..202e4aa5e74 100644
--- a/src/include/commands/async.h
+++ b/src/include/commands/async.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT bool Trace_notify;
extern PGDLLIMPORT int max_notify_queue_pages;
extern PGDLLIMPORT volatile sig_atomic_t notifyInterruptPending;
-extern Size AsyncShmemSize(void);
-extern void AsyncShmemInit(void);
-
extern void NotifyMyFrontEnd(const char *channel,
const char *payload,
int32 srcPid);
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index a5ac55b8f7e..443bffb58fd 100644
--- a/src/include/storage/predicate.h
+++ b/src/include/storage/predicate.h
@@ -41,11 +41,6 @@ typedef void *SerializableXactHandle;
/*
* function prototypes
*/
-
-/* housekeeping for shared memory predicate lock structures */
-extern void PredicateLockShmemInit(void);
-extern Size PredicateLockShmemSize(void);
-
extern void CheckPointPredicate(void);
/* predicate lock reporting */
diff --git a/src/include/storage/shmem_internal.h b/src/include/storage/shmem_internal.h
index fe12bf33439..7b259d33ccf 100644
--- a/src/include/storage/shmem_internal.h
+++ b/src/include/storage/shmem_internal.h
@@ -21,6 +21,7 @@ typedef enum
{
SHMEM_KIND_STRUCT = 0, /* plain, contiguous area of memory */
SHMEM_KIND_HASH, /* a hash table */
+ SHMEM_KIND_SLRU, /* SLRU buffers and control structures */
} ShmemRequestKind;
/* shmem.c */
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index d62c29f1361..c199f18a27a 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -32,6 +32,13 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+
+/* predicate lock manager */
+PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
@@ -43,3 +50,6 @@ PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
/* interprocess signaling mechanisms */
PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+
+/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
diff --git a/src/test/modules/test_slru/test_slru.c b/src/test/modules/test_slru/test_slru.c
index e4bd2af0bf5..40efffdbf62 100644
--- a/src/test/modules/test_slru/test_slru.c
+++ b/src/test/modules/test_slru/test_slru.c
@@ -40,14 +40,22 @@ PG_FUNCTION_INFO_V1(test_slru_delete_all);
/* Number of SLRU page slots */
#define NUM_TEST_BUFFERS 16
-static SlruCtlData TestSlruCtlData;
-#define TestSlruCtl (&TestSlruCtlData)
+static void test_slru_shmem_request(void *arg);
+static bool test_slru_page_precedes_logically(int64 page1, int64 page2);
+static int test_slru_errdetail_for_io_error(const void *opaque_data);
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static const char *TestSlruDir = "pg_test_slru";
+
+static SlruDesc TestSlruDesc;
+
+static const ShmemCallbacks test_slru_shmem_callbacks = {
+ .request_fn = test_slru_shmem_request
+};
+
+#define TestSlruCtl (&TestSlruDesc)
static bool
-test_slru_scan_cb(SlruCtl ctl, char *filename, int64 segpage, void *data)
+test_slru_scan_cb(SlruDesc *ctl, char *filename, int64 segpage, void *data)
{
elog(NOTICE, "Calling test_slru_scan_cb()");
return SlruScanDirCbDeleteAll(ctl, filename, segpage, data);
@@ -190,20 +198,6 @@ test_slru_delete_all(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
-/*
- * Module load callbacks and initialization.
- */
-
-static void
-test_slru_shmem_request(void)
-{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- /* reserve shared memory for the test SLRU */
- RequestAddinShmemSpace(SimpleLruShmemSize(NUM_TEST_BUFFERS, 0));
-}
-
static bool
test_slru_page_precedes_logically(int64 page1, int64 page2)
{
@@ -218,60 +212,46 @@ test_slru_errdetail_for_io_error(const void *opaque_data)
return errdetail("Could not access test_slru entry %u.", xid);
}
-static void
-test_slru_shmem_startup(void)
+void
+_PG_init(void)
{
- /*
- * Short segments names are well tested elsewhere so in this test we are
- * focusing on long names.
- */
- const bool long_segment_names = true;
- const char slru_dir_name[] = "pg_test_slru";
- int test_tranche_id = -1;
- int test_buffer_tranche_id = -1;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
+ if (!process_shared_preload_libraries_in_progress)
+ ereport(ERROR,
+ (errmsg("cannot load \"%s\" after startup", "test_slru"),
+ errdetail("\"%s\" must be loaded with \"shared_preload_libraries\".",
+ "test_slru")));
/*
* Create the SLRU directory if it does not exist yet, from the root of
* the data directory.
*/
- (void) MakePGDirectory(slru_dir_name);
+ (void) MakePGDirectory(TestSlruDir);
- /*
- * Initialize the SLRU facility. In EXEC_BACKEND builds, the
- * shmem_startup_hook is called in the postmaster and in each backend, but
- * we only need to generate the LWLock tranches once. Note that these
- * tranche ID variables are not used by SimpleLruInit() when
- * IsUnderPostmaster is true.
- */
- if (!IsUnderPostmaster)
- {
- test_tranche_id = LWLockNewTrancheId("test_slru_tranche");
- test_buffer_tranche_id = LWLockNewTrancheId("test_buffer_tranche");
- }
-
- TestSlruCtl->PagePrecedes = test_slru_page_precedes_logically;
- TestSlruCtl->errdetail_for_io_error = test_slru_errdetail_for_io_error;
- SimpleLruInit(TestSlruCtl, "TestSLRU",
- NUM_TEST_BUFFERS, 0, slru_dir_name,
- test_buffer_tranche_id, test_tranche_id, SYNC_HANDLER_NONE,
- long_segment_names);
+ RegisterShmemCallbacks(&test_slru_shmem_callbacks);
}
-void
-_PG_init(void)
+static void
+test_slru_shmem_request(void *arg)
{
- if (!process_shared_preload_libraries_in_progress)
- ereport(ERROR,
- (errmsg("cannot load \"%s\" after startup", "test_slru"),
- errdetail("\"%s\" must be loaded with \"shared_preload_libraries\".",
- "test_slru")));
+ SimpleLruRequest(.desc = &TestSlruDesc,
+ .name = "TestSLRU",
+ .Dir = TestSlruDir,
+
+ /*
+ * Short segment names are well tested elsewhere so in this test we are
+ * focusing on long names.
+ */
+ .long_segment_names = true,
+
+ .nslots = NUM_TEST_BUFFERS,
+ .nlsns = 0,
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_slru_shmem_request;
+ .sync_handler = SYNC_HANDLER_NONE,
+ .PagePrecedes = test_slru_page_precedes_logically,
+ .errdetail_for_io_error = test_slru_errdetail_for_io_error,
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_slru_shmem_startup;
+ /* let slru.c assign these */
+ .buffer_tranche_id = 0,
+ .bank_tranche_id = 0,
+ );
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 349cf4e9f12..f52a0fa1c72 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2903,9 +2903,9 @@ SlotInvalidationCauseMap
SlotNumber
SlotSyncCtxStruct
SlotSyncSkipReason
-SlruCtl
-SlruCtlData
+SlruDesc
SlruErrorCause
+SlruOpts
SlruPageStatus
SlruScanCallback
SlruSegState
--
2.47.3
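A note on the SimpleLruRequest(...) macro in the patch above: it wraps its arguments in a compound literal and passes the address to SimpleLruRequestWithOpts(), so callers name only the fields they care about and everything else is zero. Below is a minimal standalone sketch of that pattern. DemoOpts, demo_request() and the other demo_* names are hypothetical stand-ins, not part of the patch, and the copy-on-request behaviour is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for SlruOpts; the fields are illustrative only. */
typedef struct DemoOpts
{
	const char *name;			/* required */
	const char *dir;			/* required */
	int			nslots;			/* required, must be > 0 */
	int			nlsns;			/* optional, defaults to 0 */
	int			tranche_id;		/* optional, 0 means "assign one for me" */
} DemoOpts;

/* Copy of the most recent request, kept so callers can inspect it. */
static DemoOpts demo_last_request;

static void
demo_request_with_opts(const DemoOpts *options)
{
	assert(options->name != NULL);
	assert(options->nslots > 0);

	/*
	 * The compound literal built by the macro has automatic storage in the
	 * caller's block, so copy the struct rather than keeping the pointer.
	 */
	demo_last_request = *options;
}

/* Wrap the arguments in a compound literal; unnamed fields become zero. */
#define demo_request(...) \
	demo_request_with_opts(&(DemoOpts){__VA_ARGS__})

/* Example caller: only the fields of interest are spelled out. */
static const DemoOpts *
demo_register(void)
{
	demo_request(.name = "demo",
				 .dir = "pg_demo",
				 .nslots = 16);
	return &demo_last_request;
}
```

Adding a field to the options struct later does not require touching existing callers, which seems to be the main attraction over a long positional argument list such as the old SimpleLruInit() signature.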
[text/x-patch] v12-0010-Convert-AIO-to-use-the-shmem-allocation-function.patch (14.8K, 11-v12-0010-Convert-AIO-to-use-the-shmem-allocation-function.patch)
From aa1ec87b4562a052fe96c9854dd4f3089f1e1ec0 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sun, 5 Apr 2026 19:55:41 +0300
Subject: [PATCH v12 10/13] Convert AIO to use the shmem allocation functions
This replaces the "shmem_size" and "shmem_init" callbacks in the IO
methods table with the same ShmemCallbacks struct that we now use in
other subsystems.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/backend/storage/aio/aio_init.c | 112 +++++++++++++---------
src/backend/storage/aio/method_io_uring.c | 39 ++++----
src/backend/storage/aio/method_worker.c | 84 +++++++++-------
src/backend/storage/ipc/ipci.c | 2 -
src/include/storage/aio_internal.h | 16 +---
src/include/storage/aio_subsys.h | 4 -
src/include/storage/subsystemlist.h | 3 +
7 files changed, 143 insertions(+), 117 deletions(-)
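For readers skimming the diff below, the shape of the change is: the IO method table embeds a callbacks struct, and the subsystem-level request/init/attach functions forward to whichever callbacks the active method has set. A minimal standalone sketch of that forwarding follows. All demo_* and Demo* names are hypothetical stand-ins for ShmemCallbacks, IoMethodOps and the Aio* functions, not the actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ShmemCallbacks; every member is optional. */
typedef struct DemoCallbacks
{
	void		(*request_fn) (void *arg);
	void		(*init_fn) (void *arg);
	void		(*attach_fn) (void *arg);
	void	   *opaque_arg;
} DemoCallbacks;

/* Hypothetical stand-in for IoMethodOps. */
typedef struct DemoMethodOps
{
	DemoCallbacks shmem_callbacks;
} DemoMethodOps;

static int	demo_request_calls;
static int	demo_init_calls;

static void
demo_method_request(void *arg)
{
	(void) arg;
	demo_request_calls++;
}

static void
demo_method_init(void *arg)
{
	(void) arg;
	demo_init_calls++;
}

/* A method that needs no attach step simply leaves attach_fn NULL. */
static const DemoMethodOps demo_method = {
	.shmem_callbacks.request_fn = demo_method_request,
	.shmem_callbacks.init_fn = demo_method_init,
};

static const DemoMethodOps *demo_active_method = &demo_method;

/* Subsystem-level callbacks forward to the active method when it has one. */
static void
demo_subsys_request(void *arg)
{
	const DemoCallbacks *cb;

	(void) arg;
	assert(demo_active_method != NULL);
	cb = &demo_active_method->shmem_callbacks;
	if (cb->request_fn)
		cb->request_fn(cb->opaque_arg);
}

static void
demo_subsys_init(void *arg)
{
	const DemoCallbacks *cb = &demo_active_method->shmem_callbacks;

	(void) arg;
	if (cb->init_fn)
		cb->init_fn(cb->opaque_arg);
}

static void
demo_subsys_attach(void *arg)
{
	const DemoCallbacks *cb = &demo_active_method->shmem_callbacks;

	(void) arg;
	if (cb->attach_fn)
		cb->attach_fn(cb->opaque_arg);
}
```

Because unset members of the embedded struct are NULL, a method only fills in the phases it needs, and the subsystem code checks each pointer before calling it.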
diff --git a/src/backend/storage/aio/aio_init.c b/src/backend/storage/aio/aio_init.c
index d3c68d8b04c..9b695abf013 100644
--- a/src/backend/storage/aio/aio_init.c
+++ b/src/backend/storage/aio/aio_init.c
@@ -23,16 +23,24 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
+static void AioShmemRequest(void *arg);
+static void AioShmemInit(void *arg);
+static void AioShmemAttach(void *arg);
-static Size
-AioCtlShmemSize(void)
-{
- /* pgaio_ctl itself */
- return sizeof(PgAioCtl);
-}
+const ShmemCallbacks AioShmemCallbacks = {
+ .request_fn = AioShmemRequest,
+ .init_fn = AioShmemInit,
+ .attach_fn = AioShmemAttach,
+};
+
+static PgAioBackend *AioBackendShmemPtr;
+static PgAioHandle *AioHandleShmemPtr;
+static struct iovec *AioHandleIOVShmemPtr;
+static uint64 *AioHandleDataShmemPtr;
static uint32
AioProcs(void)
@@ -109,12 +117,15 @@ AioChooseMaxConcurrency(void)
return Min(max_proportional_pins, 64);
}
-Size
-AioShmemSize(void)
+/*
+ * Register shared memory area for AIO subsystem.
+ */
+static void
+AioShmemRequest(void *arg)
{
- Size sz = 0;
-
/*
+ * Resolve io_max_concurrency if not already done
+ *
* We prefer to report this value's source as PGC_S_DYNAMIC_DEFAULT.
* However, if the DBA explicitly set io_max_concurrency = -1 in the
* config file, then PGC_S_DYNAMIC_DEFAULT will fail to override that and
@@ -132,48 +143,52 @@ AioShmemSize(void)
PGC_S_OVERRIDE);
}
- sz = add_size(sz, AioCtlShmemSize());
- sz = add_size(sz, AioBackendShmemSize());
- sz = add_size(sz, AioHandleShmemSize());
- sz = add_size(sz, AioHandleIOVShmemSize());
- sz = add_size(sz, AioHandleDataShmemSize());
-
- /* Reserve space for method specific resources. */
- if (pgaio_method_ops->shmem_size)
- sz = add_size(sz, pgaio_method_ops->shmem_size());
-
- return sz;
+ ShmemRequestStruct(.name = "AioCtl",
+ .size = sizeof(PgAioCtl),
+ .ptr = (void **) &pgaio_ctl,
+ );
+
+ ShmemRequestStruct(.name = "AioBackend",
+ .size = AioBackendShmemSize(),
+ .ptr = (void **) &AioBackendShmemPtr,
+ );
+
+ ShmemRequestStruct(.name = "AioHandle",
+ .size = AioHandleShmemSize(),
+ .ptr = (void **) &AioHandleShmemPtr,
+ );
+
+ ShmemRequestStruct(.name = "AioHandleIOV",
+ .size = AioHandleIOVShmemSize(),
+ .ptr = (void **) &AioHandleIOVShmemPtr,
+ );
+
+ ShmemRequestStruct(.name = "AioHandleData",
+ .size = AioHandleDataShmemSize(),
+ .ptr = (void **) &AioHandleDataShmemPtr,
+ );
+
+ if (pgaio_method_ops->shmem_callbacks.request_fn)
+ pgaio_method_ops->shmem_callbacks.request_fn(pgaio_method_ops->shmem_callbacks.opaque_arg);
}
-void
-AioShmemInit(void)
+/*
+ * Initialize AIO shared memory during postmaster startup.
+ */
+static void
+AioShmemInit(void *arg)
{
- bool found;
uint32 io_handle_off = 0;
uint32 iovec_off = 0;
uint32 per_backend_iovecs = io_max_concurrency * io_max_combine_limit;
- pgaio_ctl = (PgAioCtl *)
- ShmemInitStruct("AioCtl", AioCtlShmemSize(), &found);
-
- if (found)
- goto out;
-
- memset(pgaio_ctl, 0, AioCtlShmemSize());
-
pgaio_ctl->io_handle_count = AioProcs() * io_max_concurrency;
pgaio_ctl->iovec_count = AioProcs() * per_backend_iovecs;
- pgaio_ctl->backend_state = (PgAioBackend *)
- ShmemInitStruct("AioBackend", AioBackendShmemSize(), &found);
-
- pgaio_ctl->io_handles = (PgAioHandle *)
- ShmemInitStruct("AioHandle", AioHandleShmemSize(), &found);
-
- pgaio_ctl->iovecs = (struct iovec *)
- ShmemInitStruct("AioHandleIOV", AioHandleIOVShmemSize(), &found);
- pgaio_ctl->handle_data = (uint64 *)
- ShmemInitStruct("AioHandleData", AioHandleDataShmemSize(), &found);
+ pgaio_ctl->backend_state = AioBackendShmemPtr;
+ pgaio_ctl->io_handles = AioHandleShmemPtr;
+ pgaio_ctl->iovecs = AioHandleIOVShmemPtr;
+ pgaio_ctl->handle_data = AioHandleDataShmemPtr;
for (int procno = 0; procno < AioProcs(); procno++)
{
@@ -208,10 +223,15 @@ AioShmemInit(void)
}
}
-out:
- /* Initialize IO method specific resources. */
- if (pgaio_method_ops->shmem_init)
- pgaio_method_ops->shmem_init(!found);
+ if (pgaio_method_ops->shmem_callbacks.init_fn)
+ pgaio_method_ops->shmem_callbacks.init_fn(pgaio_method_ops->shmem_callbacks.opaque_arg);
+}
+
+static void
+AioShmemAttach(void *arg)
+{
+ if (pgaio_method_ops->shmem_callbacks.attach_fn)
+ pgaio_method_ops->shmem_callbacks.attach_fn(pgaio_method_ops->shmem_callbacks.opaque_arg);
}
void
diff --git a/src/backend/storage/aio/method_io_uring.c b/src/backend/storage/aio/method_io_uring.c
index 9f76d2683c0..3295c59ed75 100644
--- a/src/backend/storage/aio/method_io_uring.c
+++ b/src/backend/storage/aio/method_io_uring.c
@@ -49,8 +49,8 @@
/* Entry points for IoMethodOps. */
-static size_t pgaio_uring_shmem_size(void);
-static void pgaio_uring_shmem_init(bool first_time);
+static void pgaio_uring_shmem_request(void *arg);
+static void pgaio_uring_shmem_init(void *arg);
static void pgaio_uring_init_backend(void);
static int pgaio_uring_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
static void pgaio_uring_wait_one(PgAioHandle *ioh, uint64 ref_generation);
@@ -59,7 +59,6 @@ static void pgaio_uring_check_one(PgAioHandle *ioh, uint64 ref_generation);
/* helper functions */
static void pgaio_uring_sq_from_io(PgAioHandle *ioh, struct io_uring_sqe *sqe);
-
const IoMethodOps pgaio_uring_ops = {
/*
* While io_uring mostly is OK with FDs getting closed while the IO is in
@@ -70,8 +69,8 @@ const IoMethodOps pgaio_uring_ops = {
*/
.wait_on_fd_before_close = true,
- .shmem_size = pgaio_uring_shmem_size,
- .shmem_init = pgaio_uring_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_uring_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_uring_shmem_init,
.init_backend = pgaio_uring_init_backend,
.submit = pgaio_uring_submit,
@@ -267,23 +266,31 @@ pgaio_uring_shmem_size(void)
{
size_t sz;
+ sz = pgaio_uring_context_shmem_size();
+ sz = add_size(sz, pgaio_uring_ring_shmem_size());
+
+ return sz;
+}
+
+static void
+pgaio_uring_shmem_request(void *arg)
+{
/*
* Kernel and liburing support for various features influences how much
* shmem we need, perform the necessary checks.
*/
pgaio_uring_check_capabilities();
- sz = pgaio_uring_context_shmem_size();
- sz = add_size(sz, pgaio_uring_ring_shmem_size());
-
- return sz;
+ ShmemRequestStruct(.name = "AioUringContext",
+ .size = pgaio_uring_shmem_size(),
+ .ptr = (void **) &pgaio_uring_contexts,
+ );
}
static void
-pgaio_uring_shmem_init(bool first_time)
+pgaio_uring_shmem_init(void *arg)
{
int TotalProcs = pgaio_uring_procs();
- bool found;
char *shmem;
size_t ring_mem_remain = 0;
char *ring_mem_next = 0;
@@ -291,13 +298,11 @@ pgaio_uring_shmem_init(bool first_time)
/*
* We allocate memory for all PgAioUringContext instances and, if
* supported, the memory required for each of the io_uring instances, in
- * one ShmemInitStruct().
+ * one combined allocation.
+ *
+ * pgaio_uring_contexts is already set to the base of the allocation.
*/
- shmem = ShmemInitStruct("AioUringContext", pgaio_uring_shmem_size(), &found);
- if (found)
- return;
-
- pgaio_uring_contexts = (PgAioUringContext *) shmem;
+ shmem = (char *) pgaio_uring_contexts;
shmem += pgaio_uring_context_shmem_size();
/* if supported, handle memory alignment / sizing for io_uring memory */
diff --git a/src/backend/storage/aio/method_worker.c b/src/backend/storage/aio/method_worker.c
index e24357a7a0a..eb636bf5ad9 100644
--- a/src/backend/storage/aio/method_worker.c
+++ b/src/backend/storage/aio/method_worker.c
@@ -41,6 +41,7 @@
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
#include "tcop/tcopprot.h"
#include "utils/injection_point.h"
#include "utils/memdebug.h"
@@ -73,16 +74,20 @@ typedef struct PgAioWorkerControl
} PgAioWorkerControl;
-static size_t pgaio_worker_shmem_size(void);
-static void pgaio_worker_shmem_init(bool first_time);
+static void pgaio_worker_shmem_request(void *arg);
+static void pgaio_worker_shmem_init(void *arg);
+static void pgaio_worker_shmem_attach(void *arg);
+
+static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static bool pgaio_worker_needs_synchronous_execution(PgAioHandle *ioh);
static int pgaio_worker_submit(uint16 num_staged_ios, PgAioHandle **staged_ios);
const IoMethodOps pgaio_worker_ops = {
- .shmem_size = pgaio_worker_shmem_size,
- .shmem_init = pgaio_worker_shmem_init,
+ .shmem_callbacks.request_fn = pgaio_worker_shmem_request,
+ .shmem_callbacks.init_fn = pgaio_worker_shmem_init,
+ .shmem_callbacks.attach_fn = pgaio_worker_shmem_attach,
.needs_synchronous_execution = pgaio_worker_needs_synchronous_execution,
.submit = pgaio_worker_submit,
@@ -95,7 +100,6 @@ int io_workers = 3;
static int io_worker_queue_size = 64;
static int MyIoWorkerId;
-static PgAioWorkerSubmissionQueue *io_worker_submission_queue;
static PgAioWorkerControl *io_worker_control;
@@ -116,50 +120,60 @@ pgaio_worker_control_shmem_size(void)
sizeof(PgAioWorkerSlot) * MAX_IO_WORKERS;
}
-static size_t
-pgaio_worker_shmem_size(void)
+/*
+ * Set secondary AIO worker pointer from the combined allocation.
+ */
+static void
+pgaio_worker_set_secondary_ptr(void)
{
- size_t sz;
int queue_size;
+ Size queue_sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = pgaio_worker_queue_shmem_size(&queue_size);
- sz = add_size(sz, pgaio_worker_control_shmem_size());
-
- return sz;
+ io_worker_control = (PgAioWorkerControl *)
+ ((char *) io_worker_submission_queue + MAXALIGN(queue_sz));
}
static void
-pgaio_worker_shmem_init(bool first_time)
+pgaio_worker_shmem_init(void *arg)
{
- bool found;
int queue_size;
- io_worker_submission_queue =
- ShmemInitStruct("AioWorkerSubmissionQueue",
- pgaio_worker_queue_shmem_size(&queue_size),
- &found);
- if (!found)
- {
- io_worker_submission_queue->size = queue_size;
- io_worker_submission_queue->head = 0;
- io_worker_submission_queue->tail = 0;
- }
+ pgaio_worker_queue_shmem_size(&queue_size);
+ io_worker_submission_queue->size = queue_size;
+ io_worker_submission_queue->head = 0;
+ io_worker_submission_queue->tail = 0;
- io_worker_control =
- ShmemInitStruct("AioWorkerControl",
- pgaio_worker_control_shmem_size(),
- &found);
- if (!found)
+ pgaio_worker_set_secondary_ptr();
+
+ io_worker_control->idle_worker_mask = 0;
+ for (int i = 0; i < MAX_IO_WORKERS; ++i)
{
- io_worker_control->idle_worker_mask = 0;
- for (int i = 0; i < MAX_IO_WORKERS; ++i)
- {
- io_worker_control->workers[i].latch = NULL;
- io_worker_control->workers[i].in_use = false;
- }
+ io_worker_control->workers[i].latch = NULL;
+ io_worker_control->workers[i].in_use = false;
}
}
+static void
+pgaio_worker_shmem_attach(void *arg)
+{
+ pgaio_worker_set_secondary_ptr();
+}
+
+static void
+pgaio_worker_shmem_request(void *arg)
+{
+ size_t size;
+ int queue_size;
+
+ size = MAXALIGN(pgaio_worker_queue_shmem_size(&queue_size)) +
+ pgaio_worker_control_shmem_size();
+
+ ShmemRequestStruct(.name = "AioWorkerSubmissionQueue",
+ .size = size,
+ .ptr = (void **) &io_worker_submission_queue,
+ );
+}
+
static int
pgaio_worker_choose_idle(void)
{
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 7a8c69de802..a510c928daa 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -122,7 +122,6 @@ CalculateShmemSize(void)
size = add_size(size, WaitEventCustomShmemSize());
size = add_size(size, InjectionPointShmemSize());
size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, AioShmemSize());
size = add_size(size, WaitLSNShmemSize());
size = add_size(size, LogicalDecodingCtlShmemSize());
size = add_size(size, DataChecksumsShmemSize());
@@ -301,7 +300,6 @@ CreateOrAttachShmemStructs(void)
StatsShmemInit();
WaitEventCustomShmemInit();
InjectionPointShmemInit();
- AioShmemInit();
WaitLSNShmemInit();
LogicalDecodingCtlShmemInit();
}
diff --git a/src/include/storage/aio_internal.h b/src/include/storage/aio_internal.h
index 33e1e2dc048..9ca4087aa7f 100644
--- a/src/include/storage/aio_internal.h
+++ b/src/include/storage/aio_internal.h
@@ -20,6 +20,8 @@
#include "port/pg_iovec.h"
#include "storage/aio.h"
#include "storage/condition_variable.h"
+#include "storage/ipc.h"
+#include "storage/shmem.h"
/*
@@ -267,20 +269,8 @@ typedef struct IoMethodOps
*/
bool wait_on_fd_before_close;
-
/* global initialization */
-
- /*
- * Amount of additional shared memory to reserve for the io_method. Called
- * just like a normal ipci.c style *Size() function. Optional.
- */
- size_t (*shmem_size) (void);
-
- /*
- * Initialize shared memory. First time is true if AIO's shared memory was
- * just initialized, false otherwise. Optional.
- */
- void (*shmem_init) (bool first_time);
+ ShmemCallbacks shmem_callbacks;
/*
* Per-backend initialization. Optional.
diff --git a/src/include/storage/aio_subsys.h b/src/include/storage/aio_subsys.h
index 276cb3e31c4..dd54869351f 100644
--- a/src/include/storage/aio_subsys.h
+++ b/src/include/storage/aio_subsys.h
@@ -20,12 +20,8 @@
/* aio_init.c */
-extern Size AioShmemSize(void);
-extern void AioShmemInit(void);
-
extern void pgaio_init_backend(void);
-
/* aio.c */
extern void pgaio_error_cleanup(void);
extern void AtEOXact_Aio(bool is_commit);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index c199f18a27a..b438794d46d 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -53,3 +53,6 @@ PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
/* other modules that need some shared memory space */
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
+
+/* AIO subsystem. This delegates to the method-specific callbacks */
+PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
--
2.47.3
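The AIO worker conversion above requests one area sized as the MAXALIGN'd queue plus the control struct, then derives the second pointer from the first in both the init and attach callbacks. A minimal standalone sketch of that layout follows. All `Sketch*` and `sketch_*` names are hypothetical, and `SKETCH_MAXALIGN` assumes 8-byte maximum alignment; this is not the PostgreSQL code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for MAXALIGN: round up to a multiple of 8 bytes. */
#define SKETCH_MAXALIGN(len) (((size_t) (len) + 7) & ~(size_t) 7)

/* Hypothetical first structure, ending in a variable-length array. */
typedef struct SketchQueue
{
    uint32_t    size;
    uint32_t    head;
    uint32_t    tail;
    int         ios[];          /* flexible array member */
} SketchQueue;

/* Hypothetical second structure, stored right after the padded queue. */
typedef struct SketchControl
{
    uint64_t    idle_mask;
} SketchControl;

/* Bytes needed for a queue holding nelems entries (no trailing padding). */
static size_t
sketch_queue_size(uint32_t nelems)
{
    return offsetof(SketchQueue, ios) + sizeof(int) * nelems;
}

/* Total bytes to request: padded queue followed by the control struct. */
static size_t
sketch_combined_size(uint32_t nelems)
{
    return SKETCH_MAXALIGN(sketch_queue_size(nelems)) + sizeof(SketchControl);
}

/*
 * Derive the secondary pointer from the primary one.  Every process that
 * maps the area must compute it the same way, which is why the sketch uses
 * the same padded size here as in sketch_combined_size().
 */
static SketchControl *
sketch_secondary_ptr(SketchQueue *queue, uint32_t nelems)
{
    assert(queue != NULL);
    return (SketchControl *)
        ((char *) queue + SKETCH_MAXALIGN(sketch_queue_size(nelems)));
}
```

The point of padding the first part is that the second struct starts on a suitably aligned address regardless of the queue length.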
[text/x-patch] v12-0011-Add-alignment-option-to-ShmemRequestStruct.patch (5.3K, 12-v12-0011-Add-alignment-option-to-ShmemRequestStruct.patch)
From 217ffee38167229568a474051549123713ba3a9e Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sun, 5 Apr 2026 19:56:00 +0300
Subject: [PATCH v12 11/13] Add alignment option to ShmemRequestStruct()
The buffer blocks (in the next commit) are IO-aligned. This might come
in handy in other places too, so make it an explicit feature of
ShmemRequestStruct().
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/backend/storage/ipc/shmem.c | 33 +++++++++++++++++++++++----------
src/include/storage/shmem.h | 6 ++++++
2 files changed, 29 insertions(+), 10 deletions(-)
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 67fad32be82..0f103f17f29 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -136,6 +136,7 @@
#include "fmgr.h"
#include "funcapi.h"
#include "miscadmin.h"
+#include "port/pg_bitutils.h"
#include "port/pg_numa.h"
#include "storage/lwlock.h"
#include "storage/pg_shmem.h"
@@ -236,7 +237,7 @@ typedef struct ShmemAllocatorData
#define ShmemIndexLock (&ShmemAllocator->index_lock)
-static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void *ShmemAllocRaw(Size size, Size alignment, Size *allocated_size);
/* shared memory global variables */
@@ -340,6 +341,7 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
{
ShmemRequest *request;
+ /* Check the options */
if (options->name == NULL)
elog(ERROR, "shared memory request is missing 'name' option");
@@ -358,6 +360,11 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
options->size, options->name);
}
+ if (options->alignment != 0 && pg_nextpower2_size_t(options->alignment) != options->alignment)
+ elog(ERROR, "invalid alignment %zu for shared memory request for \"%s\"",
+ options->alignment, options->name);
+
+ /* Check that we're in the right state */
if (shmem_request_state != SRS_REQUESTING)
elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
@@ -399,7 +406,8 @@ ShmemGetRequestedSize(void)
{
size = add_size(size, request->options->size);
/* calculate alignment padding like ShmemAllocRaw() does */
- size = CACHELINEALIGN(size);
+ size = TYPEALIGN(Max(request->options->alignment, PG_CACHE_LINE_SIZE),
+ size);
}
return size;
@@ -524,7 +532,9 @@ InitShmemIndexEntry(ShmemRequest *request)
* We inserted the entry to the shared memory index. Allocate requested
* amount of shared memory for it, and initialize the index entry.
*/
- structPtr = ShmemAllocRaw(request->options->size, &allocated_size);
+ structPtr = ShmemAllocRaw(request->options->size,
+ request->options->alignment,
+ &allocated_size);
if (structPtr == NULL)
{
/* out of memory; remove the failed ShmemIndex entry */
@@ -753,7 +763,7 @@ ShmemAlloc(Size size)
void *newSpace;
Size allocated_size;
- newSpace = ShmemAllocRaw(size, &allocated_size);
+ newSpace = ShmemAllocRaw(size, 0, &allocated_size);
if (!newSpace)
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
@@ -772,7 +782,7 @@ ShmemAllocNoError(Size size)
{
Size allocated_size;
- return ShmemAllocRaw(size, &allocated_size);
+ return ShmemAllocRaw(size, 0, &allocated_size);
}
/*
@@ -782,8 +792,9 @@ ShmemAllocNoError(Size size)
* be equal to the number requested plus any padding we choose to add.
*/
static void *
-ShmemAllocRaw(Size size, Size *allocated_size)
+ShmemAllocRaw(Size size, Size alignment, Size *allocated_size)
{
+ Size rawStart;
Size newStart;
Size newFree;
void *newSpace;
@@ -799,14 +810,15 @@ ShmemAllocRaw(Size size, Size *allocated_size)
* structures out to a power-of-two size - but without this, even that
* won't be sufficient.
*/
- size = CACHELINEALIGN(size);
- *allocated_size = size;
+ if (alignment < PG_CACHE_LINE_SIZE)
+ alignment = PG_CACHE_LINE_SIZE;
Assert(ShmemSegHdr != NULL);
SpinLockAcquire(&ShmemAllocator->shmem_lock);
- newStart = ShmemAllocator->free_offset;
+ rawStart = ShmemAllocator->free_offset;
+ newStart = TYPEALIGN(alignment, rawStart);
newFree = newStart + size;
if (newFree <= ShmemSegHdr->totalsize)
@@ -820,8 +832,9 @@ ShmemAllocRaw(Size size, Size *allocated_size)
SpinLockRelease(&ShmemAllocator->shmem_lock);
/* note this assert is okay with newSpace == NULL */
- Assert(newSpace == (void *) CACHELINEALIGN(newSpace));
+ Assert(newSpace == (void *) TYPEALIGN(alignment, newSpace));
+ *allocated_size = newFree - rawStart;
return newSpace;
}
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 8e0fc29dcac..060a4a8e5d2 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -51,6 +51,12 @@ typedef struct ShmemStructOpts
*/
ssize_t size;
+ /*
+ * Alignment of the starting address. If not set, defaults to cacheline
+ * boundary. Must be a power of two.
+ */
+ size_t alignment;
+
/*
* When the shmem area is initialized or attached to, pointer to it is
* stored in *ptr. It usually points to a global variable, used to access
--
2.47.3
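The alignment patch above changes the allocator so that the start offset, rather than the size, is rounded up, and the reported allocated size includes the leading padding. The following is a rough standalone model of that arithmetic. The `Sketch*` and `sketch_*` names are hypothetical, the cache line size of 64 is an assumption, and locking and overflow checks are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_CACHE_LINE 64    /* assumed stand-in for PG_CACHE_LINE_SIZE */

/* Accept 0 ("not set") or any power of two: at most one bit is set. */
static bool
sketch_valid_alignment(size_t alignment)
{
    return (alignment & (alignment - 1)) == 0;
}

/* Round offset up to the next multiple of a power-of-two alignment. */
static size_t
sketch_align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

typedef struct SketchArena
{
    size_t      free_offset;    /* first unused byte */
    size_t      total_size;     /* size of the whole segment */
} SketchArena;

/*
 * Reserve 'size' bytes at an aligned offset.  On success, *start is the
 * aligned offset and *allocated_size counts the padding skipped before it.
 * On failure the arena is left unchanged.
 */
static bool
sketch_alloc(SketchArena *arena, size_t size, size_t alignment,
             size_t *start, size_t *allocated_size)
{
    size_t      raw_start = arena->free_offset;
    size_t      new_start;
    size_t      new_free;

    assert(sketch_valid_alignment(alignment));

    /* Never align more loosely than a cache line. */
    if (alignment < SKETCH_CACHE_LINE)
        alignment = SKETCH_CACHE_LINE;

    new_start = sketch_align_up(raw_start, alignment);
    new_free = new_start + size;
    if (new_free > arena->total_size)
        return false;

    arena->free_offset = new_free;
    *start = new_start;
    *allocated_size = new_free - raw_start;
    return true;
}
```

Because padding now precedes the allocation, a size estimate has to budget for the worst-case gap in front of each request, not only for the request itself.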
[text/x-patch] v12-0012-Convert-buffer-manager-to-the-new-shmem-allocati.patch (16.2K, 13-v12-0012-Convert-buffer-manager-to-the-new-shmem-allocati.patch)
From 759b45df87b981498045ed524bcdd3d0c06b240d Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sun, 5 Apr 2026 19:56:33 +0300
Subject: [PATCH v12 12/13] Convert buffer manager to the new shmem allocation
functions
This restructures the initialization functions a little, making the
"buffer strategy" stuff in freelist.c and the buffer mapping hash table
in buf_table.c top-level "subsystems" of their own, registered directly
in subsystemlist.h. Previously they were called indirectly from
BufferManagerShmemInit() and BufferManagerShmemSize().
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/backend/storage/buffer/buf_init.c | 149 ++++++++++---------------
src/backend/storage/buffer/buf_table.c | 54 +++++----
src/backend/storage/buffer/freelist.c | 93 +++++----------
src/backend/storage/ipc/ipci.c | 3 -
src/include/storage/buf_internals.h | 5 -
src/include/storage/bufmgr.h | 4 -
src/include/storage/subsystemlist.h | 3 +
7 files changed, 124 insertions(+), 187 deletions(-)
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index c0c223b2e32..1407c930c56 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -18,6 +18,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proclist.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
BufferDescPadded *BufferDescriptors;
char *BufferBlocks;
@@ -25,6 +27,15 @@ ConditionVariableMinimallyPadded *BufferIOCVArray;
WritebackContext BackendWritebackContext;
CkptSortItem *CkptBufferIds;
+static void BufferManagerShmemRequest(void *arg);
+static void BufferManagerShmemInit(void *arg);
+static void BufferManagerShmemAttach(void *arg);
+
+const ShmemCallbacks BufferManagerShmemCallbacks = {
+ .request_fn = BufferManagerShmemRequest,
+ .init_fn = BufferManagerShmemInit,
+ .attach_fn = BufferManagerShmemAttach,
+};
/*
* Data Structures:
@@ -60,37 +71,31 @@ CkptSortItem *CkptBufferIds;
/*
- * Initialize shared buffer pool
- *
- * This is called once during shared-memory initialization (either in the
- * postmaster, or in a standalone backend).
+ * Register shared memory area for the buffer pool.
*/
-void
-BufferManagerShmemInit(void)
+static void
+BufferManagerShmemRequest(void *arg)
{
- bool foundBufs,
- foundDescs,
- foundIOCV,
- foundBufCkpt;
-
+ ShmemRequestStruct(.name = "Buffer Descriptors",
+ .size = NBuffers * sizeof(BufferDescPadded),
/* Align descriptors to a cacheline boundary. */
- BufferDescriptors = (BufferDescPadded *)
- ShmemInitStruct("Buffer Descriptors",
- NBuffers * sizeof(BufferDescPadded),
- &foundDescs);
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferDescriptors,
+ );
+ ShmemRequestStruct(.name = "Buffer Blocks",
+ .size = NBuffers * (Size) BLCKSZ,
/* Align buffer pool on IO page size boundary. */
- BufferBlocks = (char *)
- TYPEALIGN(PG_IO_ALIGN_SIZE,
- ShmemInitStruct("Buffer Blocks",
- NBuffers * (Size) BLCKSZ + PG_IO_ALIGN_SIZE,
- &foundBufs));
-
- /* Align condition variables to cacheline boundary. */
- BufferIOCVArray = (ConditionVariableMinimallyPadded *)
- ShmemInitStruct("Buffer IO Condition Variables",
- NBuffers * sizeof(ConditionVariableMinimallyPadded),
- &foundIOCV);
+ .alignment = PG_IO_ALIGN_SIZE,
+ .ptr = (void **) &BufferBlocks,
+ );
+
+ ShmemRequestStruct(.name = "Buffer IO Condition Variables",
+ .size = NBuffers * sizeof(ConditionVariableMinimallyPadded),
+ /* Align condition variables to a cacheline boundary. */
+ .alignment = PG_CACHE_LINE_SIZE,
+ .ptr = (void **) &BufferIOCVArray,
+ );
/*
* The array used to sort to-be-checkpointed buffer ids is located in
@@ -99,80 +104,50 @@ BufferManagerShmemInit(void)
* the checkpointer is restarted, memory allocation failures would be
* painful.
*/
- CkptBufferIds = (CkptSortItem *)
- ShmemInitStruct("Checkpoint BufferIds",
- NBuffers * sizeof(CkptSortItem), &foundBufCkpt);
+ ShmemRequestStruct(.name = "Checkpoint BufferIds",
+ .size = NBuffers * sizeof(CkptSortItem),
+ .ptr = (void **) &CkptBufferIds,
+ );
+}
- if (foundDescs || foundBufs || foundIOCV || foundBufCkpt)
- {
- /* should find all of these, or none of them */
- Assert(foundDescs && foundBufs && foundIOCV && foundBufCkpt);
- /* note: this path is only taken in EXEC_BACKEND case */
- }
- else
+/*
+ * Initialize shared buffer pool
+ *
+ * This is called once during shared-memory initialization (either in the
+ * postmaster, or in a standalone backend).
+ */
+static void
+BufferManagerShmemInit(void *arg)
+{
+ /*
+ * Initialize all the buffer headers.
+ */
+ for (int i = 0; i < NBuffers; i++)
{
- int i;
+ BufferDesc *buf = GetBufferDescriptor(i);
- /*
- * Initialize all the buffer headers.
- */
- for (i = 0; i < NBuffers; i++)
- {
- BufferDesc *buf = GetBufferDescriptor(i);
+ ClearBufferTag(&buf->tag);
- ClearBufferTag(&buf->tag);
+ pg_atomic_init_u64(&buf->state, 0);
+ buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
- pg_atomic_init_u64(&buf->state, 0);
- buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
+ buf->buf_id = i;
- buf->buf_id = i;
+ pgaio_wref_clear(&buf->io_wref);
- pgaio_wref_clear(&buf->io_wref);
-
- proclist_init(&buf->lock_waiters);
- ConditionVariableInit(BufferDescriptorGetIOCV(buf));
- }
+ proclist_init(&buf->lock_waiters);
+ ConditionVariableInit(BufferDescriptorGetIOCV(buf));
}
- /* Init other shared buffer-management stuff */
- StrategyInitialize(!foundDescs);
-
/* Initialize per-backend file flush context */
WritebackContextInit(&BackendWritebackContext,
&backend_flush_after);
}
-/*
- * BufferManagerShmemSize
- *
- * compute the size of shared memory for the buffer pool including
- * data pages, buffer descriptors, hash tables, etc.
- */
-Size
-BufferManagerShmemSize(void)
+static void
+BufferManagerShmemAttach(void *arg)
{
- Size size = 0;
-
- /* size of buffer descriptors */
- size = add_size(size, mul_size(NBuffers, sizeof(BufferDescPadded)));
- /* to allow aligning buffer descriptors */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of data pages, plus alignment padding */
- size = add_size(size, PG_IO_ALIGN_SIZE);
- size = add_size(size, mul_size(NBuffers, BLCKSZ));
-
- /* size of stuff controlled by freelist.c */
- size = add_size(size, StrategyShmemSize());
-
- /* size of I/O condition variables */
- size = add_size(size, mul_size(NBuffers,
- sizeof(ConditionVariableMinimallyPadded)));
- /* to allow aligning the above */
- size = add_size(size, PG_CACHE_LINE_SIZE);
-
- /* size of checkpoint sort array in bufmgr.c */
- size = add_size(size, mul_size(NBuffers, sizeof(CkptSortItem)));
-
- return size;
+ /* Initialize per-backend file flush context */
+ WritebackContextInit(&BackendWritebackContext,
+ &backend_flush_after);
}
diff --git a/src/backend/storage/buffer/buf_table.c b/src/backend/storage/buffer/buf_table.c
index d04ef74b850..347bf267d73 100644
--- a/src/backend/storage/buffer/buf_table.c
+++ b/src/backend/storage/buffer/buf_table.c
@@ -22,6 +22,7 @@
#include "postgres.h"
#include "storage/buf_internals.h"
+#include "storage/subsystems.h"
/* entry for buffer lookup hashtable */
typedef struct
@@ -32,37 +33,42 @@ typedef struct
static HTAB *SharedBufHash;
+static void BufTableShmemRequest(void *arg);
-/*
- * Estimate space needed for mapping hashtable
- * size is the desired hash table size (possibly more than NBuffers)
- */
-Size
-BufTableShmemSize(int size)
-{
- return hash_estimate_size(size, sizeof(BufferLookupEnt));
-}
+const ShmemCallbacks BufTableShmemCallbacks = {
+ .request_fn = BufTableShmemRequest,
+ /* no special initialization needed, the hash table will start empty */
+};
/*
- * Initialize shmem hash table for mapping buffers
+ * Register shmem hash table for mapping buffers.
* size is the desired hash table size (possibly more than NBuffers)
*/
void
-InitBufTable(int size)
+BufTableShmemRequest(void *arg)
{
- HASHCTL info;
-
- /* assume no locking is needed yet */
-
- /* BufferTag maps to Buffer */
- info.keysize = sizeof(BufferTag);
- info.entrysize = sizeof(BufferLookupEnt);
- info.num_partitions = NUM_BUFFER_PARTITIONS;
-
- SharedBufHash = ShmemInitHash("Shared Buffer Lookup Table",
- size,
- &info,
- HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE);
+ int size;
+
+ /*
+ * Request the shared buffer lookup hashtable.
+ *
+ * Since we can't tolerate running out of lookup table entries, we must be
+ * sure to specify an adequate table size here. The maximum steady-state
+ * usage is of course NBuffers entries, but BufferAlloc() tries to insert
+ * a new entry before deleting the old. In principle this could be
+ * happening in each partition concurrently, so we could need as many as
+ * NBuffers + NUM_BUFFER_PARTITIONS entries.
+ */
+ size = NBuffers + NUM_BUFFER_PARTITIONS;
+
+ ShmemRequestHash(.name = "Shared Buffer Lookup Table",
+ .nelems = size,
+ .ptr = &SharedBufHash,
+ .hash_info.keysize = sizeof(BufferTag),
+ .hash_info.entrysize = sizeof(BufferLookupEnt),
+ .hash_info.num_partitions = NUM_BUFFER_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION | HASH_FIXED_SIZE,
+ );
}
/*
diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index b7687836188..fdb5bad7910 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -20,6 +20,8 @@
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "storage/proc.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#define INT_ACCESS_ONCE(var) ((int)(*((volatile int *)&(var))))
@@ -56,6 +58,14 @@ typedef struct
/* Pointers to shared state */
static BufferStrategyControl *StrategyControl = NULL;
+static void StrategyCtlShmemRequest(void *arg);
+static void StrategyCtlShmemInit(void *arg);
+
+const ShmemCallbacks StrategyCtlShmemCallbacks = {
+ .request_fn = StrategyCtlShmemRequest,
+ .init_fn = StrategyCtlShmemInit,
+};
+
/*
* Private (non-shared) state for managing a ring of shared buffers to re-use.
* This is currently the only kind of BufferAccessStrategy object, but someday
@@ -369,80 +379,35 @@ StrategyNotifyBgWriter(int bgwprocno)
/*
- * StrategyShmemSize
- *
- * estimate the size of shared memory used by the freelist-related structures.
- *
- * Note: for somewhat historical reasons, the buffer lookup hashtable size
- * is also determined here.
+ * StrategyCtlShmemRequest -- request shared memory for the buffer
+ * cache replacement strategy.
*/
-Size
-StrategyShmemSize(void)
+static void
+StrategyCtlShmemRequest(void *arg)
{
- Size size = 0;
-
- /* size of lookup hash table ... see comment in StrategyInitialize */
- size = add_size(size, BufTableShmemSize(NBuffers + NUM_BUFFER_PARTITIONS));
-
- /* size of the shared replacement strategy control block */
- size = add_size(size, MAXALIGN(sizeof(BufferStrategyControl)));
-
- return size;
+ ShmemRequestStruct(.name = "Buffer Strategy Status",
+ .size = sizeof(BufferStrategyControl),
+ .ptr = (void **) &StrategyControl
+ );
}
/*
- * StrategyInitialize -- initialize the buffer cache replacement
- * strategy.
- *
- * Assumes: All of the buffers are already built into a linked list.
- * Only called by postmaster and only during initialization.
+ * StrategyCtlShmemInit -- initialize the buffer cache replacement strategy.
*/
-void
-StrategyInitialize(bool init)
+static void
+StrategyCtlShmemInit(void *arg)
{
- bool found;
+ SpinLockInit(&StrategyControl->buffer_strategy_lock);
- /*
- * Initialize the shared buffer lookup hashtable.
- *
- * Since we can't tolerate running out of lookup table entries, we must be
- * sure to specify an adequate table size here. The maximum steady-state
- * usage is of course NBuffers entries, but BufferAlloc() tries to insert
- * a new entry before deleting the old. In principle this could be
- * happening in each partition concurrently, so we could need as many as
- * NBuffers + NUM_BUFFER_PARTITIONS entries.
- */
- InitBufTable(NBuffers + NUM_BUFFER_PARTITIONS);
-
- /*
- * Get or create the shared strategy control block
- */
- StrategyControl = (BufferStrategyControl *)
- ShmemInitStruct("Buffer Strategy Status",
- sizeof(BufferStrategyControl),
- &found);
-
- if (!found)
- {
- /*
- * Only done once, usually in postmaster
- */
- Assert(init);
-
- SpinLockInit(&StrategyControl->buffer_strategy_lock);
+ /* Initialize the clock-sweep pointer */
+ pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
- /* Initialize the clock-sweep pointer */
- pg_atomic_init_u32(&StrategyControl->nextVictimBuffer, 0);
+ /* Clear statistics */
+ StrategyControl->completePasses = 0;
+ pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
- /* Clear statistics */
- StrategyControl->completePasses = 0;
- pg_atomic_init_u32(&StrategyControl->numBufferAllocs, 0);
-
- /* No pending notification */
- StrategyControl->bgwprocno = -1;
- }
- else
- Assert(!init);
+ /* No pending notification */
+ StrategyControl->bgwprocno = -1;
}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index a510c928daa..f64c1d59fa3 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -39,7 +39,6 @@
#include "replication/walreceiver.h"
#include "replication/walsender.h"
#include "storage/aio_subsys.h"
-#include "storage/bufmgr.h"
#include "storage/dsm.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
@@ -99,7 +98,6 @@ CalculateShmemSize(void)
size = add_size(size, ShmemGetRequestedSize());
/* legacy subsystems */
- size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
size = add_size(size, XLOGShmemSize());
@@ -263,7 +261,6 @@ CreateOrAttachShmemStructs(void)
XLOGShmemInit();
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
- BufferManagerShmemInit();
/*
* Set up lock manager
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index ad1b7b2216a..89615a254a3 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -587,12 +587,7 @@ extern bool StrategyRejectBuffer(BufferAccessStrategy strategy,
extern int StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc);
extern void StrategyNotifyBgWriter(int bgwprocno);
-extern Size StrategyShmemSize(void);
-extern void StrategyInitialize(bool init);
-
/* buf_table.c */
-extern Size BufTableShmemSize(int size);
-extern void InitBufTable(int size);
extern uint32 BufTableHashCode(BufferTag *tagPtr);
extern int BufTableLookup(BufferTag *tagPtr, uint32 hashcode);
extern int BufTableInsert(BufferTag *tagPtr, uint32 hashcode, int buf_id);
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index aa61a39d9e6..6837b35fc6d 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -371,10 +371,6 @@ extern void MarkDirtyAllUnpinnedBuffers(int32 *buffers_dirtied,
int32 *buffers_already_dirty,
int32 *buffers_skipped);
-/* in buf_init.c */
-extern void BufferManagerShmemInit(void);
-extern Size BufferManagerShmemSize(void);
-
/* in localbuf.c */
extern void AtProcExit_LocalBuffers(void);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index b438794d46d..d8e11756a61 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -36,6 +36,9 @@ PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
PG_SHMEM_SUBSYSTEM(MultiXactShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BufferManagerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StrategyCtlShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BufTableShmemCallbacks)
/* predicate lock manager */
PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
--
2.47.3
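The conversions above all follow the same shape: a const callbacks struct with a request function, an optional init function and an optional attach function. The sketch below models how a driver might dispatch such callbacks. It is a toy under stated assumptions: the `Sketch*` and `sketch_*` names are hypothetical, and the rule that a process runs either init or attach (never both) is inferred from the patch, where the attach callback repeats the per-backend part of init.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical analogue of a callbacks struct; any member may be NULL. */
typedef struct SketchCallbacks
{
    void        (*request_fn) (void *arg);  /* declare what is needed */
    void        (*init_fn) (void *arg);     /* first-time initialization */
    void        (*attach_fn) (void *arg);   /* setup when attaching later */
} SketchCallbacks;

/* Per-subsystem bookkeeping used only by this sketch. */
typedef struct SketchState
{
    int         requested;
    int         initialized;
    int         attached;
} SketchState;

static void
sketch_request(void *arg)
{
    ((SketchState *) arg)->requested++;
}

static void
sketch_init(void *arg)
{
    ((SketchState *) arg)->initialized++;
}

/* A subsystem that needs no attach step simply leaves attach_fn unset. */
static const SketchCallbacks sketch_callbacks = {
    .request_fn = sketch_request,
    .init_fn = sketch_init,
};

/*
 * Assumed driver: always run the request callback, then run init when the
 * area was just created, or attach when it already existed.
 */
static void
sketch_run(const SketchCallbacks *cb, bool first_time, void *arg)
{
    if (cb->request_fn != NULL)
        cb->request_fn(arg);

    if (first_time)
    {
        if (cb->init_fn != NULL)
            cb->init_fn(arg);
    }
    else if (cb->attach_fn != NULL)
        cb->attach_fn(arg);
}
```

Leaving optional members NULL mirrors how the buffer table callbacks above set only the request function, since the hash table starts empty.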
[text/x-patch] v12-0013-Convert-all-remaining-subsystems-to-use-the-new-.patch (110.2K, 14-v12-0013-Convert-all-remaining-subsystems-to-use-the-new-.patch)
From 1c60393887bca9298aa7e47ccd68f8b1e5acf96f Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sun, 5 Apr 2026 20:21:53 +0300
Subject: [PATCH v12 13/13] Convert all remaining subsystems to use the new
shmem allocation API
This removes all remaining uses of ShmemInitStruct() and
ShmemInitHash() from built-in code.
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com
---
src/backend/access/common/syncscan.c | 76 ++++----
src/backend/access/nbtree/nbtutils.c | 54 +++---
src/backend/access/transam/twophase.c | 75 ++++----
src/backend/access/transam/xlog.c | 82 +++++----
src/backend/access/transam/xlogprefetcher.c | 51 +++---
src/backend/access/transam/xlogrecovery.c | 35 ++--
src/backend/access/transam/xlogwait.c | 50 ++---
src/backend/postmaster/autovacuum.c | 79 ++++----
src/backend/postmaster/bgworker.c | 105 +++++------
src/backend/postmaster/checkpointer.c | 56 +++---
src/backend/postmaster/datachecksum_state.c | 41 ++---
src/backend/postmaster/pgarch.c | 43 +++--
src/backend/postmaster/walsummarizer.c | 60 +++---
src/backend/replication/logical/launcher.c | 56 +++---
src/backend/replication/logical/logicalctl.c | 29 ++-
src/backend/replication/logical/origin.c | 59 +++---
src/backend/replication/logical/slotsync.c | 41 +++--
src/backend/replication/slot.c | 64 +++----
src/backend/replication/walreceiverfuncs.c | 51 +++---
src/backend/replication/walsender.c | 59 +++---
src/backend/storage/ipc/ipci.c | 124 +------------
src/backend/storage/lmgr/lock.c | 109 +++++------
src/backend/utils/activity/backend_status.c | 173 +++++++-----------
src/backend/utils/activity/pgstat_shmem.c | 158 ++++++++--------
src/backend/utils/activity/wait_event.c | 83 ++++-----
src/backend/utils/misc/injection_point.c | 57 +++---
src/include/access/nbtree.h | 2 -
src/include/access/syncscan.h | 2 -
src/include/access/twophase.h | 3 -
src/include/access/xlog.h | 2 -
src/include/access/xlogprefetcher.h | 3 -
src/include/access/xlogrecovery.h | 3 -
src/include/access/xlogwait.h | 2 -
src/include/pgstat.h | 4 -
src/include/postmaster/autovacuum.h | 4 -
src/include/postmaster/bgworker_internals.h | 2 -
src/include/postmaster/bgwriter.h | 3 -
src/include/postmaster/datachecksum_state.h | 4 -
src/include/postmaster/pgarch.h | 2 -
src/include/postmaster/walsummarizer.h | 2 -
src/include/replication/logicalctl.h | 2 -
src/include/replication/logicallauncher.h | 3 -
src/include/replication/origin.h | 4 -
src/include/replication/slot.h | 4 -
src/include/replication/slotsync.h | 2 -
src/include/replication/walreceiver.h | 2 -
src/include/replication/walsender.h | 2 -
src/include/storage/lock.h | 2 -
src/include/storage/subsystemlist.h | 27 +++
src/include/utils/backend_status.h | 8 -
src/include/utils/injection_point.h | 3 -
src/include/utils/wait_event.h | 2 -
.../injection_points/injection_points.c | 59 ++----
src/test/modules/test_aio/test_aio.c | 107 +++++------
54 files changed, 927 insertions(+), 1208 deletions(-)
diff --git a/src/backend/access/common/syncscan.c b/src/backend/access/common/syncscan.c
index 6fcfcb0e560..0f9eb167bed 100644
--- a/src/backend/access/common/syncscan.c
+++ b/src/backend/access/common/syncscan.c
@@ -50,6 +50,7 @@
#include "miscadmin.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/rel.h"
@@ -111,6 +112,14 @@ typedef struct ss_scan_locations_t
#define SizeOfScanLocations(N) \
(offsetof(ss_scan_locations_t, items) + (N) * sizeof(ss_lru_item_t))
+static void SyncScanShmemRequest(void *arg);
+static void SyncScanShmemInit(void *arg);
+
+const ShmemCallbacks SyncScanShmemCallbacks = {
+ .request_fn = SyncScanShmemRequest,
+ .init_fn = SyncScanShmemInit,
+};
+
/* Pointer to struct in shared memory */
static ss_scan_locations_t *scan_locations;
@@ -120,58 +129,47 @@ static BlockNumber ss_search(RelFileLocator relfilelocator,
/*
- * SyncScanShmemSize --- report amount of shared memory space needed
+ * SyncScanShmemRequest --- register this module's shared memory
*/
-Size
-SyncScanShmemSize(void)
+static void
+SyncScanShmemRequest(void *arg)
{
- return SizeOfScanLocations(SYNC_SCAN_NELEM);
+ ShmemRequestStruct(.name = "Sync Scan Locations List",
+ .size = SizeOfScanLocations(SYNC_SCAN_NELEM),
+ .ptr = (void **) &scan_locations,
+ );
}
/*
* SyncScanShmemInit --- initialize this module's shared memory
*/
-void
-SyncScanShmemInit(void)
+static void
+SyncScanShmemInit(void *arg)
{
int i;
- bool found;
- scan_locations = (ss_scan_locations_t *)
- ShmemInitStruct("Sync Scan Locations List",
- SizeOfScanLocations(SYNC_SCAN_NELEM),
- &found);
+ scan_locations->head = &scan_locations->items[0];
+ scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
- if (!IsUnderPostmaster)
+ for (i = 0; i < SYNC_SCAN_NELEM; i++)
{
- /* Initialize shared memory area */
- Assert(!found);
-
- scan_locations->head = &scan_locations->items[0];
- scan_locations->tail = &scan_locations->items[SYNC_SCAN_NELEM - 1];
-
- for (i = 0; i < SYNC_SCAN_NELEM; i++)
- {
- ss_lru_item_t *item = &scan_locations->items[i];
-
- /*
- * Initialize all slots with invalid values. As scans are started,
- * these invalid entries will fall off the LRU list and get
- * replaced with real entries.
- */
- item->location.relfilelocator.spcOid = InvalidOid;
- item->location.relfilelocator.dbOid = InvalidOid;
- item->location.relfilelocator.relNumber = InvalidRelFileNumber;
- item->location.location = InvalidBlockNumber;
-
- item->prev = (i > 0) ?
- (&scan_locations->items[i - 1]) : NULL;
- item->next = (i < SYNC_SCAN_NELEM - 1) ?
- (&scan_locations->items[i + 1]) : NULL;
- }
+ ss_lru_item_t *item = &scan_locations->items[i];
+
+ /*
+ * Initialize all slots with invalid values. As scans are started,
+ * these invalid entries will fall off the LRU list and get replaced
+ * with real entries.
+ */
+ item->location.relfilelocator.spcOid = InvalidOid;
+ item->location.relfilelocator.dbOid = InvalidOid;
+ item->location.relfilelocator.relNumber = InvalidRelFileNumber;
+ item->location.location = InvalidBlockNumber;
+
+ item->prev = (i > 0) ?
+ (&scan_locations->items[i - 1]) : NULL;
+ item->next = (i < SYNC_SCAN_NELEM - 1) ?
+ (&scan_locations->items[i + 1]) : NULL;
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c
index 732bc750c9e..014faa1622f 100644
--- a/src/backend/access/nbtree/nbtutils.c
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -25,6 +25,7 @@
#include "lib/qunique.h"
#include "miscadmin.h"
#include "storage/lwlock.h"
+#include "storage/subsystems.h"
#include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -417,6 +418,13 @@ typedef struct BTVacInfo
static BTVacInfo *btvacinfo;
+static void BTreeShmemRequest(void *arg);
+static void BTreeShmemInit(void *arg);
+
+const ShmemCallbacks BTreeShmemCallbacks = {
+ .request_fn = BTreeShmemRequest,
+ .init_fn = BTreeShmemInit,
+};
/*
* _bt_vacuum_cycleid --- get the active vacuum cycle ID for an index,
@@ -553,47 +561,37 @@ _bt_end_vacuum_callback(int code, Datum arg)
}
/*
- * BTreeShmemSize --- report amount of shared memory space needed
+ * BTreeShmemRequest --- register this module's shared memory
*/
-Size
-BTreeShmemSize(void)
+static void
+BTreeShmemRequest(void *arg)
{
Size size;
size = offsetof(BTVacInfo, vacuums);
size = add_size(size, mul_size(MaxBackends, sizeof(BTOneVacInfo)));
- return size;
+
+ ShmemRequestStruct(.name = "BTree Vacuum State",
+ .size = size,
+ .ptr = (void **) &btvacinfo,
+ );
}
/*
* BTreeShmemInit --- initialize this module's shared memory
*/
-void
-BTreeShmemInit(void)
+static void
+BTreeShmemInit(void *arg)
{
- bool found;
-
- btvacinfo = (BTVacInfo *) ShmemInitStruct("BTree Vacuum State",
- BTreeShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- /* Initialize shared memory area */
- Assert(!found);
-
- /*
- * It doesn't really matter what the cycle counter starts at, but
- * having it always start the same doesn't seem good. Seed with
- * low-order bits of time() instead.
- */
- btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
+ /*
+ * It doesn't really matter what the cycle counter starts at, but having
+ * it always start the same doesn't seem good. Seed with low-order bits
+ * of time() instead.
+ */
+ btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
- btvacinfo->num_vacuums = 0;
- btvacinfo->max_vacuums = MaxBackends;
- }
- else
- Assert(found);
+ btvacinfo->num_vacuums = 0;
+ btvacinfo->max_vacuums = MaxBackends;
}
bytea *
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index ab1cbd67bac..1035e8b3fc7 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -102,6 +102,7 @@
#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/injection_point.h"
#include "utils/memutils.h"
@@ -189,6 +190,14 @@ typedef struct TwoPhaseStateData
static TwoPhaseStateData *TwoPhaseState;
+static void TwoPhaseShmemRequest(void *arg);
+static void TwoPhaseShmemInit(void *arg);
+
+const ShmemCallbacks TwoPhaseShmemCallbacks = {
+ .request_fn = TwoPhaseShmemRequest,
+ .init_fn = TwoPhaseShmemInit,
+};
+
/*
* Global transaction entry currently locked by us, if any. Note that any
* access to the entry pointed to by this variable must be protected by
@@ -234,10 +243,10 @@ static void RemoveTwoPhaseFile(FullTransactionId fxid, bool giveWarning);
static void RecreateTwoPhaseFile(FullTransactionId fxid, void *content, int len);
/*
- * Initialization of shared memory
+ * Register shared memory for two-phase state.
*/
-Size
-TwoPhaseShmemSize(void)
+static void
+TwoPhaseShmemRequest(void *arg)
{
Size size;
@@ -248,46 +257,40 @@ TwoPhaseShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_prepared_xacts,
sizeof(GlobalTransactionData)));
-
- return size;
+ ShmemRequestStruct(.name = "Prepared Transaction Table",
+ .size = size,
+ .ptr = (void **) &TwoPhaseState,
+ );
}
-void
-TwoPhaseShmemInit(void)
+/*
+ * Initialize shared memory for two-phase state.
+ */
+static void
+TwoPhaseShmemInit(void *arg)
{
- bool found;
-
- TwoPhaseState = ShmemInitStruct("Prepared Transaction Table",
- TwoPhaseShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- GlobalTransaction gxacts;
- int i;
+ GlobalTransaction gxacts;
+ int i;
- Assert(!found);
- TwoPhaseState->freeGXacts = NULL;
- TwoPhaseState->numPrepXacts = 0;
+ TwoPhaseState->freeGXacts = NULL;
+ TwoPhaseState->numPrepXacts = 0;
- /*
- * Initialize the linked list of free GlobalTransactionData structs
- */
- gxacts = (GlobalTransaction)
- ((char *) TwoPhaseState +
- MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
- sizeof(GlobalTransaction) * max_prepared_xacts));
- for (i = 0; i < max_prepared_xacts; i++)
- {
- /* insert into linked list */
- gxacts[i].next = TwoPhaseState->freeGXacts;
- TwoPhaseState->freeGXacts = &gxacts[i];
+ /*
+ * Initialize the linked list of free GlobalTransactionData structs
+ */
+ gxacts = (GlobalTransaction)
+ ((char *) TwoPhaseState +
+ MAXALIGN(offsetof(TwoPhaseStateData, prepXacts) +
+ sizeof(GlobalTransaction) * max_prepared_xacts));
+ for (i = 0; i < max_prepared_xacts; i++)
+ {
+ /* insert into linked list */
+ gxacts[i].next = TwoPhaseState->freeGXacts;
+ TwoPhaseState->freeGXacts = &gxacts[i];
- /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
- gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
- }
+ /* associate it with a PGPROC assigned by ProcGlobalShmemInit */
+ gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]);
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9e8999bbb61..b82af9a85c0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -96,6 +96,7 @@
#include "storage/procsignal.h"
#include "storage/reinit.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "storage/sync.h"
#include "utils/guc_hooks.h"
#include "utils/guc_tables.h"
@@ -579,8 +580,19 @@ static WALInsertLockPadded *WALInsertLocks = NULL;
/*
* We maintain an image of pg_control in shared memory.
*/
+static ControlFileData *LocalControlFile = NULL;
static ControlFileData *ControlFile = NULL;
+static void XLOGShmemRequest(void *arg);
+static void XLOGShmemInit(void *arg);
+static void XLOGShmemAttach(void *arg);
+
+const ShmemCallbacks XLOGShmemCallbacks = {
+ .request_fn = XLOGShmemRequest,
+ .init_fn = XLOGShmemInit,
+ .attach_fn = XLOGShmemAttach,
+};
+
/*
* Calculate the amount of space left on the page after 'endptr'. Beware
* multiple evaluation!
@@ -5257,7 +5269,8 @@ void
LocalProcessControlFile(bool reset)
{
Assert(reset || ControlFile == NULL);
- ControlFile = palloc_object(ControlFileData);
+ LocalControlFile = palloc_object(ControlFileData);
+ ControlFile = LocalControlFile;
ReadControlFile();
SetLocalDataChecksumState(ControlFile->data_checksum_version);
}
@@ -5274,10 +5287,10 @@ GetActiveWalLevelOnStandby(void)
}
/*
- * Initialization of shared memory for XLOG
+ * Register shared memory for XLOG.
*/
-Size
-XLOGShmemSize(void)
+static void
+XLOGShmemRequest(void *arg)
{
Size size;
@@ -5317,23 +5330,24 @@ XLOGShmemSize(void)
/* and the buffers themselves */
size = add_size(size, mul_size(XLOG_BLCKSZ, XLOGbuffers));
- /*
- * Note: we don't count ControlFileData, it comes out of the "slop factor"
- * added by CreateSharedMemoryAndSemaphores. This lets us use this
- * routine again below to compute the actual allocation size.
- */
-
- return size;
+ ShmemRequestStruct(.name = "XLOG Ctl",
+ .size = size,
+ .ptr = (void **) &XLogCtl,
+ );
+ ShmemRequestStruct(.name = "Control File",
+ .size = sizeof(ControlFileData),
+ .ptr = (void **) &ControlFile,
+ );
}
-void
-XLOGShmemInit(void)
+/*
+ * XLOGShmemInit - initialize the XLogCtl shared memory area.
+ */
+static void
+XLOGShmemInit(void *arg)
{
- bool foundCFile,
- foundXLog;
char *allocptr;
int i;
- ControlFileData *localControlFile;
#ifdef WAL_DEBUG
@@ -5351,36 +5365,17 @@ XLOGShmemInit(void)
}
#endif
-
- XLogCtl = (XLogCtlData *)
- ShmemInitStruct("XLOG Ctl", XLOGShmemSize(), &foundXLog);
-
- localControlFile = ControlFile;
- ControlFile = (ControlFileData *)
- ShmemInitStruct("Control File", sizeof(ControlFileData), &foundCFile);
-
- if (foundCFile || foundXLog)
- {
- /* both should be present or neither */
- Assert(foundCFile && foundXLog);
-
- /* Initialize local copy of WALInsertLocks */
- WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
-
- if (localControlFile)
- pfree(localControlFile);
- return;
- }
memset(XLogCtl, 0, sizeof(XLogCtlData));
/*
* Already have read control file locally, unless in bootstrap mode. Move
* contents into shared memory.
*/
- if (localControlFile)
+ if (LocalControlFile)
{
- memcpy(ControlFile, localControlFile, sizeof(ControlFileData));
- pfree(localControlFile);
+ memcpy(ControlFile, LocalControlFile, sizeof(ControlFileData));
+ pfree(LocalControlFile);
+ LocalControlFile = NULL;
}
/*
@@ -5442,6 +5437,15 @@ XLOGShmemInit(void)
pg_atomic_init_u64(&XLogCtl->unloggedLSN, InvalidXLogRecPtr);
}
+/*
+ * XLOGShmemAttach - re-establish WALInsertLocks pointer after attaching.
+ */
+static void
+XLOGShmemAttach(void *arg)
+{
+ WALInsertLocks = XLogCtl->Insert.WALInsertLocks;
+}
+
/*
* This func must be called ONCE on system install. It creates pg_control
* and the initial XLOG segment.
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index c235eca7c51..83a3f97a57c 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -39,6 +39,7 @@
#include "storage/fd.h"
#include "storage/shmem.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/hsearch.h"
@@ -200,6 +201,14 @@ static LsnReadQueueNextStatus XLogPrefetcherNextBlock(uintptr_t pgsr_private,
static XLogPrefetchStats *SharedStats;
+static void XLogPrefetchShmemRequest(void *arg);
+static void XLogPrefetchShmemInit(void *arg);
+
+const ShmemCallbacks XLogPrefetchShmemCallbacks = {
+ .request_fn = XLogPrefetchShmemRequest,
+ .init_fn = XLogPrefetchShmemInit,
+};
+
static inline LsnReadQueue *
lrq_alloc(uint32 max_distance,
uint32 max_inflight,
@@ -292,10 +301,25 @@ lrq_complete_lsn(LsnReadQueue *lrq, XLogRecPtr lsn)
lrq_prefetch(lrq);
}
-size_t
-XLogPrefetchShmemSize(void)
+static void
+XLogPrefetchShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "XLogPrefetchStats",
+ .size = sizeof(XLogPrefetchStats),
+ .ptr = (void **) &SharedStats,
+ );
+}
+
+static void
+XLogPrefetchShmemInit(void *arg)
{
- return sizeof(XLogPrefetchStats);
+ pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
+ pg_atomic_init_u64(&SharedStats->prefetch, 0);
+ pg_atomic_init_u64(&SharedStats->hit, 0);
+ pg_atomic_init_u64(&SharedStats->skip_init, 0);
+ pg_atomic_init_u64(&SharedStats->skip_new, 0);
+ pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
+ pg_atomic_init_u64(&SharedStats->skip_rep, 0);
}
/*
@@ -313,27 +337,6 @@ XLogPrefetchResetStats(void)
pg_atomic_write_u64(&SharedStats->skip_rep, 0);
}
-void
-XLogPrefetchShmemInit(void)
-{
- bool found;
-
- SharedStats = (XLogPrefetchStats *)
- ShmemInitStruct("XLogPrefetchStats",
- sizeof(XLogPrefetchStats),
- &found);
-
- if (!found)
- {
- pg_atomic_init_u64(&SharedStats->reset_time, GetCurrentTimestamp());
- pg_atomic_init_u64(&SharedStats->prefetch, 0);
- pg_atomic_init_u64(&SharedStats->hit, 0);
- pg_atomic_init_u64(&SharedStats->skip_init, 0);
- pg_atomic_init_u64(&SharedStats->skip_new, 0);
- pg_atomic_init_u64(&SharedStats->skip_fpw, 0);
- pg_atomic_init_u64(&SharedStats->skip_rep, 0);
- }
-}
/*
* Called when any GUC is changed that affects prefetching.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index fd1c36d061d..c236e2b7969 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -58,6 +58,7 @@
#include "storage/pmsignal.h"
#include "storage/procarray.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/datetime.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
@@ -307,6 +308,14 @@ static char *primary_image_masked = NULL;
XLogRecoveryCtlData *XLogRecoveryCtl = NULL;
+static void XLogRecoveryShmemRequest(void *arg);
+static void XLogRecoveryShmemInit(void *arg);
+
+const ShmemCallbacks XLogRecoveryShmemCallbacks = {
+ .request_fn = XLogRecoveryShmemRequest,
+ .init_fn = XLogRecoveryShmemInit,
+};
+
/*
* abortedRecPtr is the start pointer of a broken record at end of WAL when
* recovery completes; missingContrecPtr is the location of the first
@@ -385,28 +394,20 @@ static void SetCurrentChunkStartTime(TimestampTz xtime);
static void SetLatestXTime(TimestampTz xtime);
/*
- * Initialization of shared memory for WAL recovery
+ * Register shared memory for WAL recovery
*/
-Size
-XLogRecoveryShmemSize(void)
+static void
+XLogRecoveryShmemRequest(void *arg)
{
- Size size;
-
- /* XLogRecoveryCtl */
- size = sizeof(XLogRecoveryCtlData);
-
- return size;
+ ShmemRequestStruct(.name = "XLOG Recovery Ctl",
+ .size = sizeof(XLogRecoveryCtlData),
+ .ptr = (void **) &XLogRecoveryCtl,
+ );
}
-void
-XLogRecoveryShmemInit(void)
+static void
+XLogRecoveryShmemInit(void *arg)
{
- bool found;
-
- XLogRecoveryCtl = (XLogRecoveryCtlData *)
- ShmemInitStruct("XLOG Recovery Ctl", XLogRecoveryShmemSize(), &found);
- if (found)
- return;
memset(XLogRecoveryCtl, 0, sizeof(XLogRecoveryCtlData));
SpinLockInit(&XLogRecoveryCtl->info_lck);
diff --git a/src/backend/access/transam/xlogwait.c b/src/backend/access/transam/xlogwait.c
index bf4630677b4..2e31c0d67d7 100644
--- a/src/backend/access/transam/xlogwait.c
+++ b/src/backend/access/transam/xlogwait.c
@@ -57,6 +57,7 @@
#include "storage/latch.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/fmgrprotos.h"
#include "utils/pg_lsn.h"
#include "utils/snapmgr.h"
@@ -68,6 +69,14 @@ static int waitlsn_cmp(const pairingheap_node *a, const pairingheap_node *b,
struct WaitLSNState *waitLSNState = NULL;
+static void WaitLSNShmemRequest(void *arg);
+static void WaitLSNShmemInit(void *arg);
+
+const ShmemCallbacks WaitLSNShmemCallbacks = {
+ .request_fn = WaitLSNShmemRequest,
+ .init_fn = WaitLSNShmemInit,
+};
+
/*
* Wait event for each WaitLSNType, used with WaitLatch() to report
* the wait in pg_stat_activity.
@@ -109,41 +118,34 @@ GetCurrentLSNForWaitType(WaitLSNType lsnType)
pg_unreachable();
}
-/* Report the amount of shared memory space needed for WaitLSNState. */
-Size
-WaitLSNShmemSize(void)
+/* Register the shared memory space needed for WaitLSNState. */
+static void
+WaitLSNShmemRequest(void *arg)
{
Size size;
size = offsetof(WaitLSNState, procInfos);
size = add_size(size, mul_size(MaxBackends + NUM_AUXILIARY_PROCS, sizeof(WaitLSNProcInfo)));
- return size;
+ ShmemRequestStruct(.name = "WaitLSNState",
+ .size = size,
+ .ptr = (void **) &waitLSNState,
+ );
}
/* Initialize the WaitLSNState in the shared memory. */
-void
-WaitLSNShmemInit(void)
+static void
+WaitLSNShmemInit(void *arg)
{
- bool found;
-
- waitLSNState = (WaitLSNState *) ShmemInitStruct("WaitLSNState",
- WaitLSNShmemSize(),
- &found);
- if (!found)
+ /* Initialize heaps and tracking */
+ for (int i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
{
- int i;
-
- /* Initialize heaps and tracking */
- for (i = 0; i < WAIT_LSN_TYPE_COUNT; i++)
- {
- pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
- pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
- }
-
- /* Initialize process info array */
- memset(&waitLSNState->procInfos, 0,
- (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
+ pg_atomic_init_u64(&waitLSNState->minWaitedLSN[i], PG_UINT64_MAX);
+ pairingheap_initialize(&waitLSNState->waitersHeap[i], waitlsn_cmp, NULL);
}
+
+ /* Initialize process info array */
+ memset(&waitLSNState->procInfos, 0,
+ (MaxBackends + NUM_AUXILIARY_PROCS) * sizeof(WaitLSNProcInfo));
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 8400e6722cc..250c43b85e5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -98,6 +98,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/fmgroids.h"
#include "utils/fmgrprotos.h"
@@ -309,6 +310,14 @@ typedef struct
static AutoVacuumShmemStruct *AutoVacuumShmem;
+static void AutoVacuumShmemRequest(void *arg);
+static void AutoVacuumShmemInit(void *arg);
+
+const ShmemCallbacks AutoVacuumShmemCallbacks = {
+ .request_fn = AutoVacuumShmemRequest,
+ .init_fn = AutoVacuumShmemInit,
+};
+
/*
* the database list (of avl_dbase elements) in the launcher, and the context
* that contains it
@@ -3545,11 +3554,11 @@ autovac_init(void)
}
/*
- * AutoVacuumShmemSize
- * Compute space needed for autovacuum-related shared memory
+ * AutoVacuumShmemRequest
+ * Register shared memory space needed for autovacuum
*/
-Size
-AutoVacuumShmemSize(void)
+static void
+AutoVacuumShmemRequest(void *arg)
{
Size size;
@@ -3560,53 +3569,41 @@ AutoVacuumShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
- return size;
+
+ ShmemRequestStruct(.name = "AutoVacuum Data",
+ .size = size,
+ .ptr = (void **) &AutoVacuumShmem,
+ );
}
/*
* AutoVacuumShmemInit
- * Allocate and initialize autovacuum-related shared memory
+ * Initialize autovacuum-related shared memory
*/
-void
-AutoVacuumShmemInit(void)
+static void
+AutoVacuumShmemInit(void *arg)
{
- bool found;
-
- AutoVacuumShmem = (AutoVacuumShmemStruct *)
- ShmemInitStruct("AutoVacuum Data",
- AutoVacuumShmemSize(),
- &found);
-
- if (!IsUnderPostmaster)
- {
- WorkerInfo worker;
- int i;
+ WorkerInfo worker;
- Assert(!found);
-
- AutoVacuumShmem->av_launcherpid = 0;
- dclist_init(&AutoVacuumShmem->av_freeWorkers);
- dlist_init(&AutoVacuumShmem->av_runningWorkers);
- AutoVacuumShmem->av_startingWorker = NULL;
- memset(AutoVacuumShmem->av_workItems, 0,
- sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
-
- worker = (WorkerInfo) ((char *) AutoVacuumShmem +
- MAXALIGN(sizeof(AutoVacuumShmemStruct)));
-
- /* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_worker_slots; i++)
- {
- dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
- pg_atomic_init_flag(&worker[i].wi_dobalance);
- }
+ AutoVacuumShmem->av_launcherpid = 0;
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
+ dlist_init(&AutoVacuumShmem->av_runningWorkers);
+ AutoVacuumShmem->av_startingWorker = NULL;
+ memset(AutoVacuumShmem->av_workItems, 0,
+ sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
- pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+ worker = (WorkerInfo) ((char *) AutoVacuumShmem +
+ MAXALIGN(sizeof(AutoVacuumShmemStruct)));
+ /* initialize the WorkerInfo free list */
+ for (int i = 0; i < autovacuum_worker_slots; i++)
+ {
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
+ pg_atomic_init_flag(&worker[i].wi_dobalance);
}
- else
- Assert(found);
+
+ pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
}
/*
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 536aff7ca05..0992b9b6353 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -30,6 +30,7 @@
#include "storage/procarray.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/ascii.h"
#include "utils/memutils.h"
@@ -110,6 +111,14 @@ struct BackgroundWorkerHandle
static BackgroundWorkerArray *BackgroundWorkerData;
+static void BackgroundWorkerShmemRequest(void *arg);
+static void BackgroundWorkerShmemInit(void *arg);
+
+const ShmemCallbacks BackgroundWorkerShmemCallbacks = {
+ .request_fn = BackgroundWorkerShmemRequest,
+ .init_fn = BackgroundWorkerShmemInit,
+};
+
/*
* List of internal background worker entry points. We need this for
* reasons explained in LookupBackgroundWorkerFunction(), below.
@@ -160,10 +169,10 @@ static bgworker_main_type LookupBackgroundWorkerFunction(const char *libraryname
/*
- * Calculate shared memory needed.
+ * Register shared memory needed for background workers.
*/
-Size
-BackgroundWorkerShmemSize(void)
+static void
+BackgroundWorkerShmemRequest(void *arg)
{
Size size;
@@ -171,66 +180,58 @@ BackgroundWorkerShmemSize(void)
size = offsetof(BackgroundWorkerArray, slot);
size = add_size(size, mul_size(max_worker_processes,
sizeof(BackgroundWorkerSlot)));
-
- return size;
+ ShmemRequestStruct(.name = "Background Worker Data",
+ .size = size,
+ .ptr = (void **) &BackgroundWorkerData,
+ );
}
/*
- * Initialize shared memory.
+ * Initialize shared memory for background workers.
*/
-void
-BackgroundWorkerShmemInit(void)
+static void
+BackgroundWorkerShmemInit(void *arg)
{
- bool found;
-
- BackgroundWorkerData = ShmemInitStruct("Background Worker Data",
- BackgroundWorkerShmemSize(),
- &found);
- if (!IsUnderPostmaster)
- {
- dlist_iter iter;
- int slotno = 0;
+ dlist_iter iter;
+ int slotno = 0;
- BackgroundWorkerData->total_slots = max_worker_processes;
- BackgroundWorkerData->parallel_register_count = 0;
- BackgroundWorkerData->parallel_terminate_count = 0;
+ BackgroundWorkerData->total_slots = max_worker_processes;
+ BackgroundWorkerData->parallel_register_count = 0;
+ BackgroundWorkerData->parallel_terminate_count = 0;
- /*
- * Copy contents of worker list into shared memory. Record the shared
- * memory slot assigned to each worker. This ensures a 1-to-1
- * correspondence between the postmaster's private list and the array
- * in shared memory.
- */
- dlist_foreach(iter, &BackgroundWorkerList)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- RegisteredBgWorker *rw;
+ /*
+ * Copy contents of worker list into shared memory. Record the shared
+ * memory slot assigned to each worker. This ensures a 1-to-1
+ * correspondence between the postmaster's private list and the array in
+ * shared memory.
+ */
+ dlist_foreach(iter, &BackgroundWorkerList)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ RegisteredBgWorker *rw;
- rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
- Assert(slotno < max_worker_processes);
- slot->in_use = true;
- slot->terminate = false;
- slot->pid = InvalidPid;
- slot->generation = 0;
- rw->rw_shmem_slot = slotno;
- rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
- memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
- ++slotno;
- }
+ rw = dlist_container(RegisteredBgWorker, rw_lnode, iter.cur);
+ Assert(slotno < max_worker_processes);
+ slot->in_use = true;
+ slot->terminate = false;
+ slot->pid = InvalidPid;
+ slot->generation = 0;
+ rw->rw_shmem_slot = slotno;
+ rw->rw_worker.bgw_notify_pid = 0; /* might be reinit after crash */
+ memcpy(&slot->worker, &rw->rw_worker, sizeof(BackgroundWorker));
+ ++slotno;
+ }
- /*
- * Mark any remaining slots as not in use.
- */
- while (slotno < max_worker_processes)
- {
- BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
+ /*
+ * Mark any remaining slots as not in use.
+ */
+ while (slotno < max_worker_processes)
+ {
+ BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
- slot->in_use = false;
- ++slotno;
- }
+ slot->in_use = false;
+ ++slotno;
}
- else
- Assert(found);
}
/*
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 3c982c6ffac..6b424ee610f 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -63,6 +63,7 @@
#include "storage/shmem.h"
#include "storage/smgr.h"
#include "storage/spin.h"
+#include "storage/subsystems.h"
#include "utils/acl.h"
#include "utils/guc.h"
#include "utils/memutils.h"
@@ -143,6 +144,14 @@ typedef struct
static CheckpointerShmemStruct *CheckpointerShmem;
+static void CheckpointerShmemRequest(void *arg);
+static void CheckpointerShmemInit(void *arg);
+
+const ShmemCallbacks CheckpointerShmemCallbacks = {
+ .request_fn = CheckpointerShmemRequest,
+ .init_fn = CheckpointerShmemInit,
+};
+
/* interval for calling AbsorbSyncRequests in CheckpointWriteDelay */
#define WRITES_PER_ABSORB 1000
@@ -950,11 +959,11 @@ ReqShutdownXLOG(SIGNAL_ARGS)
*/
/*
- * CheckpointerShmemSize
- * Compute space needed for checkpointer-related shared memory
+ * CheckpointerShmemRequest
+ * Register shared memory space needed for checkpointer
*/
-Size
-CheckpointerShmemSize(void)
+static void
+CheckpointerShmemRequest(void *arg)
{
Size size;
@@ -967,39 +976,24 @@ CheckpointerShmemSize(void)
size = add_size(size, mul_size(Min(NBuffers,
MAX_CHECKPOINT_REQUESTS),
sizeof(CheckpointerRequest)));
-
- return size;
+ ShmemRequestStruct(.name = "Checkpointer Data",
+ .size = size,
+ .ptr = (void **) &CheckpointerShmem,
+ );
}
/*
* CheckpointerShmemInit
- * Allocate and initialize checkpointer-related shared memory
+ * Initialize checkpointer-related shared memory
*/
-void
-CheckpointerShmemInit(void)
+static void
+CheckpointerShmemInit(void *arg)
{
- Size size = CheckpointerShmemSize();
- bool found;
-
- CheckpointerShmem = (CheckpointerShmemStruct *)
- ShmemInitStruct("Checkpointer Data",
- size,
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. Note that we zero the whole
- * requests array; this is so that CompactCheckpointerRequestQueue can
- * assume that any pad bytes in the request structs are zeroes.
- */
- MemSet(CheckpointerShmem, 0, size);
- SpinLockInit(&CheckpointerShmem->ckpt_lck);
- CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
- CheckpointerShmem->head = CheckpointerShmem->tail = 0;
- ConditionVariableInit(&CheckpointerShmem->start_cv);
- ConditionVariableInit(&CheckpointerShmem->done_cv);
- }
+ SpinLockInit(&CheckpointerShmem->ckpt_lck);
+ CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
+ CheckpointerShmem->head = CheckpointerShmem->tail = 0;
+ ConditionVariableInit(&CheckpointerShmem->start_cv);
+ ConditionVariableInit(&CheckpointerShmem->done_cv);
}
/*
diff --git a/src/backend/postmaster/datachecksum_state.c b/src/backend/postmaster/datachecksum_state.c
index 76004bcedc6..eb7b01d0993 100644
--- a/src/backend/postmaster/datachecksum_state.c
+++ b/src/backend/postmaster/datachecksum_state.c
@@ -211,6 +211,7 @@
#include "storage/lwlock.h"
#include "storage/procarray.h"
#include "storage/smgr.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -346,6 +347,7 @@ static volatile sig_atomic_t launcher_running = false;
static DataChecksumsWorkerOperation operation;
/* Prototypes */
+static void DataChecksumsShmemRequest(void *arg);
static bool DatabaseExists(Oid dboid);
static List *BuildDatabaseList(void);
static List *BuildRelationList(bool temp_relations, bool include_shared);
@@ -356,6 +358,10 @@ static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferA
static void launcher_cancel_handler(SIGNAL_ARGS);
static void WaitForAllTransactionsToFinish(void);
+const ShmemCallbacks DataChecksumsShmemCallbacks = {
+ .request_fn = DataChecksumsShmemRequest,
+};
+
/*****************************************************************************
* Functionality for manipulating the data checksum state in the cluster
*/
@@ -1236,35 +1242,16 @@ ProcessAllDatabases(void)
}
/*
- * DataChecksumStateSize
- * Compute required space for datachecksumsworker-related shared memory
- */
-Size
-DataChecksumsShmemSize(void)
-{
- Size size;
-
- size = sizeof(DataChecksumsStateStruct);
- size = MAXALIGN(size);
-
- return size;
-}
-
-/*
- * DataChecksumStateInit
- * Allocate and initialize datachecksumsworker-related shared memory
+ * DataChecksumsShmemRequest
+ * Request datachecksumsworker-related shared memory
*/
-void
-DataChecksumsShmemInit(void)
+static void
+DataChecksumsShmemRequest(void *arg)
{
- bool found;
-
- DataChecksumState = (DataChecksumsStateStruct *)
- ShmemInitStruct("DataChecksumsWorker Data",
- DataChecksumsShmemSize(),
- &found);
- if (!found)
- MemSet(DataChecksumState, 0, DataChecksumsShmemSize());
+ ShmemRequestStruct(.name = "DataChecksumsWorker Data",
+ .size = sizeof(DataChecksumsStateStruct),
+ .ptr = (void **) &DataChecksumState,
+ );
}
/*
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index fa4bdfe9ab9..0a1a1149d78 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -48,6 +48,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
@@ -154,33 +155,31 @@ static int ready_file_comparator(Datum a, Datum b, void *arg);
static void LoadArchiveLibrary(void);
static void pgarch_call_module_shutdown_cb(int code, Datum arg);
-/* Report shared memory space needed by PgArchShmemInit */
-Size
-PgArchShmemSize(void)
-{
- Size size = 0;
+static void PgArchShmemRequest(void *arg);
+static void PgArchShmemInit(void *arg);
- size = add_size(size, sizeof(PgArchData));
+const ShmemCallbacks PgArchShmemCallbacks = {
+ .request_fn = PgArchShmemRequest,
+ .init_fn = PgArchShmemInit,
+};
- return size;
+/* Register shared memory space needed by the archiver */
+static void
+PgArchShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "Archiver Data",
+ .size = sizeof(PgArchData),
+ .ptr = (void **) &PgArch,
+ );
}
-/* Allocate and initialize archiver-related shared memory */
-void
-PgArchShmemInit(void)
+/* Initialize archiver-related shared memory */
+static void
+PgArchShmemInit(void *arg)
{
- bool found;
-
- PgArch = (PgArchData *)
- ShmemInitStruct("Archiver Data", PgArchShmemSize(), &found);
-
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(PgArch, 0, PgArchShmemSize());
- PgArch->pgprocno = INVALID_PROC_NUMBER;
- pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
- }
+ MemSet(PgArch, 0, sizeof(PgArchData));
+ PgArch->pgprocno = INVALID_PROC_NUMBER;
+ pg_atomic_init_u32(&PgArch->force_dir_scan, 0);
}
/*
diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c
index a37b3018abf..20960f5b633 100644
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -47,6 +47,7 @@
#include "storage/proc.h"
#include "storage/procsignal.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/wait_event.h"
@@ -109,6 +110,14 @@ typedef struct
/* Pointer to shared memory state. */
static WalSummarizerData *WalSummarizerCtl;
+static void WalSummarizerShmemRequest(void *arg);
+static void WalSummarizerShmemInit(void *arg);
+
+const ShmemCallbacks WalSummarizerShmemCallbacks = {
+ .request_fn = WalSummarizerShmemRequest,
+ .init_fn = WalSummarizerShmemInit,
+};
+
/*
* When we reach end of WAL and need to read more, we sleep for a number of
* milliseconds that is an integer multiple of MS_PER_SLEEP_QUANTUM. This is
@@ -168,43 +177,34 @@ static void summarizer_wait_for_wal(void);
static void MaybeRemoveOldWalSummaries(void);
/*
- * Amount of shared memory required for this module.
+ * Register shared memory space needed by this module.
*/
-Size
-WalSummarizerShmemSize(void)
+static void
+WalSummarizerShmemRequest(void *arg)
{
- return sizeof(WalSummarizerData);
+ ShmemRequestStruct(.name = "Wal Summarizer Ctl",
+ .size = sizeof(WalSummarizerData),
+ .ptr = (void **) &WalSummarizerCtl,
+ );
}
/*
- * Create or attach to shared memory segment for this module.
+ * Initialize shared memory for this module.
*/
-void
-WalSummarizerShmemInit(void)
+static void
+WalSummarizerShmemInit(void *arg)
{
- bool found;
-
- WalSummarizerCtl = (WalSummarizerData *)
- ShmemInitStruct("Wal Summarizer Ctl", WalSummarizerShmemSize(),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize.
- *
- * We're just filling in dummy values here -- the real initialization
- * will happen when GetOldestUnsummarizedLSN() is called for the first
- * time.
- */
- WalSummarizerCtl->initialized = false;
- WalSummarizerCtl->summarized_tli = 0;
- WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
- WalSummarizerCtl->lsn_is_exact = false;
- WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
- WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
- ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
- }
+ /*
+ * We're just filling in dummy values here -- the real initialization will
+ * happen when GetOldestUnsummarizedLSN() is called for the first time.
+ */
+ WalSummarizerCtl->initialized = false;
+ WalSummarizerCtl->summarized_tli = 0;
+ WalSummarizerCtl->summarized_lsn = InvalidXLogRecPtr;
+ WalSummarizerCtl->lsn_is_exact = false;
+ WalSummarizerCtl->summarizer_pgprocno = INVALID_PROC_NUMBER;
+ WalSummarizerCtl->pending_lsn = InvalidXLogRecPtr;
+ ConditionVariableInit(&WalSummarizerCtl->summary_file_cv);
}
/*
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 09964198550..9e75a3e04ee 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -38,6 +38,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -71,6 +72,14 @@ typedef struct LogicalRepCtxStruct
static LogicalRepCtxStruct *LogicalRepCtx;
+static void ApplyLauncherShmemRequest(void *arg);
+static void ApplyLauncherShmemInit(void *arg);
+
+const ShmemCallbacks ApplyLauncherShmemCallbacks = {
+ .request_fn = ApplyLauncherShmemRequest,
+ .init_fn = ApplyLauncherShmemInit,
+};
+
/* an entry in the last-start-times shared hash table */
typedef struct LauncherLastStartTimesEntry
{
@@ -972,11 +981,11 @@ logicalrep_pa_worker_count(Oid subid)
}
/*
- * ApplyLauncherShmemSize
- * Compute space needed for replication launcher shared memory
+ * ApplyLauncherShmemRequest
+ * Register shared memory space needed for replication launcher
*/
-Size
-ApplyLauncherShmemSize(void)
+static void
+ApplyLauncherShmemRequest(void *arg)
{
Size size;
@@ -987,7 +996,10 @@ ApplyLauncherShmemSize(void)
size = MAXALIGN(size);
size = add_size(size, mul_size(max_logical_replication_workers,
sizeof(LogicalRepWorker)));
- return size;
+ ShmemRequestStruct(.name = "Logical Replication Launcher Data",
+ .size = size,
+ .ptr = (void **) &LogicalRepCtx,
+ );
}
/*
@@ -1028,35 +1040,23 @@ ApplyLauncherRegister(void)
/*
* ApplyLauncherShmemInit
- * Allocate and initialize replication launcher shared memory
+ * Initialize replication launcher shared memory
*/
-void
-ApplyLauncherShmemInit(void)
+static void
+ApplyLauncherShmemInit(void *arg)
{
- bool found;
+ int slot;
- LogicalRepCtx = (LogicalRepCtxStruct *)
- ShmemInitStruct("Logical Replication Launcher Data",
- ApplyLauncherShmemSize(),
- &found);
+ LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
+ LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
- if (!found)
+ /* Initialize memory and spin locks for each worker slot. */
+ for (slot = 0; slot < max_logical_replication_workers; slot++)
{
- int slot;
-
- memset(LogicalRepCtx, 0, ApplyLauncherShmemSize());
-
- LogicalRepCtx->last_start_dsa = DSA_HANDLE_INVALID;
- LogicalRepCtx->last_start_dsh = DSHASH_HANDLE_INVALID;
+ LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
- /* Initialize memory and spin locks for each worker slot. */
- for (slot = 0; slot < max_logical_replication_workers; slot++)
- {
- LogicalRepWorker *worker = &LogicalRepCtx->workers[slot];
-
- memset(worker, 0, sizeof(LogicalRepWorker));
- SpinLockInit(&worker->relmutex);
- }
+ memset(worker, 0, sizeof(LogicalRepWorker));
+ SpinLockInit(&worker->relmutex);
}
}
diff --git a/src/backend/replication/logical/logicalctl.c b/src/backend/replication/logical/logicalctl.c
index 4e292951201..72f68ec58ef 100644
--- a/src/backend/replication/logical/logicalctl.c
+++ b/src/backend/replication/logical/logicalctl.c
@@ -72,6 +72,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "storage/procsignal.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
/*
@@ -98,6 +99,12 @@ typedef struct LogicalDecodingCtlData
static LogicalDecodingCtlData *LogicalDecodingCtl = NULL;
+static void LogicalDecodingCtlShmemRequest(void *arg);
+
+const ShmemCallbacks LogicalDecodingCtlShmemCallbacks = {
+ .request_fn = LogicalDecodingCtlShmemRequest,
+};
+
/*
* A process-local cache of LogicalDecodingCtl->xlog_logical_info. This is
* initialized at process startup, and updated when processing the process
@@ -120,23 +127,13 @@ static void update_xlog_logical_info(void);
static void abort_logical_decoding_activation(int code, Datum arg);
static void write_logical_decoding_status_update_record(bool status);
-Size
-LogicalDecodingCtlShmemSize(void)
-{
- return sizeof(LogicalDecodingCtlData);
-}
-
-void
-LogicalDecodingCtlShmemInit(void)
+static void
+LogicalDecodingCtlShmemRequest(void *arg)
{
- bool found;
-
- LogicalDecodingCtl = ShmemInitStruct("Logical decoding control",
- LogicalDecodingCtlShmemSize(),
- &found);
-
- if (!found)
- MemSet(LogicalDecodingCtl, 0, LogicalDecodingCtlShmemSize());
+ ShmemRequestStruct(.name = "Logical decoding control",
+ .size = sizeof(LogicalDecodingCtlData),
+ .ptr = (void **) &LogicalDecodingCtl,
+ );
}
/*
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 661d68ad653..372d77c475e 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -88,6 +88,7 @@
#include "storage/fd.h"
#include "storage/ipc.h"
#include "storage/lmgr.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/guc.h"
@@ -176,6 +177,16 @@ ReplOriginXactState replorigin_xact_state = {
*/
static ReplicationState *replication_states;
+static void ReplicationOriginShmemRequest(void *arg);
+static void ReplicationOriginShmemInit(void *arg);
+static void ReplicationOriginShmemAttach(void *arg);
+
+const ShmemCallbacks ReplicationOriginShmemCallbacks = {
+ .request_fn = ReplicationOriginShmemRequest,
+ .init_fn = ReplicationOriginShmemInit,
+ .attach_fn = ReplicationOriginShmemAttach,
+};
+
/*
* Actual shared memory block (replication_states[] is now part of this).
*/
@@ -539,50 +550,48 @@ replorigin_by_oid(ReplOriginId roident, bool missing_ok, char **roname)
* ---------------------------------------------------------------------------
*/
-Size
-ReplicationOriginShmemSize(void)
+static void
+ReplicationOriginShmemRequest(void *arg)
{
Size size = 0;
if (max_active_replication_origins == 0)
- return size;
+ return;
size = add_size(size, offsetof(ReplicationStateCtl, states));
-
size = add_size(size,
mul_size(max_active_replication_origins, sizeof(ReplicationState)));
- return size;
+ ShmemRequestStruct(.name = "ReplicationOriginState",
+ .size = size,
+ .ptr = (void **) &replication_states_ctl,
+ );
}
-void
-ReplicationOriginShmemInit(void)
+static void
+ReplicationOriginShmemInit(void *arg)
{
- bool found;
-
if (max_active_replication_origins == 0)
return;
- replication_states_ctl = (ReplicationStateCtl *)
- ShmemInitStruct("ReplicationOriginState",
- ReplicationOriginShmemSize(),
- &found);
replication_states = replication_states_ctl->states;
- if (!found)
- {
- int i;
+ replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
- MemSet(replication_states_ctl, 0, ReplicationOriginShmemSize());
+ for (int i = 0; i < max_active_replication_origins; i++)
+ {
+ LWLockInitialize(&replication_states[i].lock,
+ replication_states_ctl->tranche_id);
+ ConditionVariableInit(&replication_states[i].origin_cv);
+ }
+}
- replication_states_ctl->tranche_id = LWTRANCHE_REPLICATION_ORIGIN_STATE;
+static void
+ReplicationOriginShmemAttach(void *arg)
+{
+ if (max_active_replication_origins == 0)
+ return;
- for (i = 0; i < max_active_replication_origins; i++)
- {
- LWLockInitialize(&replication_states[i].lock,
- replication_states_ctl->tranche_id);
- ConditionVariableInit(&replication_states[i].origin_cv);
- }
- }
+ replication_states = replication_states_ctl->states;
}
/* ---------------------------------------------------------------------------
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index e75db69e3f6..d615ff8a81c 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -73,6 +73,7 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
@@ -118,6 +119,14 @@ typedef struct SlotSyncCtxStruct
static SlotSyncCtxStruct *SlotSyncCtx = NULL;
+static void SlotSyncShmemRequest(void *arg);
+static void SlotSyncShmemInit(void *arg);
+
+const ShmemCallbacks SlotSyncShmemCallbacks = {
+ .request_fn = SlotSyncShmemRequest,
+ .init_fn = SlotSyncShmemInit,
+};
+
/* GUC variable */
bool sync_replication_slots = false;
@@ -1828,32 +1837,26 @@ IsSyncingReplicationSlots(void)
}
/*
- * Amount of shared memory required for slot synchronization.
+ * Register shared memory space needed for slot synchronization.
*/
-Size
-SlotSyncShmemSize(void)
+static void
+SlotSyncShmemRequest(void *arg)
{
- return sizeof(SlotSyncCtxStruct);
+ ShmemRequestStruct(.name = "Slot Sync Data",
+ .size = sizeof(SlotSyncCtxStruct),
+ .ptr = (void **) &SlotSyncCtx,
+ );
}
/*
- * Allocate and initialize the shared memory of slot synchronization.
+ * Initialize shared memory for slot synchronization.
*/
-void
-SlotSyncShmemInit(void)
+static void
+SlotSyncShmemInit(void *arg)
{
- Size size = SlotSyncShmemSize();
- bool found;
-
- SlotSyncCtx = (SlotSyncCtxStruct *)
- ShmemInitStruct("Slot Sync Data", size, &found);
-
- if (!found)
- {
- memset(SlotSyncCtx, 0, size);
- SlotSyncCtx->pid = InvalidPid;
- SpinLockInit(&SlotSyncCtx->mutex);
- }
+ memset(SlotSyncCtx, 0, sizeof(SlotSyncCtxStruct));
+ SlotSyncCtx->pid = InvalidPid;
+ SpinLockInit(&SlotSyncCtx->mutex);
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index a9092fc2382..21a213a0ebf 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/ipc.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "utils/builtins.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
@@ -145,6 +146,14 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
+static void ReplicationSlotsShmemRequest(void *arg);
+static void ReplicationSlotsShmemInit(void *arg);
+
+const ShmemCallbacks ReplicationSlotsShmemCallbacks = {
+ .request_fn = ReplicationSlotsShmemRequest,
+ .init_fn = ReplicationSlotsShmemInit,
+};
+
/* My backend's replication slot in the shared memory array */
ReplicationSlot *MyReplicationSlot = NULL;
@@ -183,56 +192,41 @@ static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
/*
- * Report shared-memory space needed by ReplicationSlotsShmemInit.
+ * Register shared memory space needed for replication slots.
*/
-Size
-ReplicationSlotsShmemSize(void)
+static void
+ReplicationSlotsShmemRequest(void *arg)
{
- Size size = 0;
+ Size size;
if (max_replication_slots == 0)
- return size;
+ return;
size = offsetof(ReplicationSlotCtlData, replication_slots);
size = add_size(size,
mul_size(max_replication_slots, sizeof(ReplicationSlot)));
-
- return size;
+ ShmemRequestStruct(.name = "ReplicationSlot Ctl",
+ .size = size,
+ .ptr = (void **) &ReplicationSlotCtl,
+ );
}
/*
- * Allocate and initialize shared memory for replication slots.
+ * Initialize shared memory for replication slots.
*/
-void
-ReplicationSlotsShmemInit(void)
+static void
+ReplicationSlotsShmemInit(void *arg)
{
- bool found;
-
- if (max_replication_slots == 0)
- return;
-
- ReplicationSlotCtl = (ReplicationSlotCtlData *)
- ShmemInitStruct("ReplicationSlot Ctl", ReplicationSlotsShmemSize(),
- &found);
-
- if (!found)
+ for (int i = 0; i < max_replication_slots; i++)
{
- int i;
+ ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
- /* First time through, so initialize */
- MemSet(ReplicationSlotCtl, 0, ReplicationSlotsShmemSize());
-
- for (i = 0; i < max_replication_slots; i++)
- {
- ReplicationSlot *slot = &ReplicationSlotCtl->replication_slots[i];
-
- /* everything else is zeroed by the memset above */
- slot->active_proc = INVALID_PROC_NUMBER;
- SpinLockInit(&slot->mutex);
- LWLockInitialize(&slot->io_in_progress_lock,
- LWTRANCHE_REPLICATION_SLOT_IO);
- ConditionVariableInit(&slot->active_cv);
- }
+ /* everything else is expected to be zeroed already */
+ slot->active_proc = INVALID_PROC_NUMBER;
+ SpinLockInit(&slot->mutex);
+ LWLockInitialize(&slot->io_in_progress_lock,
+ LWTRANCHE_REPLICATION_SLOT_IO);
+ ConditionVariableInit(&slot->active_cv);
}
}
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 45b9d4f09f2..4e03e721872 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -29,47 +29,46 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/timestamp.h"
#include "utils/wait_event.h"
WalRcvData *WalRcv = NULL;
+static void WalRcvShmemRequest(void *arg);
+static void WalRcvShmemInit(void *arg);
+
+const ShmemCallbacks WalRcvShmemCallbacks = {
+ .request_fn = WalRcvShmemRequest,
+ .init_fn = WalRcvShmemInit,
+};
+
/*
* How long to wait for walreceiver to start up after requesting
* postmaster to launch it. In seconds.
*/
#define WALRCV_STARTUP_TIMEOUT 10
-/* Report shared memory space needed by WalRcvShmemInit */
-Size
-WalRcvShmemSize(void)
+/* Register shared memory space needed by walreceiver */
+static void
+WalRcvShmemRequest(void *arg)
{
- Size size = 0;
-
- size = add_size(size, sizeof(WalRcvData));
-
- return size;
+ ShmemRequestStruct(.name = "Wal Receiver Ctl",
+ .size = sizeof(WalRcvData),
+ .ptr = (void **) &WalRcv,
+ );
}
-/* Allocate and initialize walreceiver-related shared memory */
-void
-WalRcvShmemInit(void)
+/* Initialize walreceiver-related shared memory */
+static void
+WalRcvShmemInit(void *arg)
{
- bool found;
-
- WalRcv = (WalRcvData *)
- ShmemInitStruct("Wal Receiver Ctl", WalRcvShmemSize(), &found);
-
- if (!found)
- {
- /* First time through, so initialize */
- MemSet(WalRcv, 0, WalRcvShmemSize());
- WalRcv->walRcvState = WALRCV_STOPPED;
- ConditionVariableInit(&WalRcv->walRcvStoppedCV);
- SpinLockInit(&WalRcv->mutex);
- pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
- WalRcv->procno = INVALID_PROC_NUMBER;
- }
+ MemSet(WalRcv, 0, sizeof(WalRcvData));
+ WalRcv->walRcvState = WALRCV_STOPPED;
+ ConditionVariableInit(&WalRcv->walRcvStoppedCV);
+ SpinLockInit(&WalRcv->mutex);
+ pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
+ WalRcv->procno = INVALID_PROC_NUMBER;
}
/* Is walreceiver running (or starting up)? */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2bb3f34dc6d..ec39942bfc1 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -86,6 +86,7 @@
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/subsystems.h"
#include "tcop/dest.h"
#include "tcop/tcopprot.h"
#include "utils/acl.h"
@@ -117,6 +118,14 @@
/* Array of WalSnds in shared memory */
WalSndCtlData *WalSndCtl = NULL;
+static void WalSndShmemRequest(void *arg);
+static void WalSndShmemInit(void *arg);
+
+const ShmemCallbacks WalSndShmemCallbacks = {
+ .request_fn = WalSndShmemRequest,
+ .init_fn = WalSndShmemInit,
+};
+
/* My slot in the shared memory array */
WalSnd *MyWalSnd = NULL;
@@ -3765,47 +3774,37 @@ WalSndSignals(void)
pqsignal(SIGCHLD, SIG_DFL);
}
-/* Report shared-memory space needed by WalSndShmemInit */
-Size
-WalSndShmemSize(void)
+/* Register shared-memory space needed by walsender */
+static void
+WalSndShmemRequest(void *arg)
{
- Size size = 0;
+ Size size;
size = offsetof(WalSndCtlData, walsnds);
size = add_size(size, mul_size(max_wal_senders, sizeof(WalSnd)));
-
- return size;
+ ShmemRequestStruct(.name = "Wal Sender Ctl",
+ .size = size,
+ .ptr = (void **) &WalSndCtl,
+ );
}
-/* Allocate and initialize walsender-related shared memory */
-void
-WalSndShmemInit(void)
+/* Initialize walsender-related shared memory */
+static void
+WalSndShmemInit(void *arg)
{
- bool found;
- int i;
+ for (int i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
+ dlist_init(&(WalSndCtl->SyncRepQueue[i]));
- WalSndCtl = (WalSndCtlData *)
- ShmemInitStruct("Wal Sender Ctl", WalSndShmemSize(), &found);
-
- if (!found)
+ for (int i = 0; i < max_wal_senders; i++)
{
- /* First time through, so initialize */
- MemSet(WalSndCtl, 0, WalSndShmemSize());
-
- for (i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
- dlist_init(&(WalSndCtl->SyncRepQueue[i]));
-
- for (i = 0; i < max_wal_senders; i++)
- {
- WalSnd *walsnd = &WalSndCtl->walsnds[i];
-
- SpinLockInit(&walsnd->mutex);
- }
+ WalSnd *walsnd = &WalSndCtl->walsnds[i];
- ConditionVariableInit(&WalSndCtl->wal_flush_cv);
- ConditionVariableInit(&WalSndCtl->wal_replay_cv);
- ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
+ SpinLockInit(&walsnd->mutex);
}
+
+ ConditionVariableInit(&WalSndCtl->wal_flush_cv);
+ ConditionVariableInit(&WalSndCtl->wal_replay_cv);
+ ConditionVariableInit(&WalSndCtl->wal_confirm_rcv_cv);
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index f64c1d59fa3..bf6b81e621b 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -14,41 +14,16 @@
*/
#include "postgres.h"
-#include "access/clog.h"
-#include "access/commit_ts.h"
-#include "access/multixact.h"
-#include "access/nbtree.h"
-#include "access/subtrans.h"
-#include "access/syncscan.h"
-#include "access/twophase.h"
-#include "access/xlogprefetcher.h"
-#include "access/xlogrecovery.h"
-#include "access/xlogwait.h"
-#include "commands/async.h"
#include "miscadmin.h"
#include "pgstat.h"
-#include "postmaster/autovacuum.h"
-#include "postmaster/bgworker_internals.h"
-#include "postmaster/bgwriter.h"
-#include "postmaster/datachecksum_state.h"
-#include "postmaster/walsummarizer.h"
-#include "replication/logicallauncher.h"
-#include "replication/origin.h"
-#include "replication/slot.h"
-#include "replication/slotsync.h"
-#include "replication/walreceiver.h"
-#include "replication/walsender.h"
-#include "storage/aio_subsys.h"
#include "storage/dsm.h"
#include "storage/ipc.h"
+#include "storage/lock.h"
#include "storage/pg_shmem.h"
-#include "storage/predicate.h"
#include "storage/proc.h"
#include "storage/shmem_internal.h"
#include "storage/subsystems.h"
#include "utils/guc.h"
-#include "utils/injection_point.h"
-#include "utils/wait_event.h"
/* GUCs */
int shared_memory_type = DEFAULT_SHARED_MEMORY_TYPE;
@@ -57,8 +32,6 @@ shmem_startup_hook_type shmem_startup_hook = NULL;
static Size total_addin_request = 0;
-static void CreateOrAttachShmemStructs(void);
-
/*
* RequestAddinShmemSpace
* Request that extra shmem space be allocated for use by
@@ -97,33 +70,6 @@ CalculateShmemSize(void)
size = 100000;
size = add_size(size, ShmemGetRequestedSize());
- /* legacy subsystems */
- size = add_size(size, LockManagerShmemSize());
- size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, XLOGShmemSize());
- size = add_size(size, XLogRecoveryShmemSize());
- size = add_size(size, TwoPhaseShmemSize());
- size = add_size(size, BackgroundWorkerShmemSize());
- size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, CheckpointerShmemSize());
- size = add_size(size, AutoVacuumShmemSize());
- size = add_size(size, ReplicationSlotsShmemSize());
- size = add_size(size, ReplicationOriginShmemSize());
- size = add_size(size, WalSndShmemSize());
- size = add_size(size, WalRcvShmemSize());
- size = add_size(size, WalSummarizerShmemSize());
- size = add_size(size, PgArchShmemSize());
- size = add_size(size, ApplyLauncherShmemSize());
- size = add_size(size, BTreeShmemSize());
- size = add_size(size, SyncScanShmemSize());
- size = add_size(size, StatsShmemSize());
- size = add_size(size, WaitEventCustomShmemSize());
- size = add_size(size, InjectionPointShmemSize());
- size = add_size(size, SlotSyncShmemSize());
- size = add_size(size, WaitLSNShmemSize());
- size = add_size(size, LogicalDecodingCtlShmemSize());
- size = add_size(size, DataChecksumsShmemSize());
-
/* include additional requested shmem from preload libraries */
size = add_size(size, total_addin_request);
@@ -157,7 +103,6 @@ AttachSharedMemoryStructs(void)
/* Establish pointers to all shared memory areas in this backend */
ShmemAttachRequested();
- CreateOrAttachShmemStructs();
/*
* Now give loadable modules a chance to set up their shmem allocations
@@ -204,9 +149,6 @@ CreateSharedMemoryAndSemaphores(void)
/* Initialize all shmem areas */
ShmemInitRequested();
- /* Initialize legacy subsystems */
- CreateOrAttachShmemStructs();
-
/* Initialize dynamic shared memory facilities. */
dsm_postmaster_startup(shim);
@@ -237,70 +179,6 @@ RegisterBuiltinShmemCallbacks(void)
#undef PG_SHMEM_SUBSYSTEM
}
-/*
- * Initialize various subsystems, setting up their data structures in
- * shared memory.
- *
- * This is called by the postmaster or by a standalone backend.
- * It is also called by a backend forked from the postmaster in the
- * EXEC_BACKEND case. In the latter case, the shared memory segment
- * already exists and has been physically attached to, but we have to
- * initialize pointers in local memory that reference the shared structures,
- * because we didn't inherit the correct pointer values from the postmaster
- * as we do in the fork() scenario. The easiest way to do that is to run
- * through the same code as before. (Note that the called routines mostly
- * check IsUnderPostmaster, rather than EXEC_BACKEND, to detect this case.
- * This is a bit code-wasteful and could be cleaned up.)
- */
-static void
-CreateOrAttachShmemStructs(void)
-{
- /*
- * Set up xlog, clog, and buffers
- */
- XLOGShmemInit();
- XLogPrefetchShmemInit();
- XLogRecoveryShmemInit();
-
- /*
- * Set up lock manager
- */
- LockManagerShmemInit();
-
- /*
- * Set up process table
- */
- BackendStatusShmemInit();
- TwoPhaseShmemInit();
- BackgroundWorkerShmemInit();
-
- /*
- * Set up interprocess signaling mechanisms
- */
- CheckpointerShmemInit();
- AutoVacuumShmemInit();
- ReplicationSlotsShmemInit();
- ReplicationOriginShmemInit();
- WalSndShmemInit();
- WalRcvShmemInit();
- WalSummarizerShmemInit();
- PgArchShmemInit();
- ApplyLauncherShmemInit();
- SlotSyncShmemInit();
- DataChecksumsShmemInit();
-
- /*
- * Set up other modules that need some shared memory space
- */
- BTreeShmemInit();
- SyncScanShmemInit();
- StatsShmemInit();
- WaitEventCustomShmemInit();
- InjectionPointShmemInit();
- WaitLSNShmemInit();
- LogicalDecodingCtlShmemInit();
-}
-
/*
* InitializeShmemGUCs
*
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 798c453ab38..c221fe96889 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -43,8 +43,10 @@
#include "storage/lmgr.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/shmem.h"
#include "storage/spin.h"
#include "storage/standby.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/resowner.h"
@@ -312,6 +314,14 @@ typedef struct
static volatile FastPathStrongRelationLockData *FastPathStrongRelationLocks;
+static void LockManagerShmemRequest(void *arg);
+static void LockManagerShmemInit(void *arg);
+
+const ShmemCallbacks LockManagerShmemCallbacks = {
+ .request_fn = LockManagerShmemRequest,
+ .init_fn = LockManagerShmemInit,
+};
+
/*
* Pointers to hash tables containing lock state
@@ -432,21 +442,15 @@ static void GetSingleProcBlockerStatusData(PGPROC *blocked_proc,
/*
- * Initialize the lock manager's shmem data structures.
+ * Register the lock manager's shmem data structures.
*
- * This is called from CreateSharedMemoryAndSemaphores(), which see for more
- * comments. In the normal postmaster case, the shared hash tables are
- * created here, and backends inherit pointers to them via fork(). In the
- * EXEC_BACKEND case, each backend re-executes this code to obtain pointers to
- * the already existing shared hash tables. In either case, each backend must
- * also call InitLockManagerAccess() to create the locallock hash table.
+ * In addition to this, each backend must also call InitLockManagerAccess() to
+ * create the locallock hash table.
*/
-void
-LockManagerShmemInit(void)
+static void
+LockManagerShmemRequest(void *arg)
{
- HASHCTL info;
int64 max_table_size;
- bool found;
/*
* Compute sizes for lock hashtables. Note that these calculations must
@@ -455,45 +459,41 @@ LockManagerShmemInit(void)
max_table_size = NLOCKENTS();
/*
- * Allocate hash table for LOCK structs. This stores per-locked-object
+ * Hash table for LOCK structs. This stores per-locked-object
* information.
*/
- info.keysize = sizeof(LOCKTAG);
- info.entrysize = sizeof(LOCK);
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodLockHash = ShmemInitHash("LOCK hash",
- max_table_size,
- &info,
- HASH_ELEM | HASH_BLOBS |
- HASH_PARTITION | HASH_FIXED_SIZE);
+ ShmemRequestHash(.name = "LOCK hash",
+ .nelems = max_table_size,
+ .ptr = &LockMethodLockHash,
+ .hash_info.keysize = sizeof(LOCKTAG),
+ .hash_info.entrysize = sizeof(LOCK),
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_BLOBS | HASH_PARTITION,
+ );
/* Assume an average of 2 holders per lock */
max_table_size *= 2;
- /*
- * Allocate hash table for PROCLOCK structs. This stores
- * per-lock-per-holder information.
- */
- info.keysize = sizeof(PROCLOCKTAG);
- info.entrysize = sizeof(PROCLOCK);
- info.hash = proclock_hash;
- info.num_partitions = NUM_LOCK_PARTITIONS;
-
- LockMethodProcLockHash = ShmemInitHash("PROCLOCK hash",
- max_table_size,
- &info,
- HASH_ELEM | HASH_FUNCTION |
- HASH_FIXED_SIZE | HASH_PARTITION);
+ ShmemRequestHash(.name = "PROCLOCK hash",
+ .nelems = max_table_size,
+ .ptr = &LockMethodProcLockHash,
+ .hash_info.keysize = sizeof(PROCLOCKTAG),
+ .hash_info.entrysize = sizeof(PROCLOCK),
+ .hash_info.hash = proclock_hash,
+ .hash_info.num_partitions = NUM_LOCK_PARTITIONS,
+ .hash_flags = HASH_ELEM | HASH_FUNCTION | HASH_PARTITION,
+ );
+
+ ShmemRequestStruct(.name = "Fast Path Strong Relation Lock Data",
+ .size = sizeof(FastPathStrongRelationLockData),
+ .ptr = (void **) (void *) &FastPathStrongRelationLocks,
+ );
+}
- /*
- * Allocate fast-path structures.
- */
- FastPathStrongRelationLocks =
- ShmemInitStruct("Fast Path Strong Relation Lock Data",
- sizeof(FastPathStrongRelationLockData), &found);
- if (!found)
- SpinLockInit(&FastPathStrongRelationLocks->mutex);
+static void
+LockManagerShmemInit(void *arg)
+{
+ SpinLockInit(&FastPathStrongRelationLocks->mutex);
}
/*
@@ -3758,29 +3758,6 @@ PostPrepare_Locks(FullTransactionId fxid)
}
-/*
- * Estimate shared-memory space used for lock tables
- */
-Size
-LockManagerShmemSize(void)
-{
- Size size = 0;
- long max_table_size;
-
- /* lock hash table */
- max_table_size = NLOCKENTS();
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(LOCK)));
-
- /* proclock hash table */
- max_table_size *= 2;
- size = add_size(size, hash_estimate_size(max_table_size, sizeof(PROCLOCK)));
-
- /* fast-path structures */
- size = add_size(size, sizeof(FastPathStrongRelationLockData));
-
- return size;
-}
-
/*
* GetLockStatusData - Return a summary of the lock manager's internal
* status, for use in a user-level reporting function.
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index cd087129469..d685fc5cd87 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -19,6 +19,8 @@
#include "storage/ipc.h"
#include "storage/proc.h" /* for MyProc */
#include "storage/procarray.h"
+#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/ascii.h"
#include "utils/guc.h" /* for application_name */
#include "utils/memutils.h"
@@ -73,133 +75,97 @@ static void pgstat_beshutdown_hook(int code, Datum arg);
static void pgstat_read_current_status(void);
static void pgstat_setup_backend_status_context(void);
+static void BackendStatusShmemRequest(void *arg);
+static void BackendStatusShmemInit(void *arg);
+static void BackendStatusShmemAttach(void *arg);
+
+const ShmemCallbacks BackendStatusShmemCallbacks = {
+ .request_fn = BackendStatusShmemRequest,
+ .init_fn = BackendStatusShmemInit,
+ .attach_fn = BackendStatusShmemAttach,
+};
/*
- * Report shared-memory space needed by BackendStatusShmemInit.
+ * Register shared memory needs for backend status reporting.
*/
-Size
-BackendStatusShmemSize(void)
+static void
+BackendStatusShmemRequest(void *arg)
{
- Size size;
-
- /* BackendStatusArray: */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- /* BackendAppnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendClientHostnameBuffer: */
- size = add_size(size,
- mul_size(NAMEDATALEN, NumBackendStatSlots));
- /* BackendActivityBuffer: */
- size = add_size(size,
- mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
+ ShmemRequestStruct(.name = "Backend Status Array",
+ .size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendStatusArray,
+ );
+
+ ShmemRequestStruct(.name = "Backend Application Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendAppnameBuffer,
+ );
+
+ ShmemRequestStruct(.name = "Backend Client Host Name Buffer",
+ .size = mul_size(NAMEDATALEN, NumBackendStatSlots),
+ .ptr = (void **) &BackendClientHostnameBuffer,
+ );
+
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+ ShmemRequestStruct(.name = "Backend Activity Buffer",
+ .size = BackendActivityBufferSize,
+ .ptr = (void **) &BackendActivityBuffer,
+ );
+
#ifdef USE_SSL
- /* BackendSslStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots));
+ ShmemRequestStruct(.name = "Backend SSL Status Buffer",
+ .size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendSslStatusBuffer,
+ );
#endif
+
#ifdef ENABLE_GSS
- /* BackendGssStatusBuffer: */
- size = add_size(size,
- mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots));
+ ShmemRequestStruct(.name = "Backend GSS Status Buffer",
+ .size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots),
+ .ptr = (void **) &BackendGssStatusBuffer,
+ );
#endif
- return size;
}
/*
* Initialize the shared status array and several string buffers
* during postmaster startup.
*/
-void
-BackendStatusShmemInit(void)
+static void
+BackendStatusShmemInit(void *arg)
{
- Size size;
- bool found;
int i;
char *buffer;
- /* Create or attach to the shared array */
- size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
- BackendStatusArray = (PgBackendStatus *)
- ShmemInitStruct("Backend Status Array", size, &found);
-
- if (!found)
+ /* Initialize st_appname pointers. */
+ buffer = BackendAppnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- /*
- * We're the first - initialize.
- */
- MemSet(BackendStatusArray, 0, size);
- }
-
- /* Create or attach to the shared appname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendAppnameBuffer = (char *)
- ShmemInitStruct("Backend Application Name Buffer", size, &found);
-
- if (!found)
- {
- MemSet(BackendAppnameBuffer, 0, size);
-
- /* Initialize st_appname pointers. */
- buffer = BackendAppnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_appname = buffer;
- buffer += NAMEDATALEN;
- }
+ BackendStatusArray[i].st_appname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared client hostname buffer */
- size = mul_size(NAMEDATALEN, NumBackendStatSlots);
- BackendClientHostnameBuffer = (char *)
- ShmemInitStruct("Backend Client Host Name Buffer", size, &found);
-
- if (!found)
+ /* Initialize st_clienthostname pointers. */
+ buffer = BackendClientHostnameBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendClientHostnameBuffer, 0, size);
-
- /* Initialize st_clienthostname pointers. */
- buffer = BackendClientHostnameBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_clienthostname = buffer;
- buffer += NAMEDATALEN;
- }
+ BackendStatusArray[i].st_clienthostname = buffer;
+ buffer += NAMEDATALEN;
}
- /* Create or attach to the shared activity buffer */
- BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
- NumBackendStatSlots);
- BackendActivityBuffer = (char *)
- ShmemInitStruct("Backend Activity Buffer",
- BackendActivityBufferSize,
- &found);
-
- if (!found)
+ /* Initialize st_activity pointers. */
+ buffer = BackendActivityBuffer;
+ for (i = 0; i < NumBackendStatSlots; i++)
{
- MemSet(BackendActivityBuffer, 0, BackendActivityBufferSize);
-
- /* Initialize st_activity pointers. */
- buffer = BackendActivityBuffer;
- for (i = 0; i < NumBackendStatSlots; i++)
- {
- BackendStatusArray[i].st_activity_raw = buffer;
- buffer += pgstat_track_activity_query_size;
- }
+ BackendStatusArray[i].st_activity_raw = buffer;
+ buffer += pgstat_track_activity_query_size;
}
#ifdef USE_SSL
- /* Create or attach to the shared SSL status buffer */
- size = mul_size(sizeof(PgBackendSSLStatus), NumBackendStatSlots);
- BackendSslStatusBuffer = (PgBackendSSLStatus *)
- ShmemInitStruct("Backend SSL Status Buffer", size, &found);
-
- if (!found)
{
PgBackendSSLStatus *ptr;
- MemSet(BackendSslStatusBuffer, 0, size);
-
/* Initialize st_sslstatus pointers. */
ptr = BackendSslStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -211,17 +177,9 @@ BackendStatusShmemInit(void)
#endif
#ifdef ENABLE_GSS
- /* Create or attach to the shared GSSAPI status buffer */
- size = mul_size(sizeof(PgBackendGSSStatus), NumBackendStatSlots);
- BackendGssStatusBuffer = (PgBackendGSSStatus *)
- ShmemInitStruct("Backend GSS Status Buffer", size, &found);
-
- if (!found)
{
PgBackendGSSStatus *ptr;
- MemSet(BackendGssStatusBuffer, 0, size);
-
/* Initialize st_gssstatus pointers. */
ptr = BackendGssStatusBuffer;
for (i = 0; i < NumBackendStatSlots; i++)
@@ -233,6 +191,13 @@ BackendStatusShmemInit(void)
#endif
}
+static void
+BackendStatusShmemAttach(void *arg)
+{
+ BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+}
+
/*
* Initialize pgstats backend activity state, and set up our on-proc-exit
* hook. Called from InitPostgres and AuxiliaryProcessMain. MyProcNumber must
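Reader's note on the backend_status.c hunk above: once the allocations are requested elsewhere, the init callback only has to point each status slot at its own slice of a contiguous buffer. A minimal standalone sketch of that slicing, assuming hypothetical stand-ins (SlotStatus, NAME_LEN, assign_name_slices) and ordinary heap memory instead of shared memory; it is not the patch's code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NAME_LEN 64				/* hypothetical stand-in for NAMEDATALEN */

typedef struct SlotStatus		/* hypothetical stand-in for PgBackendStatus */
{
	char	   *appname;		/* points into one shared name buffer */
} SlotStatus;

/*
 * Give each slot its own NAME_LEN-byte window of one contiguous buffer.
 * "buffer" must hold at least nslots * NAME_LEN bytes.
 */
static void
assign_name_slices(SlotStatus *slots, char *buffer, int nslots)
{
	for (int i = 0; i < nslots; i++)
	{
		slots[i].appname = buffer;
		buffer += NAME_LEN;
	}
}

/* Returns 1 if every slot i points at offset i * NAME_LEN, else 0. */
static int
check_name_slices(int nslots)
{
	SlotStatus *slots = calloc((size_t) nslots, sizeof(SlotStatus));
	char	   *buffer = calloc((size_t) nslots, NAME_LEN);
	int			ok = (slots != NULL && buffer != NULL);

	if (ok)
	{
		assign_name_slices(slots, buffer, nslots);
		for (int i = 0; i < nslots; i++)
		{
			if (slots[i].appname != buffer + (size_t) i * NAME_LEN)
				ok = 0;
		}
	}
	free(slots);
	free(buffer);
	return ok;
}
```

Calling check_name_slices(8) exercises the same loop shape as the st_appname, st_clienthostname and st_activity_raw loops in the hunk, with a fixed stride per slot.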
diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c
index 33fbdca9609..955faf5ebc7 100644
--- a/src/backend/utils/activity/pgstat_shmem.c
+++ b/src/backend/utils/activity/pgstat_shmem.c
@@ -14,6 +14,7 @@
#include "pgstat.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
@@ -57,6 +58,13 @@ static void pgstat_release_matching_entry_refs(bool discard_pending, ReleaseMatc
static void pgstat_setup_memcxt(void);
+static void StatsShmemRequest(void *arg);
+static void StatsShmemInit(void *arg);
+
+const ShmemCallbacks StatsShmemCallbacks = {
+ .request_fn = StatsShmemRequest,
+ .init_fn = StatsShmemInit,
+};
/* parameter for the shared hash */
static const dshash_parameters dsh_params = {
@@ -123,7 +131,7 @@ pgstat_dsa_init_size(void)
/*
* Compute shared memory space needed for cumulative statistics
*/
-Size
+static Size
StatsShmemSize(void)
{
Size sz;
@@ -149,102 +157,98 @@ StatsShmemSize(void)
return sz;
}
+/*
+ * Register shared memory area for cumulative statistics
+ */
+static void
+StatsShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "Shared Memory Stats",
+ .size = StatsShmemSize(),
+ .ptr = (void **) &pgStatLocal.shmem,
+ );
+}
+
/*
* Initialize cumulative statistics system during startup
*/
-void
-StatsShmemInit(void)
+static void
+StatsShmemInit(void *arg)
{
- bool found;
- Size sz;
+ dsa_area *dsa;
+ dshash_table *dsh;
+ PgStat_ShmemControl *ctl = pgStatLocal.shmem;
+ char *p = (char *) ctl;
- sz = StatsShmemSize();
- pgStatLocal.shmem = (PgStat_ShmemControl *)
- ShmemInitStruct("Shared Memory Stats", sz, &found);
+ /* the allocation of pgStatLocal.shmem itself */
+ p += MAXALIGN(sizeof(PgStat_ShmemControl));
- if (!IsUnderPostmaster)
- {
- dsa_area *dsa;
- dshash_table *dsh;
- PgStat_ShmemControl *ctl = pgStatLocal.shmem;
- char *p = (char *) ctl;
+ /*
+ * Create a small dsa allocation in plain shared memory. This is required
+ * because postmaster cannot use dsm segments. It also provides a small
+ * efficiency win.
+ */
+ ctl->raw_dsa_area = p;
+ dsa = dsa_create_in_place(ctl->raw_dsa_area,
+ pgstat_dsa_init_size(),
+ LWTRANCHE_PGSTATS_DSA, NULL);
+ dsa_pin(dsa);
- Assert(!found);
+ /*
+ * To ensure dshash is created in "plain" shared memory, temporarily limit
+ * size of dsa to the initial size of the dsa.
+ */
+ dsa_set_size_limit(dsa, pgstat_dsa_init_size());
- /* the allocation of pgStatLocal.shmem itself */
- p += MAXALIGN(sizeof(PgStat_ShmemControl));
+ /*
+ * With the limit in place, create the dshash table. XXX: It'd be nice if
+ * there were dshash_create_in_place().
+ */
+ dsh = dshash_create(dsa, &dsh_params, NULL);
+ ctl->hash_handle = dshash_get_hash_table_handle(dsh);
- /*
- * Create a small dsa allocation in plain shared memory. This is
- * required because postmaster cannot use dsm segments. It also
- * provides a small efficiency win.
- */
- ctl->raw_dsa_area = p;
- dsa = dsa_create_in_place(ctl->raw_dsa_area,
- pgstat_dsa_init_size(),
- LWTRANCHE_PGSTATS_DSA, NULL);
- dsa_pin(dsa);
+ /* lift limit set above */
+ dsa_set_size_limit(dsa, -1);
- /*
- * To ensure dshash is created in "plain" shared memory, temporarily
- * limit size of dsa to the initial size of the dsa.
- */
- dsa_set_size_limit(dsa, pgstat_dsa_init_size());
+ /*
+ * Postmaster will never access these again, thus free the local
+ * dsa/dshash references.
+ */
+ dshash_detach(dsh);
+ dsa_detach(dsa);
- /*
- * With the limit in place, create the dshash table. XXX: It'd be nice
- * if there were dshash_create_in_place().
- */
- dsh = dshash_create(dsa, &dsh_params, NULL);
- ctl->hash_handle = dshash_get_hash_table_handle(dsh);
+ pg_atomic_init_u64(&ctl->gc_request_count, 1);
- /* lift limit set above */
- dsa_set_size_limit(dsa, -1);
+ /* Do the per-kind initialization */
+ for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ {
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+ char *ptr;
- /*
- * Postmaster will never access these again, thus free the local
- * dsa/dshash references.
- */
- dshash_detach(dsh);
- dsa_detach(dsa);
+ if (!kind_info)
+ continue;
- pg_atomic_init_u64(&ctl->gc_request_count, 1);
+ /* initialize entry count tracking */
+ if (kind_info->track_entry_count)
+ pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
- /* Do the per-kind initialization */
- for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ /* initialize fixed-numbered stats */
+ if (kind_info->fixed_amount)
{
- const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
- char *ptr;
-
- if (!kind_info)
- continue;
-
- /* initialize entry count tracking */
- if (kind_info->track_entry_count)
- pg_atomic_init_u64(&ctl->entry_counts[kind - 1], 0);
-
- /* initialize fixed-numbered stats */
- if (kind_info->fixed_amount)
+ if (pgstat_is_kind_builtin(kind))
+ ptr = ((char *) ctl) + kind_info->shared_ctl_off;
+ else
{
- if (pgstat_is_kind_builtin(kind))
- ptr = ((char *) ctl) + kind_info->shared_ctl_off;
- else
- {
- int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
-
- Assert(kind_info->shared_size != 0);
- ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
- ptr = ctl->custom_data[idx];
- }
-
- kind_info->init_shmem_cb(ptr);
+ int idx = kind - PGSTAT_KIND_CUSTOM_MIN;
+
+ Assert(kind_info->shared_size != 0);
+ ctl->custom_data[idx] = ShmemAlloc(kind_info->shared_size);
+ ptr = ctl->custom_data[idx];
}
+
+ kind_info->init_shmem_cb(ptr);
}
}
- else
- {
- Assert(found);
- }
}
void
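Reader's note on the pgstat_shmem.c hunk above: the old "if (!IsUnderPostmaster) ... else Assert(found)" branching goes away because sizing and initialization are split into separate callbacks. A rough standalone model of that two-phase flow, assuming hypothetical names throughout (DemoCallbacks, demo_request_struct, demo_startup) and calloc() standing in for the shared memory segment; the real ShmemCallbacks machinery is not shown in this part of the thread and may differ:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct DemoCallbacks	/* hypothetical analogue of ShmemCallbacks */
{
	void		(*request_fn) (void *arg);
	void		(*init_fn) (void *arg);
} DemoCallbacks;

typedef struct DemoRequest
{
	size_t		size;			/* bytes wanted */
	void	   *ptr_addr;		/* address of the caller's pointer variable */
} DemoRequest;

#define MAX_REQUESTS 8
static DemoRequest requests[MAX_REQUESTS];
static int	nrequests = 0;

/* Phase 1: only record what is wanted; nothing is allocated yet. */
static void
demo_request_struct(size_t size, void *ptr_addr)
{
	assert(nrequests < MAX_REQUESTS);
	requests[nrequests].size = size;
	requests[nrequests].ptr_addr = ptr_addr;
	nrequests++;
}

typedef struct Counter
{
	int			next_id;
} Counter;

static Counter *counter;

static void
counter_request(void *arg)
{
	(void) arg;
	demo_request_struct(sizeof(Counter), &counter);
}

/* Phase 2: runs once, after "counter" has been pointed at its memory. */
static void
counter_init(void *arg)
{
	(void) arg;
	counter->next_id = 100;
}

static const DemoCallbacks counter_callbacks = {
	.request_fn = counter_request,
	.init_fn = counter_init,
};

static void
demo_startup(const DemoCallbacks *cb)
{
	cb->request_fn(NULL);
	for (int i = 0; i < nrequests; i++)
	{
		void	   *mem = calloc(1, requests[i].size);

		assert(mem != NULL);
		/* memcpy stores the pointer without a type-punned void ** write */
		memcpy(requests[i].ptr_addr, &mem, sizeof(void *));
	}
	if (cb->init_fn)
		cb->init_fn(NULL);
}
```

After demo_startup(&counter_callbacks), "counter" is non-NULL and already initialized, which is why an init callback written this way needs no "found" flag of its own.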
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index 2b76967776c..95635c7f56c 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -25,6 +25,7 @@
#include "storage/lmgr.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
+#include "storage/subsystems.h"
#include "storage/spin.h"
#include "utils/wait_event.h"
@@ -95,59 +96,47 @@ static WaitEventCustomCounterData *WaitEventCustomCounter;
static uint32 WaitEventCustomNew(uint32 classId, const char *wait_event_name);
static const char *GetWaitEventCustomIdentifier(uint32 wait_event_info);
+static void WaitEventCustomShmemRequest(void *arg);
+static void WaitEventCustomShmemInit(void *arg);
+
+const ShmemCallbacks WaitEventCustomShmemCallbacks = {
+ .request_fn = WaitEventCustomShmemRequest,
+ .init_fn = WaitEventCustomShmemInit,
+};
+
/*
- * Return the space for dynamic shared hash tables and dynamic allocation counter.
+ * Register shmem space for dynamic shared hash and dynamic allocation counter.
*/
-Size
-WaitEventCustomShmemSize(void)
+static void
+WaitEventCustomShmemRequest(void *arg)
{
- Size sz;
-
- sz = MAXALIGN(sizeof(WaitEventCustomCounterData));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_SIZE,
- sizeof(WaitEventCustomEntryByInfo)));
- sz = add_size(sz, hash_estimate_size(WAIT_EVENT_CUSTOM_HASH_SIZE,
- sizeof(WaitEventCustomEntryByName)));
- return sz;
+ ShmemRequestStruct(.name = "WaitEventCustomCounterData",
+ .size = sizeof(WaitEventCustomCounterData),
+ .ptr = (void **) &WaitEventCustomCounter,
+ );
+ ShmemRequestHash(.name = "WaitEventCustom hash by wait event information",
+ .ptr = &WaitEventCustomHashByInfo,
+ .nelems = WAIT_EVENT_CUSTOM_HASH_SIZE,
+ .hash_info.keysize = sizeof(uint32),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByInfo),
+ .hash_flags = HASH_ELEM | HASH_BLOBS,
+ );
+ ShmemRequestHash(.name = "WaitEventCustom hash by name",
+ .ptr = &WaitEventCustomHashByName,
+ .nelems = WAIT_EVENT_CUSTOM_HASH_SIZE,
+ /* key is a NULL-terminated string */
+ .hash_info.keysize = sizeof(char[NAMEDATALEN]),
+ .hash_info.entrysize = sizeof(WaitEventCustomEntryByName),
+ .hash_flags = HASH_ELEM | HASH_STRINGS,
+ );
}
-/*
- * Allocate shmem space for dynamic shared hash and dynamic allocation counter.
- */
-void
-WaitEventCustomShmemInit(void)
+static void
+WaitEventCustomShmemInit(void *arg)
{
- bool found;
- HASHCTL info;
-
- WaitEventCustomCounter = (WaitEventCustomCounterData *)
- ShmemInitStruct("WaitEventCustomCounterData",
- sizeof(WaitEventCustomCounterData), &found);
-
- if (!found)
- {
- /* initialize the allocation counter and its spinlock. */
- WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
- SpinLockInit(&WaitEventCustomCounter->mutex);
- }
-
- /* initialize or attach the hash tables to store custom wait events */
- info.keysize = sizeof(uint32);
- info.entrysize = sizeof(WaitEventCustomEntryByInfo);
- WaitEventCustomHashByInfo =
- ShmemInitHash("WaitEventCustom hash by wait event information",
- WAIT_EVENT_CUSTOM_HASH_SIZE,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- /* key is a NULL-terminated string */
- info.keysize = sizeof(char[NAMEDATALEN]);
- info.entrysize = sizeof(WaitEventCustomEntryByName);
- WaitEventCustomHashByName =
- ShmemInitHash("WaitEventCustom hash by name",
- WAIT_EVENT_CUSTOM_HASH_SIZE,
- &info,
- HASH_ELEM | HASH_STRINGS);
+ /* initialize the allocation counter and its spinlock. */
+ WaitEventCustomCounter->nextId = WAIT_EVENT_CUSTOM_INITIAL_ID;
+ SpinLockInit(&WaitEventCustomCounter->mutex);
}
/*
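Reader's note on the wait_event.c hunk above: calls such as ShmemRequestStruct(.name = ..., .size = ..., ) read like named arguments with a trailing comma. The macro definition is not part of this excerpt; one plausible way to get that call syntax in C99 is a variadic macro that wraps its arguments in a compound literal with designated initializers. The sketch below shows only that technique, with hypothetical names (DemoStructOpts, DemoRequestStruct, demo_request_with_opts), and should not be read as the patch's actual definition:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct DemoStructOpts	/* hypothetical options struct */
{
	const char *name;
	size_t		size;
	void	   *ptr;
} DemoStructOpts;

static DemoStructOpts last_request;

/* Receives the fully built options struct; here it is just remembered. */
static void
demo_request_with_opts(const DemoStructOpts *opts)
{
	last_request = *opts;
}

/*
 * The arguments become the body of a compound literal, so callers can name
 * fields, omit the ones they do not need (those are zero-initialized), and
 * leave a trailing comma.
 */
#define DemoRequestStruct(...) \
	demo_request_with_opts(&(DemoStructOpts){__VA_ARGS__})

static void
demo_request(void)
{
	DemoRequestStruct(.name = "demo area",
					  .size = sizeof(int),
		);
}
```

Fields left out of the call (here .ptr) come out as zero/NULL, so new optional fields can be added to such a struct without touching existing callers.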
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index c06b0e9b800..a7c99e097ea 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -17,6 +17,7 @@
*/
#include "postgres.h"
+#include "storage/subsystems.h"
#include "utils/injection_point.h"
#ifdef USE_INJECTION_POINTS
@@ -109,6 +110,11 @@ typedef struct InjectionPointCacheEntry
static HTAB *InjectionPointCache = NULL;
+#ifdef USE_INJECTION_POINTS
+static void InjectionPointShmemRequest(void *arg);
+static void InjectionPointShmemInit(void *arg);
+#endif
+
/*
* injection_point_cache_add
*
@@ -226,45 +232,34 @@ injection_point_cache_get(const char *name)
}
#endif /* USE_INJECTION_POINTS */
-/*
- * Return the space for dynamic shared hash table.
- */
-Size
-InjectionPointShmemSize(void)
-{
+const ShmemCallbacks InjectionPointShmemCallbacks = {
#ifdef USE_INJECTION_POINTS
- Size sz = 0;
-
- sz = add_size(sz, sizeof(InjectionPointsCtl));
- return sz;
-#else
- return 0;
+ .request_fn = InjectionPointShmemRequest,
+ .init_fn = InjectionPointShmemInit,
#endif
-}
+};
/*
- * Allocate shmem space for dynamic shared hash.
+ * Reserve space for the dynamic shared hash table
*/
-void
-InjectionPointShmemInit(void)
-{
#ifdef USE_INJECTION_POINTS
- bool found;
+static void
+InjectionPointShmemRequest(void *arg)
+{
+ ShmemRequestStruct(.name = "InjectionPoint hash",
+ .size = sizeof(InjectionPointsCtl),
+ .ptr = (void **) &ActiveInjectionPoints,
+ );
+}
- ActiveInjectionPoints = ShmemInitStruct("InjectionPoint hash",
- sizeof(InjectionPointsCtl),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
- for (int i = 0; i < MAX_INJECTION_POINTS; i++)
- pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
- }
- else
- Assert(found);
-#endif
+static void
+InjectionPointShmemInit(void *arg)
+{
+ pg_atomic_init_u32(&ActiveInjectionPoints->max_inuse, 0);
+ for (int i = 0; i < MAX_INJECTION_POINTS; i++)
+ pg_atomic_init_u64(&ActiveInjectionPoints->entries[i].generation, 0);
}
+#endif
/*
* Attach a new injection point.
diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index da7503c57b6..3097e9bb1af 100644
--- a/src/include/access/nbtree.h
+++ b/src/include/access/nbtree.h
@@ -1300,8 +1300,6 @@ extern BTCycleId _bt_vacuum_cycleid(Relation rel);
extern BTCycleId _bt_start_vacuum(Relation rel);
extern void _bt_end_vacuum(Relation rel);
extern void _bt_end_vacuum_callback(int code, Datum arg);
-extern Size BTreeShmemSize(void);
-extern void BTreeShmemInit(void);
extern bytea *btoptions(Datum reloptions, bool validate);
extern bool btproperty(Oid index_oid, int attno,
IndexAMProperty prop, const char *propname,
diff --git a/src/include/access/syncscan.h b/src/include/access/syncscan.h
index 24cf33294e5..32f8332aaee 100644
--- a/src/include/access/syncscan.h
+++ b/src/include/access/syncscan.h
@@ -24,7 +24,5 @@ extern PGDLLIMPORT bool trace_syncscan;
extern void ss_report_location(Relation rel, BlockNumber location);
extern BlockNumber ss_get_location(Relation rel, BlockNumber relnblocks);
-extern void SyncScanShmemInit(void);
-extern Size SyncScanShmemSize(void);
#endif
diff --git a/src/include/access/twophase.h b/src/include/access/twophase.h
index 761d56a5f3d..1d2ff42c9b7 100644
--- a/src/include/access/twophase.h
+++ b/src/include/access/twophase.h
@@ -33,9 +33,6 @@ typedef struct GlobalTransactionData *GlobalTransaction;
/* GUC variable */
extern PGDLLIMPORT int max_prepared_xacts;
-extern Size TwoPhaseShmemSize(void);
-extern void TwoPhaseShmemInit(void);
-
extern void AtAbort_Twophase(void);
extern void PostPrepare_Twophase(void);
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 4af38e74ce4..437b4f32349 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -259,8 +259,6 @@ extern void InitLocalDataChecksumState(void);
extern void SetLocalDataChecksumState(uint32 data_checksum_version);
extern bool GetDefaultCharSignedness(void);
extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
-extern Size XLOGShmemSize(void);
-extern void XLOGShmemInit(void);
extern void BootStrapXLOG(uint32 data_checksum_version);
extern void InitializeWalConsistencyChecking(void);
extern void LocalProcessControlFile(bool reset);
diff --git a/src/include/access/xlogprefetcher.h b/src/include/access/xlogprefetcher.h
index 7ec40c4b78b..56a81676d92 100644
--- a/src/include/access/xlogprefetcher.h
+++ b/src/include/access/xlogprefetcher.h
@@ -34,9 +34,6 @@ typedef struct XLogPrefetcher XLogPrefetcher;
extern void XLogPrefetchReconfigure(void);
-extern size_t XLogPrefetchShmemSize(void);
-extern void XLogPrefetchShmemInit(void);
-
extern void XLogPrefetchResetStats(void);
extern XLogPrefetcher *XLogPrefetcherAllocate(XLogReaderState *reader);
diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h
index 2842106b285..ba7750dca0b 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -153,9 +153,6 @@ extern PGDLLIMPORT bool reachedConsistency;
/* Are we currently in standby mode? */
extern PGDLLIMPORT bool StandbyMode;
-extern Size XLogRecoveryShmemSize(void);
-extern void XLogRecoveryShmemInit(void);
-
extern void InitWalRecovery(ControlFileData *ControlFile,
bool *wasShutdown_ptr, bool *haveBackupLabel_ptr,
bool *haveTblspcMap_ptr);
diff --git a/src/include/access/xlogwait.h b/src/include/access/xlogwait.h
index d12531d32b8..07157f220ea 100644
--- a/src/include/access/xlogwait.h
+++ b/src/include/access/xlogwait.h
@@ -100,8 +100,6 @@ typedef struct WaitLSNState
extern PGDLLIMPORT WaitLSNState *waitLSNState;
-extern Size WaitLSNShmemSize(void);
-extern void WaitLSNShmemInit(void);
extern XLogRecPtr GetCurrentLSNForWaitType(WaitLSNType lsnType);
extern void WaitLSNWakeup(WaitLSNType lsnType, XLogRecPtr currentLSN);
extern void WaitLSNCleanup(void);
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 8e3549c3752..2786a7c5ffb 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -541,10 +541,6 @@ typedef struct PgStat_BackendPending
* Functions in pgstat.c
*/
-/* functions called from postmaster */
-extern Size StatsShmemSize(void);
-extern void StatsShmemInit(void);
-
/* Functions called during server startup / shutdown */
extern void pgstat_restore_stats(void);
extern void pgstat_discard_stats(void);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index b21d111d4d5..8954f6b28ee 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,8 +66,4 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
-/* shared memory stuff */
-extern Size AutoVacuumShmemSize(void);
-extern void AutoVacuumShmemInit(void);
-
#endif /* AUTOVACUUM_H */
diff --git a/src/include/postmaster/bgworker_internals.h b/src/include/postmaster/bgworker_internals.h
index b789caf4034..b6261bc01df 100644
--- a/src/include/postmaster/bgworker_internals.h
+++ b/src/include/postmaster/bgworker_internals.h
@@ -41,8 +41,6 @@ typedef struct RegisteredBgWorker
extern PGDLLIMPORT dlist_head BackgroundWorkerList;
-extern Size BackgroundWorkerShmemSize(void);
-extern void BackgroundWorkerShmemInit(void);
extern void BackgroundWorkerStateChange(bool allow_new_workers);
extern void ForgetBackgroundWorker(RegisteredBgWorker *rw);
extern void ReportBackgroundWorkerPID(RegisteredBgWorker *rw);
diff --git a/src/include/postmaster/bgwriter.h b/src/include/postmaster/bgwriter.h
index 47470cba893..36eea0b1ab0 100644
--- a/src/include/postmaster/bgwriter.h
+++ b/src/include/postmaster/bgwriter.h
@@ -39,9 +39,6 @@ extern bool ForwardSyncRequest(const FileTag *ftag, SyncRequestType type);
extern void AbsorbSyncRequests(void);
-extern Size CheckpointerShmemSize(void);
-extern void CheckpointerShmemInit(void);
-
extern bool FirstCallSinceLastCheckpoint(void);
#endif /* _BGWRITER_H */
diff --git a/src/include/postmaster/datachecksum_state.h b/src/include/postmaster/datachecksum_state.h
index 343494edcc8..05625539604 100644
--- a/src/include/postmaster/datachecksum_state.h
+++ b/src/include/postmaster/datachecksum_state.h
@@ -17,10 +17,6 @@
#include "storage/procsignal.h"
-/* Shared memory */
-extern Size DataChecksumsShmemSize(void);
-extern void DataChecksumsShmemInit(void);
-
/* Possible operations the Datachecksumsworker can perform */
typedef enum DataChecksumsWorkerOperation
{
diff --git a/src/include/postmaster/pgarch.h b/src/include/postmaster/pgarch.h
index faa7609cd81..9772bb573a1 100644
--- a/src/include/postmaster/pgarch.h
+++ b/src/include/postmaster/pgarch.h
@@ -26,8 +26,6 @@
#define MAX_XFN_CHARS 40
#define VALID_XFN_CHARS "0123456789ABCDEF.history.backup.partial"
-extern Size PgArchShmemSize(void);
-extern void PgArchShmemInit(void);
extern bool PgArchCanRestart(void);
pg_noreturn extern void PgArchiverMain(const void *startup_data, size_t startup_data_len);
extern void PgArchWakeup(void);
diff --git a/src/include/postmaster/walsummarizer.h b/src/include/postmaster/walsummarizer.h
index a4c055066b4..b9a755fadbc 100644
--- a/src/include/postmaster/walsummarizer.h
+++ b/src/include/postmaster/walsummarizer.h
@@ -19,8 +19,6 @@
extern PGDLLIMPORT bool summarize_wal;
extern PGDLLIMPORT int wal_summary_keep_time;
-extern Size WalSummarizerShmemSize(void);
-extern void WalSummarizerShmemInit(void);
pg_noreturn extern void WalSummarizerMain(const void *startup_data, size_t startup_data_len);
extern void GetWalSummarizerState(TimeLineID *summarized_tli,
diff --git a/src/include/replication/logicalctl.h b/src/include/replication/logicalctl.h
index 495554c532c..0bc1302f130 100644
--- a/src/include/replication/logicalctl.h
+++ b/src/include/replication/logicalctl.h
@@ -14,8 +14,6 @@
#ifndef LOGICALCTL_H
#define LOGICALCTL_H
-extern Size LogicalDecodingCtlShmemSize(void);
-extern void LogicalDecodingCtlShmemInit(void);
extern void StartupLogicalDecodingStatus(bool last_status);
extern void InitializeProcessXLogLogicalInfo(void);
extern bool ProcessBarrierUpdateXLogLogicalInfo(void);
diff --git a/src/include/replication/logicallauncher.h b/src/include/replication/logicallauncher.h
index 504b710536a..5f0c1b9c682 100644
--- a/src/include/replication/logicallauncher.h
+++ b/src/include/replication/logicallauncher.h
@@ -19,9 +19,6 @@ extern PGDLLIMPORT int max_parallel_apply_workers_per_subscription;
extern void ApplyLauncherRegister(void);
extern void ApplyLauncherMain(Datum main_arg);
-extern Size ApplyLauncherShmemSize(void);
-extern void ApplyLauncherShmemInit(void);
-
extern void ApplyLauncherForgetWorkerStartTime(Oid subid);
extern void ApplyLauncherWakeupAtCommit(void);
diff --git a/src/include/replication/origin.h b/src/include/replication/origin.h
index eb46b41b4b7..a69faf6eaaf 100644
--- a/src/include/replication/origin.h
+++ b/src/include/replication/origin.h
@@ -84,8 +84,4 @@ extern void replorigin_redo(XLogReaderState *record);
extern void replorigin_desc(StringInfo buf, XLogReaderState *record);
extern const char *replorigin_identify(uint8 info);
-/* shared memory allocation */
-extern Size ReplicationOriginShmemSize(void);
-extern void ReplicationOriginShmemInit(void);
-
#endif /* PG_ORIGIN_H */
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 4b4709f6e2c..1a3557de607 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -327,10 +327,6 @@ extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
extern PGDLLIMPORT int idle_replication_slot_timeout_secs;
-/* shmem initialization functions */
-extern Size ReplicationSlotsShmemSize(void);
-extern void ReplicationSlotsShmemInit(void);
-
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
diff --git a/src/include/replication/slotsync.h b/src/include/replication/slotsync.h
index e546d0d050d..d2121cd3ed7 100644
--- a/src/include/replication/slotsync.h
+++ b/src/include/replication/slotsync.h
@@ -31,8 +31,6 @@ pg_noreturn extern void ReplSlotSyncWorkerMain(const void *startup_data, size_t
extern void ShutDownSlotSync(void);
extern bool SlotSyncWorkerCanRestart(void);
extern bool IsSyncingReplicationSlots(void);
-extern Size SlotSyncShmemSize(void);
-extern void SlotSyncShmemInit(void);
extern void SyncReplicationSlots(WalReceiverConn *wrconn);
#endif /* SLOTSYNC_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 85d24c87298..47c07574d4d 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -491,8 +491,6 @@ pg_noreturn extern void WalReceiverMain(const void *startup_data, size_t startup
extern void WalRcvRequestApplyReply(void);
/* prototypes for functions in walreceiverfuncs.c */
-extern Size WalRcvShmemSize(void);
-extern void WalRcvShmemInit(void);
extern void ShutdownWalRcv(void);
extern bool WalRcvStreaming(void);
extern bool WalRcvRunning(void);
diff --git a/src/include/replication/walsender.h b/src/include/replication/walsender.h
index a4df3b8e0ae..8952c848d19 100644
--- a/src/include/replication/walsender.h
+++ b/src/include/replication/walsender.h
@@ -41,8 +41,6 @@ extern void WalSndErrorCleanup(void);
extern void PhysicalWakeupLogicalWalSnd(void);
extern XLogRecPtr GetStandbyFlushRecPtr(TimeLineID *tli);
extern void WalSndSignals(void);
-extern Size WalSndShmemSize(void);
-extern void WalSndShmemInit(void);
extern void WalSndWakeup(bool physical, bool logical);
extern void WalSndInitStopping(void);
extern void WalSndWaitStopping(void);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index fa68e6ecece..ee3cb1dc203 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -375,8 +375,6 @@ typedef enum
/*
* function prototypes
*/
-extern void LockManagerShmemInit(void);
-extern Size LockManagerShmemSize(void);
extern void InitLockManagerAccess(void);
extern LockMethod GetLocksMethodTable(const LOCK *lock);
extern LockMethod GetLockTagsMethodTable(const LOCKTAG *locktag);
diff --git a/src/include/storage/subsystemlist.h b/src/include/storage/subsystemlist.h
index d8e11756a61..5e092552c72 100644
--- a/src/include/storage/subsystemlist.h
+++ b/src/include/storage/subsystemlist.h
@@ -32,6 +32,9 @@ PG_SHMEM_SUBSYSTEM(DSMRegistryShmemCallbacks)
/* xlog, clog, and buffers */
PG_SHMEM_SUBSYSTEM(VarsupShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLOGShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogPrefetchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(XLogRecoveryShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CLOGShmemCallbacks)
PG_SHMEM_SUBSYSTEM(CommitTsShmemCallbacks)
PG_SHMEM_SUBSYSTEM(SUBTRANSShmemCallbacks)
@@ -40,12 +43,18 @@ PG_SHMEM_SUBSYSTEM(BufferManagerShmemCallbacks)
PG_SHMEM_SUBSYSTEM(StrategyCtlShmemCallbacks)
PG_SHMEM_SUBSYSTEM(BufTableShmemCallbacks)
+/* lock manager */
+PG_SHMEM_SUBSYSTEM(LockManagerShmemCallbacks)
+
/* predicate lock manager */
PG_SHMEM_SUBSYSTEM(PredicateLockShmemCallbacks)
/* process table */
PG_SHMEM_SUBSYSTEM(ProcGlobalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcArrayShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackendStatusShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(TwoPhaseShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(BackgroundWorkerShmemCallbacks)
/* shared-inval messaging */
PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
@@ -53,9 +62,27 @@ PG_SHMEM_SUBSYSTEM(SharedInvalShmemCallbacks)
/* interprocess signaling mechanisms */
PG_SHMEM_SUBSYSTEM(PMSignalShmemCallbacks)
PG_SHMEM_SUBSYSTEM(ProcSignalShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(CheckpointerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(AutoVacuumShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationSlotsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ReplicationOriginShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSndShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalRcvShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WalSummarizerShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(PgArchShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(ApplyLauncherShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SlotSyncShmemCallbacks)
/* other modules that need some shared memory space */
+PG_SHMEM_SUBSYSTEM(BTreeShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(SyncScanShmemCallbacks)
PG_SHMEM_SUBSYSTEM(AsyncShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(StatsShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WaitEventCustomShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(InjectionPointShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(WaitLSNShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(LogicalDecodingCtlShmemCallbacks)
+PG_SHMEM_SUBSYSTEM(DataChecksumsShmemCallbacks)
/* AIO subsystem. This delegates to the method-specific callbacks */
PG_SHMEM_SUBSYSTEM(AioShmemCallbacks)
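Reader's note on the subsystemlist.h hunk above: a file consisting only of PG_SHMEM_SUBSYSTEM(...) lines is the usual "X-macro" list, where the consumer defines the macro before expanding the list. How subsystemlist.h is actually consumed is not shown in this excerpt; the following is a generic single-file sketch of the pattern with hypothetical names (DEMO_SUBSYSTEM, DEMO_SUBSYSTEM_LIST, AlphaCallbacks, BetaCallbacks):

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoCallbacks	/* hypothetical analogue of ShmemCallbacks */
{
	const char *label;
} DemoCallbacks;

static const DemoCallbacks AlphaCallbacks = {"alpha"};
static const DemoCallbacks BetaCallbacks = {"beta"};

/* In a real tree this list would sit in its own header and be #included. */
#define DEMO_SUBSYSTEM_LIST \
	DEMO_SUBSYSTEM(AlphaCallbacks) \
	DEMO_SUBSYSTEM(BetaCallbacks)

/* Expand each list entry into one array element, preserving list order. */
#define DEMO_SUBSYSTEM(cb) &cb,
static const DemoCallbacks *const demo_subsystems[] = {
	DEMO_SUBSYSTEM_LIST
};
#undef DEMO_SUBSYSTEM

#define NUM_DEMO_SUBSYSTEMS \
	(sizeof(demo_subsystems) / sizeof(demo_subsystems[0]))
```

With this pattern, adding a subsystem is a one-line change to the list, and the order of the lines fixes the order in which the entries appear in the array.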
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index ddd06304e97..a334e096e4a 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -298,14 +298,6 @@ extern PGDLLIMPORT int pgstat_track_activity_query_size;
extern PGDLLIMPORT PgBackendStatus *MyBEEntry;
-/* ----------
- * Functions called from postmaster
- * ----------
- */
-extern Size BackendStatusShmemSize(void);
-extern void BackendStatusShmemInit(void);
-
-
/* ----------
* Functions called from backends
* ----------
diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index 27a2526524f..fabd1455c3c 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -46,9 +46,6 @@ typedef void (*InjectionPointCallback) (const char *name,
const void *private_data,
void *arg);
-extern Size InjectionPointShmemSize(void);
-extern void InjectionPointShmemInit(void);
-
extern void InjectionPointAttach(const char *name,
const char *library,
const char *function,
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 34c27cc3dc3..86ee348220d 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -42,8 +42,6 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
extern uint32 WaitEventExtensionNew(const char *wait_event_name);
extern uint32 WaitEventInjectionPointNew(const char *wait_event_name);
-extern void WaitEventCustomShmemInit(void);
-extern Size WaitEventCustomShmemSize(void);
extern char **GetWaitEventCustomNames(uint32 classId, int *nwaitevents);
/* ----------
diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index d59c5ad0582..0f1af513673 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -107,9 +107,13 @@ extern PGDLLEXPORT void injection_wait(const char *name,
/* track if injection points attached in this process are linked to it */
static bool injection_point_local = false;
-/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void injection_shmem_request(void *arg);
+static void injection_shmem_init(void *arg);
+
+static const ShmemCallbacks injection_shmem_callbacks = {
+ .request_fn = injection_shmem_request,
+ .init_fn = injection_shmem_init,
+};
/*
* Routine for shared memory area initialization, used as a callback
@@ -126,44 +130,23 @@ injection_point_init_state(void *ptr, void *arg)
ConditionVariableInit(&state->wait_point);
}
-/* Shared memory initialization when loading module */
static void
-injection_shmem_request(void)
+injection_shmem_request(void *arg)
{
- Size size;
-
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- size = MAXALIGN(sizeof(InjectionPointSharedState));
- RequestAddinShmemSpace(size);
+ ShmemRequestStruct(.name = "injection_points",
+ .size = sizeof(InjectionPointSharedState),
+ .ptr = (void **) &inj_state,
+ );
}
static void
-injection_shmem_startup(void)
+injection_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_state = ShmemInitStruct("injection_points",
- sizeof(InjectionPointSharedState),
- &found);
-
- if (!found)
- {
- /*
- * First time through, so initialize. This is shared with the dynamic
- * initialization using a DSM.
- */
- injection_point_init_state(inj_state, NULL);
- }
-
- LWLockRelease(AddinShmemInitLock);
+ /*
+ * First time through, so initialize. This is shared with the dynamic
+ * initialization using a DSM.
+ */
+ injection_point_init_state(inj_state, NULL);
}
/*
@@ -601,9 +584,5 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- /* Shared memory initialization */
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = injection_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = injection_shmem_startup;
+ RegisterShmemCallbacks(&injection_shmem_callbacks);
}
diff --git a/src/test/modules/test_aio/test_aio.c b/src/test/modules/test_aio/test_aio.c
index d7530681192..35efba1a5e3 100644
--- a/src/test/modules/test_aio/test_aio.c
+++ b/src/test/modules/test_aio/test_aio.c
@@ -28,7 +28,6 @@
#include "storage/bufmgr.h"
#include "storage/checksum.h"
#include "storage/condition_variable.h"
-#include "storage/ipc.h"
#include "storage/lwlock.h"
#include "storage/proc.h"
#include "storage/procnumber.h"
@@ -44,6 +43,7 @@
PG_MODULE_MAGIC;
+/* In shared memory */
typedef struct InjIoErrorState
{
ConditionVariable cv;
@@ -74,8 +74,15 @@ typedef struct BlocksReadStreamData
static InjIoErrorState *inj_io_error_state;
/* Shared memory init callbacks */
-static shmem_request_hook_type prev_shmem_request_hook = NULL;
-static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void test_aio_shmem_request(void *arg);
+static void test_aio_shmem_init(void *arg);
+static void test_aio_shmem_attach(void *arg);
+
+static const ShmemCallbacks inj_io_shmem_callbacks = {
+ .request_fn = test_aio_shmem_request,
+ .init_fn = test_aio_shmem_init,
+ .attach_fn = test_aio_shmem_attach,
+};
static PgAioHandle *last_handle;
@@ -83,70 +90,55 @@ static PgAioHandle *last_handle;
static void
-test_aio_shmem_request(void)
+test_aio_shmem_request(void *arg)
{
- if (prev_shmem_request_hook)
- prev_shmem_request_hook();
-
- RequestAddinShmemSpace(sizeof(InjIoErrorState));
+ ShmemRequestStruct(.name = "test_aio injection points",
+ .size = sizeof(InjIoErrorState),
+ .ptr = (void **) &inj_io_error_state,
+ );
}
static void
-test_aio_shmem_startup(void)
+test_aio_shmem_init(void *arg)
{
- bool found;
-
- if (prev_shmem_startup_hook)
- prev_shmem_startup_hook();
-
- /* Create or attach to the shared memory state */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- inj_io_error_state = ShmemInitStruct("injection_points",
- sizeof(InjIoErrorState),
- &found);
-
- if (!found)
- {
- /* First time through, initialize */
- inj_io_error_state->enabled_short_read = false;
- inj_io_error_state->enabled_reopen = false;
- inj_io_error_state->enabled_completion_wait = false;
+ /* First time through, initialize */
+ inj_io_error_state->enabled_short_read = false;
+ inj_io_error_state->enabled_reopen = false;
+ inj_io_error_state->enabled_completion_wait = false;
- ConditionVariableInit(&inj_io_error_state->cv);
- inj_io_error_state->completion_wait_event = WaitEventInjectionPointNew("completion_wait");
+ ConditionVariableInit(&inj_io_error_state->cv);
+ inj_io_error_state->completion_wait_event = WaitEventInjectionPointNew("completion_wait");
#ifdef USE_INJECTION_POINTS
- InjectionPointAttach("aio-process-completion-before-shared",
- "test_aio",
- "inj_io_completion_hook",
- NULL,
- 0);
- InjectionPointLoad("aio-process-completion-before-shared");
-
- InjectionPointAttach("aio-worker-after-reopen",
- "test_aio",
- "inj_io_reopen",
- NULL,
- 0);
- InjectionPointLoad("aio-worker-after-reopen");
+ InjectionPointAttach("aio-process-completion-before-shared",
+ "test_aio",
+ "inj_io_completion_hook",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-process-completion-before-shared");
+
+ InjectionPointAttach("aio-worker-after-reopen",
+ "test_aio",
+ "inj_io_reopen",
+ NULL,
+ 0);
+ InjectionPointLoad("aio-worker-after-reopen");
#endif
- }
- else
- {
- /*
- * Pre-load the injection points now, so we can call them in a
- * critical section.
- */
+}
+
+static void
+test_aio_shmem_attach(void *arg)
+{
+ /*
+ * Pre-load the injection points now, so we can call them in a critical
+ * section.
+ */
#ifdef USE_INJECTION_POINTS
- InjectionPointLoad("aio-process-completion-before-shared");
- InjectionPointLoad("aio-worker-after-reopen");
- elog(LOG, "injection point loaded");
+ InjectionPointLoad("aio-process-completion-before-shared");
+ InjectionPointLoad("aio-worker-after-reopen");
+ elog(LOG, "injection point loaded");
#endif
- }
-
- LWLockRelease(AddinShmemInitLock);
}
void
@@ -155,10 +147,7 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- prev_shmem_request_hook = shmem_request_hook;
- shmem_request_hook = test_aio_shmem_request;
- prev_shmem_startup_hook = shmem_startup_hook;
- shmem_startup_hook = test_aio_shmem_startup;
+ RegisterShmemCallbacks(&inj_io_shmem_callbacks);
}
--
2.47.3
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-05 23:28 Heikki Linnakangas <[email protected]>
parent: Heikki Linnakangas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-05 23:28 UTC (permalink / raw)
To: Matthias van de Meent <[email protected]>; Ashutosh Bapat <[email protected]>; +Cc: Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 05/04/2026 23:06, Heikki Linnakangas wrote:
> Here's patch version 12 [*]. I believe I've addressed all the feedback,
> and I feel this is in pretty good shape now. There hasn't been any big
> design changes lately.
>
> One notable change is that I replaced the separate {request|init|attach}
> _fn_arg fields in ShmemCallbacks with a single 'opaque_arg' field, and
> added a brief comment to it. You both commented on whether we need that
> at all, and maybe you're right that we don't, but at least it's now just
> one field rather than three. As before, callers can simply ignore it if
> they don't need it.
After another round of comment cleanups and such, committed. Thanks!
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-06 13:53 Ashutosh Bapat <[email protected]>
parent: Heikki Linnakangas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-06 13:53 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Mon, Apr 6, 2026 at 4:58 AM Heikki Linnakangas <[email protected]> wrote:
>
> On 05/04/2026 23:06, Heikki Linnakangas wrote:
> > Here's patch version 12 [*]. I believe I've addressed all the feedback,
> > and I feel this is in pretty good shape now. There hasn't been any big
> > design changes lately.
> >
> > One notable change is that I replaced the separate {request|init|attach}
> > _fn_arg fields in ShmemCallbacks with a single 'opaque_arg' field, and
> > added a brief comment to it. You both commented on whether we need that
> > at all, and maybe you're right that we don't, but at least it's now just
> > one field rather than three. As before, callers can simply ignore it if
> > they don't need it.
>
> After another round of comment cleanups and such, committed. Thanks!
Thanks. Attached are rebased patches.
0001 is the same patch as submitted before to support resizable shared
memory structures.
0002 changes resizable_shmem_usage() in the resizable_shmem test
module to use /proc/self/smaps, as suggested by Andres offline. Using
smaps, the function estimates the actual memory mapped against the main
shared memory segment. I expect this to fix the failure reported at
https://cirrus-ci.com/task/5501660157444096.
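For illustration, here is a minimal sketch of the smaps approach described for 0002. It is not the code from the patch: the function name and the choice to parse an in-memory copy of /proc/self/smaps are assumptions made for this example. It sums the Rss values (in kB) reported for the mapping that contains a given address, and assumes a 64-bit unsigned long as on Linux LP64 platforms.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: return the resident size in kB
 * of the mapping in smaps-formatted text that contains "addr", or 0 if no
 * mapping contains it.
 */
long
smaps_rss_kb_for_address(const char *smaps_text, unsigned long addr)
{
    const char *line = smaps_text;
    int         in_target = 0;
    long        total_kb = 0;

    assert(smaps_text != NULL);

    while (*line != '\0')
    {
        const char *eol = strchr(line, '\n');
        size_t      len = eol ? (size_t) (eol - line) : strlen(line);
        char        buf[512];
        char        perms[5];
        unsigned long start;
        unsigned long end;
        long        kb;

        /* Copy one line into a bounded, NUL-terminated buffer. */
        if (len >= sizeof(buf))
            len = sizeof(buf) - 1;
        memcpy(buf, line, len);
        buf[len] = '\0';

        /* A mapping header looks like "start-end perms offset dev inode path". */
        if (sscanf(buf, "%lx-%lx %4s", &start, &end, perms) == 3)
            in_target = (addr >= start && addr < end);
        else if (in_target && sscanf(buf, "Rss: %ld kB", &kb) == 1)
            total_kb += kb;

        line = eol ? eol + 1 : line + strlen(line);
    }

    return total_kb;
}
```

A caller would read /proc/self/smaps into a buffer and pass the start address of the shared memory segment. Rss counts only pages that are actually resident, which is why it tracks memory freed or populated by resizing more closely than the size of the mapping does.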
0003 adds some more diagnostics about shared memory segments to the
server log, to help debug the failure in case it appears even with 0002.
0004 addresses Matthias's comments in the discussion below.
On Mon, Apr 6, 2026 at 12:35 AM Matthias van de Meent
<[email protected]> wrote:
>
> On Sun, 5 Apr 2026 at 13:20, Ashutosh Bapat
> <[email protected]> wrote:
> >
> > On Sun, Apr 5, 2026 at 2:36 PM Matthias van de Meent
> > <[email protected]> wrote:
> > >
> > > On Sun, 5 Apr 2026, 07:59 Ashutosh Bapat, <[email protected]> wrote:
> > > >
> > > > On Sun, Apr 5, 2026 at 11:18 AM Ashutosh Bapat
> > > > <[email protected]> wrote:
> > > > >
> > >
> > > I'm not opposed to HAVE_RESIZABLE_SHMEM, but is it universal enough on
> > > its platforms to make it part of the exposed ABI for Shmem? I think
> > > that we should expose the same functions and structs, and just have
> > > the shmem internals throw an error if the configuration used by the
> > > user implies the user wants to update shmem sizing when the system
> > > doesn't support it. That would avoid extensions having to recompile
> > > between have/have not systems that have an otherwise compatible ABI;
> > > especially when those extensions don't actually need the resizeable
> > > part of the shmem system.
> > >
> >
> > I don't think I understand this fully. An extension may want to
> > support a structure in both modes - fixed as well as resizable
> > depending upon whether the latter is supported. If the structure has
> > maximum_size always the extension code needs to set it to 0 when the
> > resizable shared structure is not supported and set to actual
> > maximum_size when the resizable structure is supported. Without a
> > macro or some flag they can not do that. The flag/macro then becomes
> > part ABI for shmem. Am I correct?
>
> That's not quite what I meant.
>
> With your patch, the size and field offsets in `struct
> ShmemStructOpts` changes depending only on HAVE_RESIZABLE_SHMEM, as
> does function's availability. This means that an extension that's
> built without HAVE_RESIZABLE_SHMEM (an otherwise identical system)
> can't correctly be loaded into a server that does have
> HAVE_RESIZABLE_SHMEM defined - or at least it'll misbehave when it
> tries to use the new shmem system without trying out resizeable areas.
>
> If instead the fields used for definining resizable shmem areas (and
> the relevant functions) are always defined, but with runtime checks to
> make sure that in !HAVE_RESIZEABLE_SHMEM nobody tries to use the
> resizing functionality, then that'd reduce the unchecked hidden
> incompatibility; assuming that no extension manually does memory
> management syscall operations on those shmem areas.
>
> > Since extension binaries need to be
> > built on different platforms anyway, that would automatically take
> > care of building with or without HAVE_RESIZABLE_SHMEM. I feel it makes
> > testing simpler since run time behaviour is fixed. Maybe I am missing
> > something. Maybe a code diff or some example platform might make it
> > more clear for me.
>
> I'm not entirely sure it would be automatic. Is it guaranteed that
> HAVE_RESIZABLE_SHMEM won't change over the lifetime of any
> distribution's platform? Because it's definitely not apparent to me
> that rebuilding the new server version against an upgraded platform
> (now possibly with HAVE_RESIZABLE_SHMEM) should also mean rebuilding
> the extensions that have been built against a previous minor version
> (without HAVE_RESIZABLE_SHMEM).
>
Please review the changes in 0004 to see if they address your concerns.
The structures and functions are now the same irrespective of
HAVE_RESIZABLE_SHMEM. If HAVE_RESIZABLE_SHMEM is not defined but
maximum_size > 0, the request to add the structure fails. Thus, on a
server running a binary built without HAVE_RESIZABLE_SHMEM,
maximum_size is always 0. Many code blocks then no longer need
#ifdef HAVE_RESIZABLE_SHMEM guards. I think the code is much more
readable this way. However, the binary will contain more instructions
than necessary; maybe the compiler can optimize those out.
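As a rough sketch of the approach described for 0004, the following shows a request structure whose layout does not depend on HAVE_RESIZABLE_SHMEM, with the capability checked at run time instead. All names here (DemoShmemOpts, DemoRequestStatus, demo_check_request) are hypothetical and are not taken from the patch; treating maximum_size smaller than size as invalid is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same fields and offsets on every build, so the extension ABI is stable. */
typedef struct DemoShmemOpts
{
    const char *name;
    size_t      size;           /* initial size, must be > 0 */
    size_t      maximum_size;   /* 0 means the structure is fixed-size */
} DemoShmemOpts;

typedef enum DemoRequestStatus
{
    DEMO_REQUEST_OK,
    DEMO_REQUEST_INVALID,
    DEMO_REQUEST_UNSUPPORTED
} DemoRequestStatus;

/* The compile-time flag is folded into a value that is checked at run time. */
static bool
demo_have_resizable_shmem(void)
{
#ifdef HAVE_RESIZABLE_SHMEM
    return true;
#else
    return false;
#endif
}

DemoRequestStatus
demo_check_request(const DemoShmemOpts *opts, bool resizable_supported)
{
    assert(opts != NULL);

    if (opts->size == 0)
        return DEMO_REQUEST_INVALID;
    if (opts->maximum_size == 0)
        return DEMO_REQUEST_OK;     /* fixed-size requests work everywhere */
    if (!resizable_supported)
        return DEMO_REQUEST_UNSUPPORTED;
    /* Assumption for this sketch: the maximum may not be below the initial size. */
    if (opts->maximum_size < opts->size)
        return DEMO_REQUEST_INVALID;
    return DEMO_REQUEST_OK;
}

/* Convenience wrapper that uses the capability of the current build. */
DemoRequestStatus
demo_check_request_for_this_build(const DemoShmemOpts *opts)
{
    return demo_check_request(opts, demo_have_resizable_shmem());
}
```

With this shape, an extension that never sets maximum_size behaves identically on both kinds of build, and one that does set it gets an explicit failure rather than a silent layout mismatch.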
> > > > For now, it
> > > > seems only for the sanity checks, but it could be seen as a useful
> > > > safety feature. A difference in maximum_size and minimum_size would
> > > > indicate that the structure is resizable.
> > >
> > > I think that's the right approach.
> >
> >
> > I also think that introducing minimum_size is useful. Let's hear from
> > Heikki before implementing it, in case he has a different opinion. I
> > am not sure about min_allocated_space though - what use do you see for
> > it. reserved_space is useful in pg_shmem_allocations() C function
> > itself and gives impact to the fully grown structure. What would
> > min_allocated_space give us? If at all it would be min_allocated_size
> > not space since reserved space will never change. But even that I am
> > not sure about.
>
> I'd say it's mostly interesting for people looking at or debugging
> shmem allocations. Which isn't a huge group of developers or DBAs, but
> if we're exposing data like this, and are going to allow resizing,
> then someone could see some benefits from this.
>
> E.g., it may be useful to have the information to see how low the
> currently running server can scale down its memory usage, so that the
> admin can see whether a reboot is required if they want to allow it to
> scale it down further (assuming there's a lower limit for allocations
> - some shmem structs may have a lower scaling limit defined at
> startup, while others may be able to scale linearly from 0 to 100)
I guess that most of the time the lower limit will be a hard limit
that cannot be changed, so the use cases are pretty narrow.
0005 adds a minimum_size member alongside maximum_size, as per the
discussion starting at [1]. I like the end result, since maximum_size is
managed consistently across fixed-size and resizable structures, and
the condition for checking whether a structure is resizable or fixed
is more natural/logical.
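A minimal sketch of the resizability condition mentioned above, assuming that a fixed-size structure is represented by minimum_size equal to maximum_size. The struct and function names are hypothetical; only the two member names come from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the two limits discussed for 0005. */
typedef struct DemoSizeLimits
{
    size_t      minimum_size;
    size_t      maximum_size;
} DemoSizeLimits;

/* A structure is resizable exactly when its limits differ. */
bool
demo_is_resizable(const DemoSizeLimits *limits)
{
    assert(limits != NULL);
    assert(limits->minimum_size <= limits->maximum_size);

    return limits->maximum_size > limits->minimum_size;
}

/* A resize is acceptable only for resizable structures and within the limits. */
bool
demo_resize_allowed(const DemoSizeLimits *limits, size_t new_size)
{
    return demo_is_resizable(limits) &&
        new_size >= limits->minimum_size &&
        new_size <= limits->maximum_size;
}
```

The same comparison then serves both as the sanity check and as the fixed-versus-resizable test, without a special zero value for maximum_size.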
0006 adds support for applying mprotect appropriately to the used and
unused parts of a resizable structure, again per the discussion starting
at [1]. This is still a bit rough, but review comments are welcome.
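To make the idea concrete, here is a small sketch of protecting the unused tail of a reserved region, assuming a page-aligned base address and a reserved length that is a multiple of the page size. It is not the code from 0006: the function names are hypothetical, it uses the base page size from sysconf() rather than a huge page size, and mprotect() affects only the calling process, so each process that maps the region would have to apply it.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round len up to the next multiple of page_size. */
static size_t
demo_page_align_up(size_t len, size_t page_size)
{
    return (len + page_size - 1) / page_size * page_size;
}

/*
 * Hypothetical helper: make the first "used" bytes (rounded up to a page
 * boundary) readable and writable, and make the rest of the reserved range
 * inaccessible so that stray accesses fault. Returns 0 on success, -1 on
 * failure with errno set by mprotect().
 */
int
demo_protect_unused_tail(void *base, size_t reserved, size_t used)
{
    size_t      page_size = (size_t) sysconf(_SC_PAGESIZE);
    size_t      used_aligned = demo_page_align_up(used, page_size);

    assert(((uintptr_t) base % page_size) == 0);
    assert(reserved % page_size == 0);
    assert(used <= reserved);

    if (used_aligned > 0 &&
        mprotect(base, used_aligned, PROT_READ | PROT_WRITE) != 0)
        return -1;

    if (used_aligned < reserved &&
        mprotect((char *) base + used_aligned, reserved - used_aligned,
                 PROT_NONE) != 0)
        return -1;

    return 0;
}
```

Because protection is page-granular, bytes between the used size and the next page boundary stay accessible; only whole unused pages are covered by PROT_NONE.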
I have kept these two patches separate from the main patch so that I
can remove them if others feel they are not worth including in the
feature.
[1] https://www.postgresql.org/message-id/[email protected]...
--
Best Wishes,
Ashutosh Bapat
Attachments:
[text/x-patch] v20260406-0001-resizable-shared-memory-structures.patch (66.6K, 2-v20260406-0001-resizable-shared-memory-structures.patch)
From 06b8669cffcf09294706c536ade397a726ed0633 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Tue, 17 Feb 2026 16:51:20 +0530
Subject: [PATCH v20260406 1/6] resizable shared memory structures
Resizable shared memory structures can be allocated by specifying a new
member ShmemStructOpts::maximum_size. At the startup or when the
structure is created, we reserve address space worth maximum_size in the
shared memory segment. It is expected that the subsystem which creates
the structure would initialize only the initial size worth of memory
when creating it. In an mmap'ed memory, this should allocate memory
worth the initial size. It should not allocate maximum_size worth of
memory initially. As the structure is resized using ShmemResizeStruct()
memory is freed or allocated in chunks of memory pages when shrinking
and expanding the structure respectively.
The resizable shared memory feature depends upon the existence of the
function madvise() and the constants MADV_REMOVE and MADV_POPULATE_WRITE.
On platforms which do not have these, we disable this feature at
compile time. The commit introduces a compile time flag
HAVE_RESIZABLE_SHMEM which is defined if MADV_REMOVE and
MADV_POPULATE_WRITE exist. We don't check for the existence of madvise()
separately, since existence of the constants implies existence of the
function.
HAVE_RESIZABLE_SHMEM is not defined in EXEC_BACKEND builds since that's
largely used for Windows, where the APIs to free and allocate memory from
and to a given address space are not known to the author right now.
Given that PostgreSQL is used widely on Linux, providing this feature on
Linux benefits most of its users. Once we figure out the required
Windows APIs, we will support this feature on Windows as well.
The feature is also not available when Sys-V shared memory is used even
on Linux since we do not know whether required Sys-V APIs exist; mostly
they don't. Since that combination is only available for development and
testing, not supporting the feature there isn't going to impact
PostgreSQL users.
Using HAVE_RESIZABLE_SHMEM we disable compiling the code related to
resizable shared memory structures on the platforms which do not support
the feature. But we also have run time checks to disable this feature
when Sys-V shared memory is used. In order to know whether a given
instance of running server supports resizable structures, we have
introduced GUC have_resizable_shmem.
Author: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
---
configure.ac | 4 +
doc/src/sgml/config.sgml | 15 +
doc/src/sgml/system-views.sgml | 30 +-
doc/src/sgml/xfunc.sgml | 54 +++
meson.build | 16 +
src/backend/port/sysv_shmem.c | 69 ++++
src/backend/port/win32_shmem.c | 23 ++
src/backend/storage/ipc/shmem.c | 269 +++++++++++++--
src/backend/utils/misc/guc_parameters.dat | 7 +
src/backend/utils/misc/guc_tables.c | 7 +
src/include/catalog/pg_proc.dat | 4 +-
src/include/pg_config.h.in | 8 +
src/include/pg_config_manual.h | 9 +
src/include/storage/pg_shmem.h | 5 +
src/include/storage/shmem.h | 16 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/resizable_shmem/Makefile | 25 ++
src/test/modules/resizable_shmem/meson.build | 36 ++
.../resizable_shmem/resizable_shmem--1.0.sql | 37 ++
.../modules/resizable_shmem/resizable_shmem.c | 326 ++++++++++++++++++
.../resizable_shmem/resizable_shmem.control | 4 +
.../resizable_shmem/t/001_resizable_shmem.pl | 239 +++++++++++++
.../test_shmem/t/001_late_shmem_alloc.pl | 23 ++
.../modules/test_shmem/test_shmem--1.0.sql | 4 +
src/test/modules/test_shmem/test_shmem.c | 20 ++
src/test/regress/expected/rules.out | 6 +-
src/tools/pgindent/typedefs.list | 1 +
28 files changed, 1223 insertions(+), 36 deletions(-)
create mode 100644 src/test/modules/resizable_shmem/Makefile
create mode 100644 src/test/modules/resizable_shmem/meson.build
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.c
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.control
create mode 100644 src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
diff --git a/configure.ac b/configure.ac
index ff5dd64468e..7acd844ccb2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1895,6 +1895,10 @@ AC_CHECK_DECLS([memset_s], [], [], [#define __STDC_WANT_LIB_EXT1__ 1
# This is probably only present on macOS, but may as well check always
AC_CHECK_DECLS(F_FULLFSYNC, [], [], [#include <fcntl.h>])
+# Linux-specific madvise constants needed for resizable shared memory. See similar checks in meson.build for explanation of why these checks are here.
+AC_CHECK_DECLS([MADV_POPULATE_WRITE], [], [], [#include <sys/mman.h>])
+AC_CHECK_DECLS([MADV_REMOVE], [], [], [#include <sys/mman.h>])
+
AC_REPLACE_FUNCS(m4_normalize([
explicit_bzero
getopt
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b44231a362d..7a01f2cf967 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -12114,6 +12114,21 @@ dynamic_library_path = '/usr/local/lib/postgresql:$libdir'
</listitem>
</varlistentry>
+ <varlistentry id="guc-have-resizable-shmem" xreflabel="have_resizable_shmem">
+ <term><varname>have_resizable_shmem</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>have_resizable_shmem</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Reports whether <productname>PostgreSQL</productname> has been built
+ with <literal>HAVE_RESIZABLE_SHMEM</literal> enabled and supports
+ <link linkend="xfunc-shared-addin-resizable">Resizable shared memory structures</link>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages-status" xreflabel="huge_pages_status">
<term><varname>huge_pages_status</varname> (<type>enum</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 2ebec6928d5..9717f8434bb 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4243,8 +4243,34 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
Size of the allocation in bytes including padding. For anonymous
allocations, no information about padding is available, so the
<literal>size</literal> and <literal>allocated_size</literal> columns
- will always be equal. Padding is not meaningful for free memory, so
- the columns will be equal in that case also.
+ will always be equal. Padding is not meaningful for free memory, so the
+ columns will be equal in that case also. For resizable allocations which
+ may span multiple memory pages, the padding includes the padding due to
+ page alignment.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>maximum_size</structfield> <type>int8</type>
+ </para>
+ <para>
+ Maximum size in bytes that the resizable allocation can grow to. Zero for
+ fixed-size allocations, for anonymous allocations, and for free memory.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>reserved_space</structfield> <type>int8</type>
+ </para>
+ <para>
+ Address space reserved for the allocation in bytes. For resizable
+ structures, this is the total address space reserved to accommodate
+ growth up to <structfield>maximum_size</structfield>, and is greater
+ than or equal to <structfield>allocated_size</structfield>. For
+ fixed-size allocations, anonymous allocations, and free memory this
+ is the same as <structfield>allocated_size</structfield>.
</para></entry>
</row>
</tbody>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 789cac9fcab..3d25139c334 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3744,6 +3744,60 @@ my_shmem_init(void *arg)
</para>
</sect3>
+ <sect3 id="xfunc-shared-addin-resizable">
+ <title>Resizable shared memory structures</title>
+
+ <para>
+ A resizable memory structure can be requested using
+ <function>ShmemRequestStruct</function> by passing
+ <parameter>.maximum_size</parameter> along with
+ <parameter>.size</parameter>. <parameter>.maximum_size</parameter> is
+ the maximum size up to which the structure can grow, whereas
+ <parameter>.size</parameter> is the initial size of the structure. While
+ contiguous address space worth <parameter>maximum_size</parameter> is
+ allocated to the structure, only memory worth <parameter>size</parameter>
+ bytes is allocated initially. The <function>init_fn</function> should only
+ initialize the <parameter>size</parameter> amount of memory. The actual
+ memory allocated to this structure at any point in time is given by <link
+ linkend="view-pg-shmem-allocations"><structname>pg_shmem_allocations</structname>.<structfield>allocated_size</structfield></link>
+ and the address space reserved for this structure is given by <link
+ linkend="view-pg-shmem-allocations"><structname>pg_shmem_allocations</structname>.<structfield>reserved_space</structfield></link>.
+ </para>
+
+ <para>
+ The structure can be resized using <function>ShmemResizeStruct</function> by
+ passing it the structure's <structname>ShmemStructDesc</structname> and the
+ new size, which can be anywhere between 0 and
+ <parameter>maximum_size</parameter>. If the new size is smaller than the
+ current size of the structure, the memory between the new size and current
+ size is freed while keeping the contents of the memory up to the new size intact.
+ If the new size is greater than the current size, memory is allocated up to the
+ new size while keeping the current contents of the structure intact. The
+ starting address of the structure does not change because of resizing
+ operation. The caller may need to take care of the additional
+ synchronization between the resizing process and the processes using the
+ shared structure. Also accessing the memory beyond the current size of the
+ structure will not cause any segmentation fault or a bus error. Memory will
+ be allocated during such a write access. 0s will be returned on such a read
+ access if memory is not allocated yet. The additional synchronization may
+ use mprotect() with PROT_NONE in every backend that may access this memory
+ to ensure that such an access results in a fault.
+ </para>
+
+ <para>
+ This functionality is available only on the platforms which provide the APIs
+ necessary to reserve contiguous address space and to allocate or free memory
+ in that address space on demand. Macro <symbol>HAVE_RESIZABLE_SHMEM</symbol>
+ is defined on such platforms. It can be used to guard code related to
+ resizing a shared memory structure. The functionality is available only with
+ mmap'ed memory, so subsystems which use resizable structures may have to
+ additionally disable resizable memory usage when <symbol>shared_memory_type</symbol> is not
+ <symbol>SHMEM_TYPE_MMAP</symbol>. A GUC <xref linkend="guc-have-resizable-shmem"/> is set to
+ <literal>on</literal> when this functionality is available in a running
+ server, <literal>off</literal> otherwise.
+ </para>
+ </sect3>
+
<sect3 id="xfunc-shared-addin-dynamic">
<title>Allocating Dynamic Shared Memory After Startup</title>
diff --git a/meson.build b/meson.build
index 43d5ffc30b1..790845762e1 100644
--- a/meson.build
+++ b/meson.build
@@ -2904,6 +2904,22 @@ decl_checks = [
['timingsafe_bcmp', 'string.h'],
]
+# Linux-specific madvise constants needed for resizable shared memory.
+# Usually we use AC_CHECK_DECLS to check for function declarations, but in this
+# case we are using it to detect existence of constants. These constants are
+# used to define HAVE_RESIZABLE_SHMEM which is used in storage/pg_shmem.h as
+# well as storage/shmem.h. The first abstracts the APIs to allocate shared
+# memory segments from the operating system whereas the second abstracts APIs to
+# allocate shared memory to various subsystems. Since they are related but
+# orthogonal to each other, including any one of them in the other file doesn't
+# make sense. pg_config_manual.h is the only place where HAVE_RESIZABLE_SHMEM
+# can be defined and made available to both without including sys/mman.h. But
+# for that we need constants that indicate the existence of following defines.
+decl_checks += [
+ ['MADV_POPULATE_WRITE', 'sys/mman.h'],
+ ['MADV_REMOVE', 'sys/mman.h'],
+]
+
# Need to check for function declarations for these functions, because
# checking for library symbols wouldn't handle deployment target
# restrictions on macOS
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 2e3886cf9fe..8d859dfbbfb 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -589,6 +589,27 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
return true;
}
+/*
+ * Get the page size being used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been setup.
+ */
+Size
+GetOSPageSize(void)
+{
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ os_page_size = sysconf(_SC_PAGESIZE);
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
+
/*
* Creates an anonymous mmap()ed shared memory segment.
*
@@ -991,3 +1012,51 @@ PGSharedMemoryDetach(void)
AnonymousShmem = NULL;
}
}
+
+#ifdef HAVE_RESIZABLE_SHMEM
+/*
+ * Make sure that the memory of given size from the given address is released.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureFreed(void *addr, Size size)
+{
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only anonymous shared memory can be freed")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+
+ if (madvise(addr, size, MADV_REMOVE) == -1)
+ ereport(ERROR,
+ (errmsg("could not free shared memory: %m")));
+}
+
+/*
+ * Make sure that the memory of given size from the given address is allocated.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureAllocated(void *addr, Size size)
+{
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only anonymous shared memory can be allocated at runtime")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+
+ if (madvise(addr, size, MADV_POPULATE_WRITE) == -1)
+ ereport(ERROR,
+ (errmsg("could not allocate shared memory: %m")));
+}
+#endif /* HAVE_RESIZABLE_SHMEM */
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index 794e4fcb2ad..dc2ee018845 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -648,3 +648,26 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
}
return true;
}
+
+/*
+ * Get the page size used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been setup.
+ */
+Size
+GetOSPageSize(void)
+{
+ SYSTEM_INFO sysinfo;
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ GetSystemInfo(&sysinfo);
+ os_page_size = sysinfo.dwPageSize;
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 1ebffe5a32a..03de5d88d51 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,11 +19,11 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * This module provides facilities to allocate fixed-size structures in shared
- * memory, for things like variables shared between all backend processes.
- * Each such structure has a string name to identify it, specified when it is
- * requested. shmem_hash.c provides a shared hash table implementation on top
- * of that.
+ * This module provides facilities to allocate fixed-size as well as resizable
+ * structures in shared memory, for things like variables shared between all
+ * backend processes. Each such structure has a string name to identify it,
+ * specified when it is requested. shmem_hash.c provides a shared hash table
+ * implementation on top of fixed-size structures.
*
* Shared memory areas should usually not be allocated after postmaster
* startup, although we do allow small allocations later for the benefit of
@@ -102,6 +102,21 @@
* (*options->ptr), and calls the attach_fn callback, if any, for additional
* per-backend setup.
*
+ * Resizable shared memory structures
+ * ----------------------------------
+ *
+ * In order to allocate resizable shared memory structures, set
+ * ShmemStructOpts::maximum_size to the maximum size that the structure
+ * can grow to. The address space for the maximum size will be reserved at
+ * startup, but memory is allocated or freed as the structure grows or shrinks
+ * respectively. ShmemStructOpts::size should be set to the initial size
+ * of the structure, which is the amount of memory allocated at startup.
+ * After startup, the structure can be resized by calling ShmemResizeStruct(),
+ * passing it the ShmemStructDesc for the structure and the new size.
+ *
+ * While resizable structures can be created after the startup, the memory
+ * available for them is quite limited.
+ *
* Legacy ShmemInitStruct()/ShmemInitHash() functions
* --------------------------------------------------
*
@@ -167,6 +182,18 @@ typedef struct
ShmemRequestKind kind;
} ShmemRequest;
+/*
+ * A convenience macro to compute the space required for a shmem request.
+ * A resizable structure, requested with a non-zero maximum_size, requires
+ * space for its maximum size.
+ */
+#ifdef HAVE_RESIZABLE_SHMEM
+#define SHMEM_REQUEST_SPACE_SIZE(request) \
+ ((request)->options->maximum_size > 0 ? (request)->options->maximum_size : (request)->options->size)
+#else
+#define SHMEM_REQUEST_SPACE_SIZE(request) ((request)->options->size)
+#endif
+
static List *pending_shmem_requests;
/*
@@ -269,6 +296,10 @@ typedef struct
void *location; /* location in shared mem */
Size size; /* # bytes requested for the structure */
Size allocated_size; /* # bytes actually allocated */
+#ifdef HAVE_RESIZABLE_SHMEM
+ Size maximum_size; /* the maximum size the structure can grow to */
+ Size reserved_space; /* the total address space reserved */
+#endif
} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
@@ -277,6 +308,9 @@ static bool firstNumaTouch = true;
static void CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks);
static void InitShmemIndexEntry(ShmemRequest *request);
static bool AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok);
+#ifdef HAVE_RESIZABLE_SHMEM
+static Size EstimateAllocatedSize(ShmemIndexEnt *entry);
+#endif
Datum pg_numa_available(PG_FUNCTION_ARGS);
@@ -347,6 +381,11 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
+#ifdef HAVE_RESIZABLE_SHMEM
+ if (options->maximum_size < 0 && options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
+ options->maximum_size, options->name);
+#endif
}
else
{
@@ -355,12 +394,28 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->size <= 0)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
+#ifdef HAVE_RESIZABLE_SHMEM
+ if (options->maximum_size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->maximum_size < 0)
+ elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
+ options->maximum_size, options->name);
+#endif
}
if (options->alignment != 0 && pg_nextpower2_size_t(options->alignment) != options->alignment)
elog(ERROR, "invalid alignment %zu for shared memory request for \"%s\"",
options->alignment, options->name);
+#ifdef HAVE_RESIZABLE_SHMEM
+ if (options->maximum_size > 0 && options->size > options->maximum_size)
+ elog(ERROR, "resizable shared memory structure \"%s\" must have maximum size (%zd) greater than or equal to size (%zd)",
+ options->name, options->maximum_size, options->size);
+
+ if (options->maximum_size > 0 && shared_memory_type != SHMEM_TYPE_MMAP)
+ elog(ERROR, "resizable shared memory requires shared_memory_type = mmap");
+#endif
+
/* Check that we're in the right state */
if (shmem_request_state != SRS_REQUESTING)
elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
@@ -382,8 +437,13 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
}
/*
- * ShmemGetRequestedSize() --- estimate the total size of all registered shared
- * memory structures.
+ * ShmemGetRequestedSize() --- estimate the total size of all registered shared
+ * memory structures.
+ *
+ * When maximum_size is specified for a resizable shared memory structure,
+ * we use that, instead of the (initial) size, for the estimation, to ensure
+ * that enough space is reserved for growing the resizable structure to its
+ * maximum size.
*
* This is called at postmaster startup, before the shared memory segment has
* been created.
@@ -408,7 +468,7 @@ ShmemGetRequestedSize(void)
alignment = PG_CACHE_LINE_SIZE;
size = TYPEALIGN(alignment, size);
- size = add_size(size, request->options->size);
+ size = add_size(size, SHMEM_REQUEST_SPACE_SIZE(request));
}
return size;
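To illustrate the estimation above, here is a minimal standalone sketch in C. It is not the patch's code: Req, request_space() and total_requested() are hypothetical names, and the 128-byte default alignment is only an assumed stand-in for PG_CACHE_LINE_SIZE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request descriptor; mirrors only the fields used here. */
typedef struct Req
{
	size_t		size;			/* initial size */
	size_t		maximum_size;	/* 0 for fixed-size structures */
	size_t		alignment;		/* 0 means "use the default" */
} Req;

#define DEFAULT_ALIGNMENT 128	/* assumption, stands in for PG_CACHE_LINE_SIZE */

/* Round val up to a multiple of align (align must be a power of two). */
static size_t
align_up(size_t val, size_t align)
{
	return (val + align - 1) & ~(align - 1);
}

/* A resizable request reserves its maximum size, a fixed one its size. */
static size_t
request_space(const Req *r)
{
	return r->maximum_size > 0 ? r->maximum_size : r->size;
}

/* Sum the address space needed by all requests, honoring alignment. */
static size_t
total_requested(const Req *reqs, size_t n)
{
	size_t		total = 0;

	for (size_t i = 0; i < n; i++)
	{
		size_t		align = reqs[i].alignment ? reqs[i].alignment : DEFAULT_ALIGNMENT;

		total = align_up(total, align);
		total += request_space(&reqs[i]);
	}
	return total;
}
```

With a 100-byte fixed structure followed by a resizable one (size 1000, maximum_size 10000), the second request starts at offset 128 and the total comes to 10128 bytes. The same pair with both structures fixed-size needs only 1128 bytes.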
@@ -515,6 +575,7 @@ InitShmemIndexEntry(ShmemRequest *request)
ShmemIndexEnt *index_entry;
bool found;
size_t allocated_size;
+ size_t requested_size;
void *structPtr;
/* look it up in the shmem index */
@@ -532,10 +593,18 @@ InitShmemIndexEntry(ShmemRequest *request)
}
/*
- * We inserted the entry to the shared memory index. Allocate requested
- * amount of shared memory for it, and initialize the index entry.
+ * We inserted the entry to the shared memory index. Allocate requested
+ * amount of address space in the shared memory segment for it, and do
+ * basic initialization. The memory gets allocated during initialization as
+ * the corresponding memory pages are written to. Allocate enough space
+ * for a resizable structure to grow to its maximum size. It is expected
+ * that the initialization callback will use only as much memory as the
+ * initial size of the resizable structure. (Well, if it doesn't, more
+ * memory will be allocated initially than expected, no further harm is
+ * done.)
*/
- structPtr = ShmemAllocRaw(request->options->size,
+ requested_size = SHMEM_REQUEST_SPACE_SIZE(request);
+ structPtr = ShmemAllocRaw(requested_size,
request->options->alignment,
&allocated_size);
if (structPtr == NULL)
@@ -544,13 +613,22 @@ InitShmemIndexEntry(ShmemRequest *request)
hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
+ errmsg("not enough shared memory space for data structure"
" \"%s\" (%zu bytes requested)",
- name, request->options->size)));
+ name, requested_size)));
}
index_entry->size = request->options->size;
index_entry->allocated_size = allocated_size;
index_entry->location = structPtr;
+#ifdef HAVE_RESIZABLE_SHMEM
+ index_entry->reserved_space = allocated_size;
+ index_entry->maximum_size = request->options->maximum_size;
+ if (request->options->maximum_size > 0)
+ {
+ /* Adjust allocated size of a resizable structure. */
+ index_entry->allocated_size = EstimateAllocatedSize(index_entry);
+ }
+#endif
/* Initialize depending on the kind of shmem area it is */
switch (request->kind)
@@ -595,7 +673,7 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
return false;
}
- /* Check that the size in the index matches the request */
+ /* Check that the sizes in the index match the request. */
if (index_entry->size != request->options->size &&
request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
{
@@ -605,6 +683,18 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
name, index_entry->size, request->options->size)));
}
+#ifdef HAVE_RESIZABLE_SHMEM
+ if (index_entry->maximum_size != request->options->maximum_size &&
+ request->options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" was created with" \
+ " different maximum_size: existing %zu, requested %zu",
+ name, index_entry->maximum_size,
+ request->options->maximum_size)));
+ }
+#endif
+
/*
* Re-establish the caller's pointer variable, or do other actions to
* attach depending on the kind of shmem area it is.
@@ -626,6 +716,115 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
return true;
}
+#ifdef HAVE_RESIZABLE_SHMEM
+/*
+ * Estimate the actual memory allocated for a resizable structure.
+ *
+ * ... based on the assumption that the memory is allocated in pages.
+ *
+ * The memory pages covered by the current size of a resizable structure are
+ * fully allocated when the currently allocated part of the structure is written
+ * to. The memory page where the maximal structure ends also hosts the next
+ * structure, unless the maximal structure ends on a page boundary. Hence that
+ * page is allocated when the next structure is written to. The memory pages
+ * between the page where the current structure ends and the page where the next
+ * structure starts remain unallocated. Thus the memory allocated for a
+ * resizable structure can be estimated as the total address space reserved for
+ * the structure minus the unallocated memory pages between the current end and
+ * the next structure.
+ */
+static Size
+EstimateAllocatedSize(ShmemIndexEnt *entry)
+{
+ Size page_size = GetOSPageSize();
+ char *align_end = (char *) TYPEALIGN(page_size, (char *) entry->location + entry->size);
+ char *floor_max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) entry->location + entry->maximum_size);
+
+ Assert(entry->maximum_size >= entry->size);
+ Assert(entry->reserved_space >= entry->maximum_size);
+
+ if (align_end < floor_max_end)
+ return entry->reserved_space - (floor_max_end - align_end);
+
+ return entry->reserved_space;
+}
+
+/*
+ * ShmemResizeStruct() --- resize a resizable shared memory structure.
+ *
+ * If the structure is being shrunk, the memory pages that are no longer needed
+ * are freed. If the structure is being expanded, the memory pages that are
+ * needed for the new size are allocated. See EstimateAllocatedSize() for
+ * explanation of which pages are allocated for a resizable structure.
+ */
+void
+ShmemResizeStruct(const char *name, Size new_size)
+{
+ ShmemIndexEnt *result;
+ bool found;
+ Size page_size = GetOSPageSize();
+ char *new_end;
+
+ Assert(new_size > 0);
+
+ /*
+ * Resizable shared memory structures are only supported with mmap'ed
+ * memory.
+ */
+ Assert(shared_memory_type == SHMEM_TYPE_MMAP);
+
+ /* look it up in the shmem index */
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ result = (ShmemIndexEnt *) hash_search(ShmemIndex, name, HASH_FIND, &found);
+ if (!found)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shmem struct \"%s\" is not initialized", name)));
+
+ Assert(result);
+
+ if (result->maximum_size <= 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shared memory struct \"%s\" is not resizable", name)));
+
+ if (result->maximum_size < new_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+ errmsg("not enough address space is reserved for resizing structure \"%s\""
+ " (required %zu bytes, reserved %zu bytes)",
+ name, new_size, result->maximum_size)));
+
+ /*
+ * When shrinking, the memory from the page-aligned new end to the start of
+ * the page containing the end of the maximum-size structure is no longer
+ * required. When expanding, the memory from the start of the page containing
+ * the start of the structure to the page-aligned new end is required.
+ */
+ new_end = (char *) TYPEALIGN(page_size, (char *) result->location + new_size);
+ if (new_size < result->size)
+ {
+ char *max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location + result->maximum_size);
+
+ if (max_end > new_end)
+ PGSharedMemoryEnsureFreed(new_end, max_end - new_end);
+ }
+ else if (new_size > result->size)
+ {
+ char *struct_start = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location);
+
+ if (new_end > struct_start)
+ PGSharedMemoryEnsureAllocated(struct_start, new_end - struct_start);
+ }
+
+ /* Update shmem index entry. */
+ result->size = new_size;
+ result->allocated_size = EstimateAllocatedSize(result);
+
+ LWLockRelease(ShmemIndexLock);
+}
+#endif /* HAVE_RESIZABLE_SHMEM */
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
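The page arithmetic behind EstimateAllocatedSize() may be easier to follow with concrete numbers. Below is a standalone sketch in C that works on plain integer offsets instead of a ShmemIndexEnt. The function and parameter names are hypothetical and it is only meant to mirror the calculation, not replace it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to a multiple of page (page must be a power of two). */
static uintptr_t
page_align_up(uintptr_t addr, size_t page)
{
	return (addr + page - 1) & ~((uintptr_t) page - 1);
}

/* Round addr down to a multiple of page (page must be a power of two). */
static uintptr_t
page_align_down(uintptr_t addr, size_t page)
{
	return addr & ~((uintptr_t) page - 1);
}

/*
 * Estimate the memory backing a resizable structure: the reserved space
 * minus the whole pages lying between the page-aligned current end and
 * the start of the page that contains the maximum end.
 */
static size_t
estimate_allocated(uintptr_t location, size_t size, size_t maximum_size,
				   size_t reserved_space, size_t page)
{
	uintptr_t	align_end = page_align_up(location + size, page);
	uintptr_t	floor_max_end = page_align_down(location + maximum_size, page);

	assert(maximum_size >= size);
	assert(reserved_space >= maximum_size);

	if (align_end < floor_max_end)
		return reserved_space - (size_t) (floor_max_end - align_end);

	return reserved_space;
}
```

For example, with 4096-byte pages, a structure at offset 1000 with size 5000 and maximum size 20000 ends its current part at 6000, which rounds up to 8192. The maximum end at 21000 rounds down to 20480. The three pages in between (12288 bytes) are not touched, so the estimate is 20000 - 12288 = 7712 bytes. When the current size equals the maximum size, or the whole structure fits within one page, the estimate equals the reserved space.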
@@ -732,6 +931,10 @@ InitShmemAllocator(PGShmemHeader *seghdr)
Assert(!found);
result->size = ShmemAllocator->index_size;
result->allocated_size = ShmemAllocator->index_size;
+#ifdef HAVE_RESIZABLE_SHMEM
+ result->maximum_size = 0;
+ result->reserved_space = result->allocated_size;
+#endif
result->location = ShmemAllocator->index;
}
}
@@ -1075,7 +1278,7 @@ mul_size(Size s1, Size s2)
Datum
pg_get_shmem_allocations(PG_FUNCTION_ARGS)
{
-#define PG_GET_SHMEM_SIZES_COLS 4
+#define PG_GET_SHMEM_SIZES_COLS 6
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
HASH_SEQ_STATUS hstat;
ShmemIndexEnt *ent;
@@ -1097,7 +1300,23 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
values[1] = Int64GetDatum((char *) ent->location - (char *) ShmemSegHdr);
values[2] = Int64GetDatum(ent->size);
values[3] = Int64GetDatum(ent->allocated_size);
+#ifdef HAVE_RESIZABLE_SHMEM
+ values[4] = Int64GetDatum(ent->maximum_size);
+ values[5] = Int64GetDatum(ent->reserved_space);
+
+ /*
+ * Keep track of the total reserved space for named shmem areas, to be
+ * able to calculate the amount of shared memory allocated for
+ * anonymous areas and the amount of free shared memory at the end of
+ * the segment.
+ */
+ named_allocated += ent->reserved_space;
+#else
+ values[4] = Int64GetDatum(0);
+ values[5] = Int64GetDatum(ent->allocated_size);
+
named_allocated += ent->allocated_size;
+#endif
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
values, nulls);
@@ -1108,6 +1327,8 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = true;
values[2] = Int64GetDatum(ShmemAllocator->free_offset - named_allocated);
values[3] = values[2];
+ values[4] = Int64GetDatum(0);
+ values[5] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
/* output as-of-yet unused shared memory */
@@ -1116,6 +1337,8 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = false;
values[2] = Int64GetDatum(ShmemSegHdr->totalsize - ShmemAllocator->free_offset);
values[3] = values[2];
+ values[4] = Int64GetDatum(0);
+ values[5] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
LWLockRelease(ShmemIndexLock);
@@ -1303,23 +1526,9 @@ pg_get_shmem_allocations_numa(PG_FUNCTION_ARGS)
Size
pg_get_shmem_pagesize(void)
{
- Size os_page_size;
-#ifdef WIN32
- SYSTEM_INFO sysinfo;
-
- GetSystemInfo(&sysinfo);
- os_page_size = sysinfo.dwPageSize;
-#else
- os_page_size = sysconf(_SC_PAGESIZE);
-#endif
-
Assert(IsUnderPostmaster);
- Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
-
- if (huge_pages_status == HUGE_PAGES_ON)
- GetHugePageSize(&os_page_size, NULL);
- return os_page_size;
+ return GetOSPageSize();
}
Datum
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 7a8a5d0764c..18cb2516d9a 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -1211,6 +1211,13 @@
max => '1000.0',
},
+{ name => 'have_resizable_shmem', type => 'bool', context => 'PGC_INTERNAL', group => 'PRESET_OPTIONS',
+ short_desc => 'Shows whether the running server supports resizable shared memory.',
+ flags => 'GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE',
+ variable => 'have_resizable_shmem_enabled',
+ boot_val => 'HAVE_RESIZABLE_SHMEM_ENABLED',
+},
+
{ name => 'hba_file', type => 'string', context => 'PGC_POSTMASTER', group => 'FILE_LOCATIONS',
short_desc => 'Sets the server\'s "hba" configuration file.',
flags => 'GUC_SUPERUSER_ONLY',
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d9ca13baff9..6bb08dd10f1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -653,6 +653,13 @@ static bool assert_enabled = DEFAULT_ASSERT_ENABLED;
#endif
static bool exec_backend_enabled = EXEC_BACKEND_ENABLED;
+#ifdef HAVE_RESIZABLE_SHMEM
+#define HAVE_RESIZABLE_SHMEM_ENABLED true
+#else
+#define HAVE_RESIZABLE_SHMEM_ENABLED false
+#endif
+static bool have_resizable_shmem_enabled = HAVE_RESIZABLE_SHMEM_ENABLED;
+
static char *recovery_target_timeline_string;
static char *recovery_target_string;
static char *recovery_target_xid_string;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 3ea17fc5629..32945d73f36 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8702,8 +8702,8 @@
{ oid => '5052', descr => 'allocations from the main shared memory segment',
proname => 'pg_get_shmem_allocations', prorows => '50', proretset => 't',
provolatile => 'v', prorettype => 'record', proargtypes => '',
- proallargtypes => '{text,int8,int8,int8}', proargmodes => '{o,o,o,o}',
- proargnames => '{name,off,size,allocated_size}',
+ proallargtypes => '{text,int8,int8,int8,int8,int8}', proargmodes => '{o,o,o,o,o,o}',
+ proargnames => '{name,off,size,allocated_size,maximum_size,reserved_space}',
prosrc => 'pg_get_shmem_allocations',
proacl => '{POSTGRES=X,pg_read_all_stats=X}' },
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 9f6d512347e..8f2a59ec3a8 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -85,6 +85,14 @@
don't. */
#undef HAVE_DECL_F_FULLFSYNC
+/* Define to 1 if you have the declaration of `MADV_POPULATE_WRITE', and to 0
+ if you don't. */
+#undef HAVE_DECL_MADV_POPULATE_WRITE
+
+/* Define to 1 if you have the declaration of `MADV_REMOVE', and to 0 if you
+ don't. */
+#undef HAVE_DECL_MADV_REMOVE
+
/* Define to 1 if you have the declaration of `memset_s', and to 0 if you
don't. */
#undef HAVE_DECL_MEMSET_S
diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index 521b49b8888..b09d6c91324 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -131,6 +131,15 @@
#define EXEC_BACKEND
#endif
+/*
+ * HAVE_RESIZABLE_SHMEM indicates whether resizable shared memory structures are
+ * supported. The implementation requires Linux-specific madvise constants
+ * (MADV_REMOVE and MADV_POPULATE_WRITE).
+ */
+#if HAVE_DECL_MADV_REMOVE && HAVE_DECL_MADV_POPULATE_WRITE && !defined(EXEC_BACKEND)
+#define HAVE_RESIZABLE_SHMEM
+#endif
+
/*
* USE_POSIX_FADVISE controls whether Postgres will attempt to use the
* posix_fadvise() kernel call. Usually the automatic configure tests are
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index 10c7b065861..3d5aceba59c 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -89,6 +89,11 @@ extern PGShmemHeader *PGSharedMemoryCreate(Size size,
PGShmemHeader **shim);
extern bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2);
extern void PGSharedMemoryDetach(void);
+#ifdef HAVE_RESIZABLE_SHMEM
+extern void PGSharedMemoryEnsureFreed(void *addr, Size size);
+extern void PGSharedMemoryEnsureAllocated(void *addr, Size size);
+#endif
extern void GetHugePageSize(Size *hugepagesize, int *mmap_flags);
+extern Size GetOSPageSize(void);
#endif /* PG_SHMEM_H */
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index af7fe893bc4..f356027e500 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -57,6 +57,19 @@ typedef struct ShmemStructOpts
*/
size_t alignment;
+#ifdef HAVE_RESIZABLE_SHMEM
+
+ /*
+ * Maximum size this structure can grow up to in the future. The memory is not
+ * allocated right away but the corresponding address space is reserved so
+ * that memory can be mapped to it when the structure grows. Typically
+ * should be used for large resizable structures which need several pages
+ * worth of contiguous memory. Should be set to 0 for fixed-size
+ * structures.
+ */
+ ssize_t maximum_size;
+#endif
+
/*
* When the shmem area is initialized or attached to, pointer to it is
* stored in *ptr. It usually points to a global variable, used to access
@@ -168,6 +181,9 @@ typedef struct ShmemCallbacks
extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
extern bool ShmemAddrIsValid(const void *addr);
+#ifdef HAVE_RESIZABLE_SHMEM
+extern void ShmemResizeStruct(const char *name, Size new_size);
+#endif
/*
* These macros provide syntactic sugar for calling the underlying functions
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index f1b04c99969..2a1e746bf0c 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -14,6 +14,7 @@ SUBDIRS = \
libpq_pipeline \
oauth_validator \
plsample \
+ resizable_shmem \
spgist_name_ops \
test_aio \
test_binaryheap \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index fc99552d9ab..cd94e1fea15 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -13,6 +13,7 @@ subdir('libpq_pipeline')
subdir('nbtree')
subdir('oauth_validator')
subdir('plsample')
+subdir('resizable_shmem')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
diff --git a/src/test/modules/resizable_shmem/Makefile b/src/test/modules/resizable_shmem/Makefile
new file mode 100644
index 00000000000..86bf17bef4a
--- /dev/null
+++ b/src/test/modules/resizable_shmem/Makefile
@@ -0,0 +1,25 @@
+# src/test/modules/resizable_shmem/Makefile
+
+PGFILEDESC = "resizable_shmem - test module for resizable shared memory"
+
+MODULES = resizable_shmem
+
+EXTENSION = resizable_shmem
+DATA = resizable_shmem--1.0.sql
+
+TAP_TESTS = 1
+
+# This test requires library to be loaded at the server start, so disable
+# installcheck
+NO_INSTALLCHECK = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/resizable_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/resizable_shmem/meson.build b/src/test/modules/resizable_shmem/meson.build
new file mode 100644
index 00000000000..493bbbc95c3
--- /dev/null
+++ b/src/test/modules/resizable_shmem/meson.build
@@ -0,0 +1,36 @@
+# src/test/modules/resizable_shmem/meson.build
+
+resizable_shmem_sources = files(
+ 'resizable_shmem.c',
+)
+
+if host_system == 'windows'
+ resizable_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'resizable_shmem',
+ '--FILEDESC', 'resizable_shmem - test module for resizable shared memory',])
+endif
+
+resizable_shmem = shared_module('resizable_shmem',
+ resizable_shmem_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += resizable_shmem
+
+test_install_data += files(
+ 'resizable_shmem.control',
+ 'resizable_shmem--1.0.sql',
+)
+
+tests += {
+ 'name': 'resizable_shmem',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_resizable_shmem.pl',
+ ],
+ # This test requires library to be loaded at the server start, so disable
+ # installcheck
+ 'runningcheck': false,
+ },
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
new file mode 100644
index 00000000000..c1bcb6117b6
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
@@ -0,0 +1,37 @@
+/* src/test/modules/resizable_shmem/resizable_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION resizable_shmem" to load this file. \quit
+
+-- Function to resize the test structure in the shared memory
+CREATE FUNCTION resizable_shmem_resize(new_entries integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to write data to all entries in the test structure in shared memory
+-- Writing all the entries makes sure that the memory is actually allocated and
+-- mapped to the process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_write(entry_value integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to verify that specified number of initial entries have expected value.
+-- Reading all the entries makes sure that the memory is actually mapped to the
+-- process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_read(entry_count integer, entry_value integer)
+RETURNS boolean
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to report memory usage statistics of the calling backend
+CREATE FUNCTION resizable_shmem_usage(OUT rss_anon bigint, OUT rss_file bigint, OUT rss_shmem bigint, OUT vm_size bigint)
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to get the shared memory page size
+CREATE FUNCTION resizable_shmem_pagesize()
+RETURNS integer
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.c b/src/test/modules/resizable_shmem/resizable_shmem.c
new file mode 100644
index 00000000000..5ae2d2e2d1d
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.c
@@ -0,0 +1,326 @@
+/* -------------------------------------------------------------------------
+ *
+ * resizable_shmem.c
+ * Test module for PostgreSQL's resizable shared memory functionality
+ *
+ * This module demonstrates and tests the resizable shared memory API
+ * provided by shmem.c/shmem.h.
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "commands/extension.h"
+#include "fmgr.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+#include "storage/spin.h"
+#include "utils/builtins.h"
+#include "utils/guc.h"
+#include "utils/memutils.h"
+#include "utils/timestamp.h"
+#include "access/htup_details.h"
+
+#include <stdio.h>
+
+PG_MODULE_MAGIC;
+
+/* Default values for the GUCs controlling structure size */
+#define TEST_INITIAL_ENTRIES_DEFAULT (25 * 1024 * 1024) /* ~100MB */
+#define TEST_MAX_ENTRIES_DEFAULT (100 * 1024 * 1024) /* ~400MB */
+
+#define TEST_ENTRY_SIZE sizeof(int32) /* Size of each entry */
+
+/*
+ * Resizable test data structure stored in shared memory.
+ *
+ * The test performs resizing, reads or writes, only one at a time and never
+ * concurrently. Hence, there is no need for locks in the test structure.
+ */
+typedef struct TestResizableShmemStruct
+{
+ /* Metadata */
+ int32 num_entries; /* Number of entries that can fit */
+
+ /* Data area - variable size */
+ int32 data[FLEXIBLE_ARRAY_MEMBER];
+} TestResizableShmemStruct;
+
+static TestResizableShmemStruct *resizable_shmem = NULL;
+
+/* GUC variables controlling the size of the test structure */
+static int test_initial_entries;
+static int test_max_entries;
+
+/* Whether to use SHMEM_ATTACH_UNKNOWN_SIZE when attaching to the shared memory */
+static bool use_unknown_size = false;
+
+static void resizable_shmem_request(void *arg);
+static void resizable_shmem_shmem_init(void *arg);
+
+static ShmemCallbacks shmem_callbacks = {
+ .request_fn = resizable_shmem_request,
+ .init_fn = resizable_shmem_shmem_init,
+};
+
+/* SQL-callable functions */
+PG_FUNCTION_INFO_V1(resizable_shmem_resize);
+PG_FUNCTION_INFO_V1(resizable_shmem_write);
+PG_FUNCTION_INFO_V1(resizable_shmem_read);
+PG_FUNCTION_INFO_V1(resizable_shmem_usage);
+PG_FUNCTION_INFO_V1(resizable_shmem_pagesize);
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+ int guc_context;
+
+ /*
+ * Use PGC_POSTMASTER when loaded at startup so the values are fixed once
+ * the shared memory segment is created. When loaded after startup
+ * PGC_POSTMASTER is not allowed, so we use PGC_SIGHUP instead. Although
+ * we do not intend to change these values at config reload, PGC_SIGHUP is
+ * the least permissive context that allows defining the GUC after startup
+ * and still prevents it from being changed via SET.
+ */
+ if (process_shared_preload_libraries_in_progress)
+ guc_context = PGC_POSTMASTER;
+ else
+ {
+ guc_context = PGC_SIGHUP;
+ shmem_callbacks.flags = SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP;
+ }
+
+ DefineCustomIntVariable("resizable_shmem.initial_entries",
+ "Initial number of entries in the test structure.",
+ NULL,
+ &test_initial_entries,
+ TEST_INITIAL_ENTRIES_DEFAULT,
+ 1,
+ INT_MAX,
+ guc_context,
+ 0,
+ NULL, NULL, NULL);
+
+ DefineCustomIntVariable("resizable_shmem.max_entries",
+ "Maximum number of entries in the test structure.",
+ NULL,
+ &test_max_entries,
+ TEST_MAX_ENTRIES_DEFAULT,
+ 1,
+ INT_MAX,
+ guc_context,
+ 0,
+ NULL, NULL, NULL);
+
+ /*
+ * When loaded after startup by a backend that is not creating the
+ * extension, the shared memory might have been resized to a size other
+ * than the initial size. Use SHMEM_ATTACH_UNKNOWN_SIZE to attach without
+ * knowing the exact size.
+ */
+ if (!process_shared_preload_libraries_in_progress && !creating_extension)
+ use_unknown_size = true;
+
+ RegisterShmemCallbacks(&shmem_callbacks);
+}
+
+/*
+ * Request shared memory resources
+ */
+static void
+resizable_shmem_request(void *arg)
+{
+ Size initial_size = add_size(offsetof(TestResizableShmemStruct, data),
+ mul_size(test_initial_entries, TEST_ENTRY_SIZE));
+#ifdef HAVE_RESIZABLE_SHMEM
+ Size max_size = add_size(offsetof(TestResizableShmemStruct, data),
+ mul_size(test_max_entries, TEST_ENTRY_SIZE));
+
+ /* A preprocessor macro to conditionally include the maximum_size field. */
+#define MAXIMUM_SIZE_ARG .maximum_size = max_size,
+#else
+#define MAXIMUM_SIZE_ARG
+#endif
+
+ /* Register our resizable shared memory structure */
+ ShmemRequestStruct(.name = "resizable_shmem",
+ .size = use_unknown_size ? SHMEM_ATTACH_UNKNOWN_SIZE : initial_size,
+ MAXIMUM_SIZE_ARG
+ .ptr = (void **) &resizable_shmem,
+ );
+}
+
+/*
+ * Initialize shared memory structure
+ */
+static void
+resizable_shmem_shmem_init(void *arg)
+{
+ /*
+ * Shared memory structure should have been already allocated. Initialize
+ * it.
+ */
+ Assert(resizable_shmem != NULL);
+
+ resizable_shmem->num_entries = test_initial_entries;
+ memset(resizable_shmem->data, 0, mul_size(test_initial_entries, TEST_ENTRY_SIZE));
+}
+
+/*
+ * Resize the shared memory structure to accommodate the specified number of
+ * entries.
+ */
+Datum
+resizable_shmem_resize(PG_FUNCTION_ARGS)
+{
+#ifdef HAVE_RESIZABLE_SHMEM
+ int32 new_entries = PG_GETARG_INT32(0);
+ Size new_size;
+
+ if (!resizable_shmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ new_size = add_size(offsetof(TestResizableShmemStruct, data),
+ mul_size(new_entries, TEST_ENTRY_SIZE));
+ ShmemResizeStruct("resizable_shmem", new_size);
+ resizable_shmem->num_entries = new_entries;
+
+ PG_RETURN_VOID();
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#endif
+}
+
+/*
+ * Write the given integer value to all entries in the data array.
+ */
+Datum
+resizable_shmem_write(PG_FUNCTION_ARGS)
+{
+ int32 entry_value = PG_GETARG_INT32(0);
+ int32 i;
+
+ if (!resizable_shmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ /* Write the value to all current entries */
+ for (i = 0; i < resizable_shmem->num_entries; i++)
+ resizable_shmem->data[i] = entry_value;
+
+ PG_RETURN_VOID();
+}
+
+/*
+ * Check whether the first 'entry_count' entries all have the expected 'entry_value'.
+ * Returns true if all match, false otherwise.
+ */
+Datum
+resizable_shmem_read(PG_FUNCTION_ARGS)
+{
+ int32 entry_count = PG_GETARG_INT32(0);
+ int32 entry_value = PG_GETARG_INT32(1);
+ int32 i;
+
+ if (resizable_shmem == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ if (entry_count < 0 || entry_count > resizable_shmem->num_entries)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("entry_count %d is out of range (0..%d)", entry_count, resizable_shmem->num_entries)));
+
+ for (i = 0; i < entry_count; i++)
+ {
+ if (resizable_shmem->data[i] != entry_value)
+ PG_RETURN_BOOL(false);
+ }
+
+ PG_RETURN_BOOL(true);
+}
+
+/*
+ * Report multiple memory usage statistics of the calling backend process
+ * as reported by the kernel.
+ * Returns RssAnon, RssFile, RssShmem, VmSize from /proc/self/status as a record.
+ *
+ * The function assumes that these values are available in /proc/self/status
+ * on any system which also supports madvise() with MADV_REMOVE and
+ * MADV_POPULATE_WRITE.
+ */
+Datum
+resizable_shmem_usage(PG_FUNCTION_ARGS)
+{
+ FILE *f;
+ char line[256];
+ int64 rss_anon_kb = -1;
+ int64 rss_file_kb = -1;
+ int64 rss_shmem_kb = -1;
+ int64 vm_size_kb = -1;
+ int found = 0;
+ TupleDesc tupdesc;
+ Datum values[4];
+ bool nulls[4];
+ HeapTuple tuple;
+
+ /* Open /proc/self/status to read memory information */
+ f = fopen("/proc/self/status", "r");
+ if (f == NULL)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open /proc/self/status: %m")));
+
+ /* Look for the memory usage lines */
+ while (fgets(line, sizeof(line), f) != NULL && found < 4)
+ {
+ if (rss_anon_kb == -1 && sscanf(line, "RssAnon: %ld kB", &rss_anon_kb) == 1)
+ found++;
+ else if (rss_file_kb == -1 && sscanf(line, "RssFile: %ld kB", &rss_file_kb) == 1)
+ found++;
+ else if (rss_shmem_kb == -1 && sscanf(line, "RssShmem: %ld kB", &rss_shmem_kb) == 1)
+ found++;
+ else if (vm_size_kb == -1 && sscanf(line, "VmSize: %ld kB", &vm_size_kb) == 1)
+ found++;
+ }
+
+ fclose(f);
+
+ /* Build tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept a record")));
+
+ /* Build the result tuple */
+ values[0] = Int64GetDatum(rss_anon_kb >= 0 ? rss_anon_kb * 1024 : 0);
+ values[1] = Int64GetDatum(rss_file_kb >= 0 ? rss_file_kb * 1024 : 0);
+ values[2] = Int64GetDatum(rss_shmem_kb >= 0 ? rss_shmem_kb * 1024 : 0);
+ values[3] = Int64GetDatum(vm_size_kb >= 0 ? vm_size_kb * 1024 : 0);
+
+ nulls[0] = nulls[1] = nulls[2] = nulls[3] = false;
+
+ tuple = heap_form_tuple(tupdesc, values, nulls);
+ PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
+}
+
+/*
+ * resizable_shmem_pagesize() - Get the shared memory page size
+ */
+Datum
+resizable_shmem_pagesize(PG_FUNCTION_ARGS)
+{
+ PG_RETURN_INT32(pg_get_shmem_pagesize());
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.control b/src/test/modules/resizable_shmem/resizable_shmem.control
new file mode 100644
index 00000000000..8031303fe0e
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.control
@@ -0,0 +1,4 @@
+# resizable_shmem extension test module
+comment = 'test module for testing resizable shared memory structure functionality'
+default_version = '1.0'
+module_pathname = '$libdir/resizable_shmem'
diff --git a/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
new file mode 100644
index 00000000000..6d45b1eccdc
--- /dev/null
+++ b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
@@ -0,0 +1,239 @@
+# Copyright (c) 2025-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Test resizable shared memory functionality, both when loaded at startup via
+# shared_preload_libraries and when loaded after startup (late allocation).
+
+# Verify that RssShmem does not exceed the total allocated shared memory.
+# Allocated shared memory should be mostly the memory allocated to the
+# resizable_shmem structure, so any large unexpected increase in RssShmem
+# should reflect an unexpected increase in the memory allocated to that
+# structure.
+sub check_shmem_usage
+{
+ my ($session, $label, $node) = @_;
+
+ my $rss_shmem = $session->query_safe('SELECT rss_shmem FROM resizable_shmem_usage();',
+ verbose => 0);
+ my $total_alloc = $node->safe_psql('postgres',
+ "SELECT sum(allocated_size) FROM pg_shmem_allocations;");
+
+ note "$label: RssShmem=$rss_shmem, sum(allocated_size)=$total_alloc";
+ ok($rss_shmem <= $total_alloc, "$label: RssShmem does not exceed total allocated size");
+}
+
+# Test a resize operation: resize, verify old data, write new data, verify
+# new data, and check shmem usage. Returns updated ($num_entries, $value).
+sub test_resize
+{
+ my ($node, $prefix, $old_num_entries, $old_value, $new_num_entries, $new_value, $label) = @_;
+
+ $label = "$prefix: $label";
+
+ my $session1 = $node->background_psql('postgres');
+ my $session2 = $node->background_psql('postgres');
+
+ $session1->query_safe("SELECT resizable_shmem_resize($new_num_entries);",
+ verbose => 0);
+
+ # Old data should still be intact in the (possibly smaller) area
+ my $readable_entries = ($new_num_entries < $old_num_entries) ? $new_num_entries : $old_num_entries;
+ is($session1->query_safe("SELECT resizable_shmem_read($readable_entries, $old_value);",
+ verbose => 0),
+ 't', "old data readable after $label");
+
+ $session2->query_safe("SELECT resizable_shmem_write($new_value);",
+ verbose => 0);
+ is($session1->query_safe("SELECT resizable_shmem_read($new_num_entries, $new_value);",
+ verbose => 0),
+ 't', "new data readable after $label");
+
+ check_shmem_usage($session1, "$label (session 1)", $node);
+ check_shmem_usage($session2, "$label (session 2)", $node);
+
+ $session1->quit;
+ $session2->quit;
+
+ return ($new_num_entries, $new_value);
+}
+
+# Run the full suite of resizable shared memory tests on the given node.
+sub run_resizable_tests
+{
+ my ($node, $initial_entries, $max_entries, $prefix) = @_;
+
+ my $have_resizable_shmem = $node->safe_psql('postgres', 'SHOW have_resizable_shmem;') eq 'on';
+
+ my $num_entries = $initial_entries;
+
+ # Basic read/write should work on all platforms
+ my $value = 100;
+ $node->safe_psql('postgres', "SELECT resizable_shmem_write($value);");
+ is($node->safe_psql('postgres', "SELECT resizable_shmem_read($num_entries, $value);"),
+ 't', "$prefix: data read after write successful");
+
+ if ($have_resizable_shmem)
+ {
+ # Initial structure state
+ my $session1 = $node->background_psql('postgres');
+ my $session2 = $node->background_psql('postgres');
+
+ $value = 100;
+ # Write and read the initial set of entries.
+ $session1->query_safe("SELECT resizable_shmem_write($value);", verbose => 0);
+ is($session2->query_safe("SELECT resizable_shmem_read($num_entries, $value);",
+ verbose => 0),
+ 't', "$prefix: data read after write successful");
+ check_shmem_usage($session1, "$prefix: initial write (session 1)", $node);
+ check_shmem_usage($session2, "$prefix: initial write (session 2)", $node);
+ $session1->quit;
+ $session2->quit;
+
+ # Verify no other structure is resizable
+ is($node->safe_psql('postgres', "SELECT count(*) FROM pg_shmem_allocations WHERE name <> 'resizable_shmem' AND maximum_size <> 0;"),
+ '0', "$prefix: no other resizable structures");
+
+ # Resize to maximum
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $max_entries, 500, 'resize to maximum');
+
+ # Shrink to 75% of max
+ my $shrink_entries = int($max_entries * 3 / 4);
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $shrink_entries, 999, 'shrinking');
+
+ # Resize to the same size (no-op)
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $num_entries, 1999, 'no-op resize');
+
+ # Test resize failure (attempt to resize beyond max - should fail)
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', "SELECT resizable_shmem_resize(" . ($max_entries * 2) . ");");
+ ok($ret != 0 || $stderr =~ /ERROR/, "$prefix: Resize beyond maximum fails");
+ }
+ else
+ {
+ # On unsupported platforms, resizing should fail with a clear error
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', "SELECT resizable_shmem_resize($num_entries);");
+ ok($ret != 0, "$prefix: resize fails on unsupported platform");
+ like($stderr, qr/not supported/, "$prefix: resize error mentions not supported");
+ }
+}
+
+### Set up a test node.
+#
+# Configure minimal shared memory so that the resizable_shmem structure dominates
+# and any unexpected increase is easy to detect.
+#
+# Also disable huge pages so that RssShmem and allocated_size are comparable.
+# The latter is already aligned to the default page size.
+###
+my $node = PostgreSQL::Test::Cluster->new('resizable_shmem');
+$node->init;
+
+$node->append_conf('postgresql.conf', 'huge_pages = off');
+$node->append_conf('postgresql.conf', 'shared_buffers = 128kB');
+$node->append_conf('postgresql.conf', 'max_connections = 5');
+$node->append_conf('postgresql.conf', 'max_worker_processes = 0');
+$node->append_conf('postgresql.conf', 'max_wal_senders = 0');
+$node->append_conf('postgresql.conf', 'max_prepared_transactions = 0');
+$node->append_conf('postgresql.conf', 'max_locks_per_transaction = 10');
+$node->append_conf('postgresql.conf', 'max_pred_locks_per_transaction = 10');
+$node->append_conf('postgresql.conf', 'wal_buffers = 32kB');
+
+###
+# Test 1: Startup allocation via shared_preload_libraries
+###
+my $startup_initial = 25 * 1024 * 1024;
+my $startup_max = 100 * 1024 * 1024;
+
+$node->append_conf('postgresql.conf', 'shared_preload_libraries = resizable_shmem');
+$node->append_conf('postgresql.conf', "resizable_shmem.initial_entries = $startup_initial");
+$node->append_conf('postgresql.conf', "resizable_shmem.max_entries = $startup_max");
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+run_resizable_tests($node, $startup_initial, $startup_max, 'startup');
+
+my $have_resizable_shmem = $node->safe_psql('postgres', 'SHOW have_resizable_shmem;') eq 'on';
+
+###
+# Test 2: Late allocation (loaded after startup, not in shared_preload_libraries).
+# Use much smaller sizes since only ~100KB of shared memory is available for
+# structures allocated after startup.
+###
+my $late_initial = 5 * 1024;
+my $late_max = 12 * 1024;
+
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM RESET shared_preload_libraries;
+ ALTER SYSTEM SET resizable_shmem.initial_entries = $late_initial;
+ ALTER SYSTEM SET resizable_shmem.max_entries = $late_max;
+});
+$node->safe_psql('postgres', 'DROP EXTENSION resizable_shmem;');
+$node->restart;
+
+$node->safe_psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+run_resizable_tests($node, $late_initial, $late_max, 'late');
+
+###
+# Test that sysv shared memory does not support resizable shmem. Only relevant on
+# platforms that support resizable shmem (HAVE_RESIZABLE_SHMEM), since the
+# module only sets maximum_size in that case.
+###
+if ($have_resizable_shmem)
+{
+ ###
+ # Test 3: Verify that CREATE EXTENSION fails with sysv shared memory
+ # when loaded after startup (not in shared_preload_libraries).
+ ###
+ $node->safe_psql('postgres', 'DROP EXTENSION resizable_shmem;');
+
+ # Remove settings that would cause the library to auto-load at startup:
+ # shared_preload_libraries and module-prefixed GUCs. ALTER SYSTEM RESET
+ # only affects postgresql.auto.conf, so we must use adjust_conf to remove
+ # them from postgresql.conf.
+ $node->adjust_conf('postgresql.conf', 'shared_preload_libraries', undef);
+ $node->adjust_conf('postgresql.conf', 'resizable_shmem.initial_entries', undef);
+ $node->adjust_conf('postgresql.conf', 'resizable_shmem.max_entries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'shared_preload_libraries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'resizable_shmem.initial_entries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'resizable_shmem.max_entries', undef);
+ $node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET shared_memory_type = 'sysv';
+ });
+
+ $node->restart;
+
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+ ok($ret != 0, 'CREATE EXTENSION fails with resizable shmem on sysv');
+ like($stderr, qr/resizable shared memory requires shared_memory_type = mmap/,
+ 'CREATE EXTENSION error mentions shared_memory_type = mmap requirement');
+
+ ###
+ # Test 4: Verify that resizable structures are also rejected with sysv
+ # shared memory when loaded at startup via shared_preload_libraries.
+ ###
+ $node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET shared_preload_libraries = 'resizable_shmem';
+ ALTER SYSTEM SET resizable_shmem.initial_entries = $startup_initial;
+ ALTER SYSTEM SET resizable_shmem.max_entries = $startup_max;
+ });
+ $node->stop;
+
+ ok(!$node->start(fail_ok => 1),
+ 'server fails to start with resizable shmem on sysv');
+
+ my $log = slurp_file($node->logfile);
+ like($log, qr/resizable shared memory requires shared_memory_type = mmap/,
+ 'log mentions shared_memory_type = mmap requirement');
+}
+
+done_testing();
diff --git a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
index c154f57682a..c89b140871f 100644
--- a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
+++ b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
@@ -45,5 +45,28 @@ else
ok($attach_count1 == 0 && $attach_count2 == 0, "attach callback is not called when loaded via shared_preload_libraries");
}
+###
+# Test that a fixed-size shared memory structure cannot be resized.
+# Only relevant on platforms that support resizable shmem.
+###
+my $have_resizable_shmem =
+ $node->safe_psql('postgres', 'SHOW have_resizable_shmem;') eq 'on';
+
+if ($have_resizable_shmem)
+{
+ # Try expanding the fixed-size structure
+ my ($ret, $stdout, $stderr) =
+ $node->psql("postgres", "SELECT test_shmem_resize_fixed(1000);");
+ isnt($ret, 0, "expanding a fixed-size structure fails");
+ like($stderr, qr/is not resizable/, "expand error message mentions not resizable");
+
+ # Try shrinking the fixed-size structure
+ ($ret, $stdout, $stderr) =
+ $node->psql("postgres", "SELECT test_shmem_resize_fixed(1);");
+ isnt($ret, 0, "shrinking a fixed-size structure fails");
+ like($stderr, qr/is not resizable/, "shrink error message mentions not resizable");
+}
+
$node->stop;
+
done_testing();
diff --git a/src/test/modules/test_shmem/test_shmem--1.0.sql b/src/test/modules/test_shmem/test_shmem--1.0.sql
index 2d01fd9256c..e169d0d7733 100644
--- a/src/test/modules/test_shmem/test_shmem--1.0.sql
+++ b/src/test/modules/test_shmem/test_shmem--1.0.sql
@@ -7,3 +7,7 @@
CREATE FUNCTION get_test_shmem_attach_count()
RETURNS pg_catalog.int4 STRICT
AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION test_shmem_resize_fixed(pg_catalog.int4)
+RETURNS pg_catalog.void STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
index 9bd4012b435..fc2fd67887f 100644
--- a/src/test/modules/test_shmem/test_shmem.c
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -99,3 +99,23 @@ get_test_shmem_attach_count(PG_FUNCTION_ARGS)
elog(ERROR, "shmem area not yet initialized");
PG_RETURN_INT32(TestShmem->attach_count);
}
+
+/*
+ * Attempt to resize the fixed-size shared memory structure. This should
+ * fail because the structure was not allocated with a maximum_size.
+ */
+PG_FUNCTION_INFO_V1(test_shmem_resize_fixed);
+Datum
+test_shmem_resize_fixed(PG_FUNCTION_ARGS)
+{
+#ifdef HAVE_RESIZABLE_SHMEM
+ int32 new_size = PG_GETARG_INT32(0);
+
+ ShmemResizeStruct("test_shmem area", new_size);
+ PG_RETURN_VOID();
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#endif
+}
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 81a73c426d2..2bbbf48c96a 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1770,8 +1770,10 @@ pg_shadow| SELECT pg_authid.rolname AS usename,
pg_shmem_allocations| SELECT name,
off,
size,
- allocated_size
- FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size);
+ allocated_size,
+ maximum_size,
+ reserved_space
+ FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size, maximum_size, reserved_space);
pg_shmem_allocations_numa| SELECT name,
numa_node,
size
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e9430e07b36..c079bab4cf0 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3147,6 +3147,7 @@ TestDSMRegistryHashEntry
TestDSMRegistryStruct
TestDecodingData
TestDecodingTxnData
+TestResizableShmemStruct
TestShmemData
TestSpec
TestValueType
base-commit: ed71d7356e3b394f579db87782a41e3d5dfb99ad
--
2.34.1
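A side note on the /proc/self/status parsing in resizable_shmem_usage() above: the patch scans into int64 variables with "%ld", which matches only where int64 is a long. The following is a minimal standalone sketch, not code from the patch, of the same line matching using the standard SCNd64 macro instead. The helper name sketch_parse_kb and its signature are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: match a line of the form "<key><whitespace><number> kB"
 * and store the number in *kb. "key" includes the trailing colon, for example
 * "RssShmem:". Returns 1 on a match, 0 otherwise.
 */
int
sketch_parse_kb(const char *line, const char *key, int64_t *kb)
{
	size_t		len;

	assert(line != NULL && key != NULL && kb != NULL);

	/* The line must start with the key. */
	len = strlen(key);
	if (strncmp(line, key, len) != 0)
		return 0;

	/* SCNd64 is the scanf conversion for int64_t; it skips leading blanks. */
	return sscanf(line + len, "%" SCNd64, kb) == 1;
}
```

Called once per line read by fgets(), this would replace each sscanf() branch in the loop; the trailing " kB" is simply left unparsed.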
[text/x-patch] v20260406-0005-Add-minimum_size-specification-for-resizab.patch (19.3K, 3-v20260406-0005-Add-minimum_size-specification-for-resizab.patch)
From 6923334bec327bb9cc089d7962bebc73df0afef6 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Mon, 6 Apr 2026 16:46:33 +0530
Subject: [PATCH v20260406 5/6] Add minimum_size specification for resizable
shared memory structures
Optional minimum size specification for resizable shared memory structures
allows us to enforce that a resizable structure cannot be shrunk below a certain
size. For resizable structures, the minimum size should be less than or equal
to the initial size specified in ShmemRequestStructOpts::size. If not specified,
the minimum size defaults to 0 for resizable structures. For fixed-size
structures, the minimum size and maximum size are set to the initial size
specified in ShmemRequestStructOpts::size.
This makes the maximum and minimum sizes reported in the pg_shmem_allocations
view consistent for both fixed-size and resizable structures.
Author: Ashutosh Bapat <[email protected]>
---
doc/src/sgml/system-views.sgml | 16 ++-
doc/src/sgml/xfunc.sgml | 8 +-
src/backend/storage/ipc/shmem.c | 110 ++++++++++++++----
src/include/catalog/pg_proc.dat | 4 +-
src/include/storage/shmem.h | 6 +
.../modules/resizable_shmem/resizable_shmem.c | 3 +
.../resizable_shmem/t/001_resizable_shmem.pl | 2 +-
.../test_shmem/t/001_late_shmem_alloc.pl | 8 ++
src/test/regress/expected/rules.out | 3 +-
9 files changed, 129 insertions(+), 31 deletions(-)
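The bounds rules described in the commit message can be sketched in standalone C as follows. This is an illustration under stated assumptions, not the code in shmem.c: the SketchBounds type and both function names are hypothetical, and the sketch ignores SHMEM_ATTACH_UNKNOWN_SIZE and page alignment entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-structure bounds kept in the shmem index. */
typedef struct
{
	size_t		size;			/* current size */
	size_t		minimum_size;	/* smallest size a resize may request */
	size_t		maximum_size;	/* largest size a resize may request */
} SketchBounds;

/*
 * Fill in the bounds as the commit message describes. A request with
 * max == 0 is treated as fixed-size, so both bounds collapse to size.
 * Returns false if the request is inconsistent.
 */
bool
sketch_init_bounds(SketchBounds *b, size_t size, size_t min, size_t max)
{
	assert(b != NULL);

	if (size == 0)
		return false;

	if (max == 0)
	{
		/* Fixed-size structure: minimum = maximum = size. */
		b->size = b->minimum_size = b->maximum_size = size;
		return true;
	}

	/* Resizable structure: require min <= size <= max. */
	if (min > size || size > max)
		return false;

	b->size = size;
	b->minimum_size = min;
	b->maximum_size = max;
	return true;
}

/* A resize is allowed only for resizable structures and only within bounds. */
bool
sketch_can_resize(const SketchBounds *b, size_t new_size)
{
	assert(b != NULL);

	if (b->minimum_size == b->maximum_size)
		return false;			/* fixed-size, not resizable */

	return new_size >= b->minimum_size && new_size <= b->maximum_size;
}
```

With this representation a fixed-size structure is recognised by minimum_size == maximum_size rather than by maximum_size == 0, which is the same test the updated TAP query uses (maximum_size <> minimum_size).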
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9717f8434bb..9bbbfdb37c5 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4250,13 +4250,25 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>minimum_size</structfield> <type>int8</type>
+ </para>
+ <para>
+ Minimum size in bytes that the resizable allocation can shrink to. Equals
+ <structfield>size</structfield> for fixed-size allocations, anonymous
+ allocations, and free memory.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>maximum_size</structfield> <type>int8</type>
</para>
<para>
- Maximum size in bytes that the resizable allocation can grow to. Zero for
- fixed-size allocations, for anonymous allocations, and for free memory.
+ Maximum size in bytes that the resizable allocation can grow to. Equals
+ <structfield>size</structfield> for fixed-size allocations, anonymous
+ allocations, and free memory.
</para></entry>
</row>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 3d25139c334..a6c7b8b1b22 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3753,7 +3753,9 @@ my_shmem_init(void *arg)
<parameter>.maximum_size</parameter> along with
<parameter>.size</parameter>. <parameter>.maximum_size</parameter> is
maximum size upto which the structure can grow where as
- <parameter>.size</parameter> is the initial size of the structure. While
+ <parameter>.size</parameter> is the initial size of the structure.
+ Optionally, <parameter>.minimum_size</parameter> can be set to the minimum
+ size that the structure can shrink to. While
contiguous address space worth <parameter>maximum_size</parameter> is
allocated to the structure, only memory worth <parameter>size</parameter>
bytes is allocated initially. The <function>init_fn</function> should only
@@ -3767,8 +3769,8 @@ my_shmem_init(void *arg)
<para>
The structure can be resized using <function>ShmemResizeStruct</function> by
passing it the structure's <structname>ShmemStructDesc</structname> and the
- new size which can be anywhere between 0 to
- <parameter>maximum_size</parameter>. If the new size is smaller than the
+ new size which can be anywhere between <parameter>minimum_size</parameter>
+ and <parameter>maximum_size</parameter>. If the new size is smaller than the
current size of the structure, the memory between the new size and current
size is freed while keeping the contents of the memory upto new size intact.
If the new size is greater than the current size, memory is allocated upto
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 4a3e8a8769e..115c543d36a 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -111,8 +111,11 @@
* startup, but memory is allocated or freed as the structure grows or shrinks
* respectively. ShmemRequestStructOpts::size should be set to the initial size
* of the structure, which is the amount of memory allocated at the startup.
- * After startup, the structure can be resized by calling ShmemResizeStruct() by
- * passing it the ShmemStructDesc for the structure and the new size.
+ * Optionally, ShmemRequestStructOpts::minimum_size can be set to the minimum
+ * size that the structure can shrink to. After startup, the structure can be
+ * resized by calling ShmemResizeStruct() with the name of the structure and
+ * the new size. ShmemResizeStruct() enforces that the new size is within
+ * [minimum_size, maximum_size].
*
* While resizable structures can be created after the startup, the memory
* available for them is quite limited.
@@ -294,6 +297,8 @@ typedef struct
void *location; /* location in shared mem */
Size size; /* # bytes requested for the structure */
Size allocated_size; /* # bytes actually allocated */
+ Size minimum_size; /* the minimum size the structure can shrink
+ * to */
Size maximum_size; /* the maximum size the structure can grow to */
Size reserved_space; /* the total address space reserved */
} ShmemIndexEnt;
@@ -383,6 +388,9 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
+ if (options->minimum_size < 0 && options->minimum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid minimum_size %zd for shared memory request for \"%s\"",
+ options->minimum_size, options->name);
if (options->maximum_size < 0 && options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
options->maximum_size, options->name);
@@ -394,6 +402,11 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->size <= 0)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
+ if (options->minimum_size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->minimum_size < 0)
+ elog(ERROR, "invalid minimum_size %zd for shared memory request for \"%s\"",
+ options->minimum_size, options->name);
if (options->maximum_size == SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
if (options->maximum_size < 0)
@@ -405,10 +418,20 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
elog(ERROR, "invalid alignment %zu for shared memory request for \"%s\"",
options->alignment, options->name);
+ if (options->minimum_size > 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE &&
+ options->minimum_size > options->size)
+ elog(ERROR, "resizable shared memory structure \"%s\" should have minimum size (%zd) less than or equal to size (%zd)",
+ options->name, options->minimum_size, options->size);
+
if (options->maximum_size > 0 && options->size > options->maximum_size)
elog(ERROR, "resizable shared memory structure \"%s\" should have maximum size (%zd) greater than size (%zd)",
options->name, options->maximum_size, options->size);
+ if (options->minimum_size > 0 && options->maximum_size > 0 &&
+ options->minimum_size > options->maximum_size)
+ elog(ERROR, "resizable shared memory structure \"%s\" should have minimum size (%zd) less than or equal to maximum size (%zd)",
+ options->name, options->minimum_size, options->maximum_size);
+
/* Check that we're in the right state */
if (shmem_request_state != SRS_REQUESTING)
elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
@@ -614,12 +637,19 @@ InitShmemIndexEntry(ShmemRequest *request)
index_entry->allocated_size = allocated_size;
index_entry->location = structPtr;
index_entry->reserved_space = allocated_size;
- index_entry->maximum_size = request->options->maximum_size;
if (request->options->maximum_size > 0)
{
+ index_entry->minimum_size = request->options->minimum_size;
+ index_entry->maximum_size = request->options->maximum_size;
+
/* Adjust allocated size of a resizable structure. */
index_entry->allocated_size = EstimateAllocatedSize(index_entry);
}
+ else
+ {
+ index_entry->minimum_size = request->options->size;
+ index_entry->maximum_size = request->options->size;
+ }
/* Initialize depending on the kind of shmem area it is */
switch (request->kind)
@@ -674,14 +704,38 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
name, index_entry->size, request->options->size)));
}
- if (index_entry->maximum_size != request->options->maximum_size &&
- request->options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ /*
+ * For resizable structures, also check that minimum_size and maximum_size
+ * match. For fixed-size structures, these are derived (set to size) in
+ * the index entry and not meaningful in the request.
+ */
+ if (request->options->maximum_size != 0)
{
- ereport(ERROR,
- (errmsg("shared memory struct \"%s\" was created with" \
- " different maximum_size: existing %zu, requested %zu",
- name, index_entry->maximum_size,
- request->options->maximum_size)));
+ if (index_entry->minimum_size != request->options->minimum_size &&
+ request->options->minimum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" was created with"
+ " different minimum_size: existing %zu, requested %zu",
+ name, index_entry->minimum_size,
+ request->options->minimum_size)));
+ }
+
+ if (index_entry->maximum_size != request->options->maximum_size &&
+ request->options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" was created with"
+ " different maximum_size: existing %zu, requested %zu",
+ name, index_entry->maximum_size,
+ request->options->maximum_size)));
+ }
+ }
+ else
+ {
+ if (index_entry->minimum_size != index_entry->maximum_size)
+ elog(ERROR, "shared memory struct \"%s\" was created as resizable, but requested as fixed-size",
+ name);
}
/*
@@ -740,10 +794,11 @@ EstimateAllocatedSize(ShmemIndexEnt *entry)
/*
* ShmemResizeStruct() --- resize a resizable shared memory structure.
*
- * If the structure is being shrunk, the memory pages that are no longer needed
- * are freed. If the structure is being expanded, the memory pages that are
- * needed for the new size are allocated. See EstimateAllocatedSize() for
- * explanation of which pages are allocated for a resizable structure.
+ * The new size must be within [minimum_size, maximum_size]. If the structure
+ * is being shrunk, the memory pages that are no longer needed are freed. If
+ * the structure is being expanded, the memory pages that are needed for the
+ * new size are allocated. See EstimateAllocatedSize() for explanation of which
+ * pages are allocated for a resizable structure.
*/
void
ShmemResizeStruct(const char *name, Size new_size)
@@ -776,15 +831,22 @@ ShmemResizeStruct(const char *name, Size new_size)
Assert(result);
- if (result->maximum_size <= 0)
+ if (result->minimum_size == result->maximum_size)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("shared memory struct \"%s\" is not resizable", name)));
+ if (new_size < result->minimum_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+ errmsg("cannot shrink shared memory structure \"%s\" below minimum size"
+ " (requested %zu bytes, minimum %zu bytes)",
+ name, new_size, result->minimum_size)));
+
if (result->maximum_size < new_size)
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_RESOURCES),
- errmsg("not enough address space is reserved for resizing structure \"%s\"" \
+ errmsg("not enough address space is reserved for resizing structure \"%s\" "
"(required %zu bytes, reserved %zu bytes)",
name, new_size, result->maximum_size)));
@@ -925,7 +987,8 @@ InitShmemAllocator(PGShmemHeader *seghdr)
result->size = ShmemAllocator->index_size;
result->allocated_size = ShmemAllocator->index_size;
#ifdef HAVE_RESIZABLE_SHMEM
- result->maximum_size = 0;
+ result->minimum_size = result->size;
+ result->maximum_size = result->size;
result->reserved_space = result->allocated_size;
#endif
result->location = ShmemAllocator->index;
@@ -1271,7 +1334,7 @@ mul_size(Size s1, Size s2)
Datum
pg_get_shmem_allocations(PG_FUNCTION_ARGS)
{
-#define PG_GET_SHMEM_SIZES_COLS 6
+#define PG_GET_SHMEM_SIZES_COLS 7
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
HASH_SEQ_STATUS hstat;
ShmemIndexEnt *ent;
@@ -1293,8 +1356,9 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
values[1] = Int64GetDatum((char *) ent->location - (char *) ShmemSegHdr);
values[2] = Int64GetDatum(ent->size);
values[3] = Int64GetDatum(ent->allocated_size);
- values[4] = Int64GetDatum(ent->maximum_size);
- values[5] = Int64GetDatum(ent->reserved_space);
+ values[4] = Int64GetDatum(ent->minimum_size);
+ values[5] = Int64GetDatum(ent->maximum_size);
+ values[6] = Int64GetDatum(ent->reserved_space);
/*
* Keep track of the total reserved space for named shmem areas, to be
@@ -1313,8 +1377,9 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = true;
values[2] = Int64GetDatum(ShmemAllocator->free_offset - named_allocated);
values[3] = values[2];
- values[4] = Int64GetDatum(0);
+ values[4] = values[2];
values[5] = values[2];
+ values[6] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
/* output as-of-yet unused shared memory */
@@ -1323,8 +1388,9 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = false;
values[2] = Int64GetDatum(ShmemSegHdr->totalsize - ShmemAllocator->free_offset);
values[3] = values[2];
- values[4] = Int64GetDatum(0);
+ values[4] = values[2];
values[5] = values[2];
+ values[6] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
LWLockRelease(ShmemIndexLock);
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 32945d73f36..5ff6c543305 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8702,8 +8702,8 @@
{ oid => '5052', descr => 'allocations from the main shared memory segment',
proname => 'pg_get_shmem_allocations', prorows => '50', proretset => 't',
provolatile => 'v', prorettype => 'record', proargtypes => '',
- proallargtypes => '{text,int8,int8,int8,int8,int8}', proargmodes => '{o,o,o,o,o,o}',
- proargnames => '{name,off,size,allocated_size,maximum_size,reserved_space}',
+ proallargtypes => '{text,int8,int8,int8,int8,int8,int8}', proargmodes => '{o,o,o,o,o,o,o}',
+ proargnames => '{name,off,size,allocated_size,minimum_size,maximum_size,reserved_space}',
prosrc => 'pg_get_shmem_allocations',
proacl => '{POSTGRES=X,pg_read_all_stats=X}' },
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 8140a0255ae..0e6d5a63f28 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -57,6 +57,12 @@ typedef struct ShmemStructOpts
*/
size_t alignment;
+ /*
+ * Minimum size this structure can shrink to. Should be set to 0 for
+ * fixed-size structures.
+ */
+ ssize_t minimum_size;
+
/*
* Maximum size this structure can grow upto in future. The memory is not
* allocated right away but the corresponding address space is reserved so
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.c b/src/test/modules/resizable_shmem/resizable_shmem.c
index fb3dfd64b4b..d035d767a62 100644
--- a/src/test/modules/resizable_shmem/resizable_shmem.c
+++ b/src/test/modules/resizable_shmem/resizable_shmem.c
@@ -144,14 +144,17 @@ resizable_shmem_request(void *arg)
#ifdef HAVE_RESIZABLE_SHMEM
Size max_size = add_size(offsetof(TestResizableShmemStruct, data),
mul_size(test_max_entries, TEST_ENTRY_SIZE));
+ Size min_size = offsetof(TestResizableShmemStruct, data);
#else
Size max_size = 0;
+ Size min_size = 0;
#endif
/* Register our resizable shared memory structure */
ShmemRequestStruct(.name = "resizable_shmem",
.size = use_unknown_size ? SHMEM_ATTACH_UNKNOWN_SIZE : initial_size,
+ .minimum_size = min_size,
.maximum_size = max_size,
.ptr = (void **) &resizable_shmem,
);
diff --git a/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
index a172cd0fd19..6a00ae0a194 100644
--- a/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
+++ b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
@@ -96,7 +96,7 @@ sub run_resizable_tests
$session2->quit;
# Verify no other structure is resizable
- is($node->safe_psql('postgres', "SELECT count(*) FROM pg_shmem_allocations WHERE name <> 'resizable_shmem' AND maximum_size <> 0;"),
+ is($node->safe_psql('postgres', "SELECT count(*) FROM pg_shmem_allocations WHERE name <> 'resizable_shmem' AND maximum_size <> minimum_size;"),
'0', "$prefix: no other resizable structures");
# Resize to maximum
diff --git a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
index c89b140871f..472d4b121ae 100644
--- a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
+++ b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
@@ -67,6 +67,14 @@ if ($have_resizable_shmem)
like($stderr, qr/is not resizable/, "shrink error message mentions not resizable");
}
+###
+# Test that minimum_size and maximum_size equal size for a fixed-size structure
+# in pg_shmem_allocations.
+###
+is($node->safe_psql('postgres',
+ "SELECT minimum_size = size AND maximum_size = size FROM pg_shmem_allocations WHERE name = 'test_shmem area';"),
+ 't', "fixed-size structure has minimum_size = maximum_size = size");
+
$node->stop;
done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2bbbf48c96a..c42eb9c67a1 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1771,9 +1771,10 @@ pg_shmem_allocations| SELECT name,
off,
size,
allocated_size,
+ minimum_size,
maximum_size,
reserved_space
- FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size, maximum_size, reserved_space);
+ FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size, minimum_size, maximum_size, reserved_space);
pg_shmem_allocations_numa| SELECT name,
numa_node,
size
--
2.34.1
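
For readers following the minimum_size change above, here is a minimal standalone sketch of the convention the tests appear to rely on: a fixed-size structure reports minimum_size = maximum_size = size, and a structure counts as resizable only when the two bounds differ. The DemoShmemEntry type and the demo_* helpers are hypothetical names made up for illustration; they are not part of the patch, and the normalisation of a zero maximum_size is an assumption inferred from the TAP test, not copied from shmem.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a shmem index entry. */
typedef struct DemoShmemEntry
{
	size_t		size;			/* current size */
	size_t		minimum_size;	/* smallest size it may shrink to */
	size_t		maximum_size;	/* largest size it may grow to */
} DemoShmemEntry;

/*
 * Build an entry from requested options. Assumption: a requested maximum of
 * 0 means "fixed-size", in which case both bounds collapse to the size.
 */
DemoShmemEntry
demo_make_entry(size_t size, size_t req_min, size_t req_max)
{
	DemoShmemEntry e;

	e.size = size;
	if (req_max == 0)
	{
		e.minimum_size = size;
		e.maximum_size = size;
	}
	else
	{
		e.minimum_size = req_min;
		e.maximum_size = req_max;
	}
	return e;
}

/* A structure is resizable only if its bounds differ. */
bool
demo_is_resizable(const DemoShmemEntry *e)
{
	return e->minimum_size != e->maximum_size;
}

/* A resize is acceptable only within [minimum_size, maximum_size]. */
bool
demo_resize_allowed(const DemoShmemEntry *e, size_t new_size)
{
	return demo_is_resizable(e) &&
		new_size >= e->minimum_size &&
		new_size <= e->maximum_size;
}
```

With this convention the "no other resizable structures" check reduces to comparing the two bounds, which is what the updated query in 001_resizable_shmem.pl does with maximum_size <> minimum_size.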
[text/x-patch] v20260406-0004-Avoid-creating-ABI-incompatibility-because.patch (14.4K, 4-v20260406-0004-Avoid-creating-ABI-incompatibility-because.patch)
From 45696d8f2371cceca1e38719ef49036cdd67c250 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Mon, 6 Apr 2026 13:00:00 +0530
Subject: [PATCH v20260406 4/6] Avoid creating ABI incompatibility because of
HAVE_RESIZABLE_SHMEM
Per the suggestion at https://www.postgresql.org/message-id/CAEze2Wjn2cpQEPwzLajc0XdcMy8T=d1AWjg3UAmMUT2TmHkQkA@mail.gmail.com
---
src/backend/port/sysv_shmem.c | 14 +++++-
src/backend/port/win32_shmem.c | 22 +++++++++
src/backend/storage/ipc/shmem.c | 48 +++++++------------
src/include/storage/pg_shmem.h | 2 -
src/include/storage/shmem.h | 5 --
.../modules/resizable_shmem/resizable_shmem.c | 22 +++++----
src/test/modules/test_shmem/test_shmem.c | 6 ---
7 files changed, 63 insertions(+), 56 deletions(-)
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 8d859dfbbfb..bb2a81417c6 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -1013,7 +1013,6 @@ PGSharedMemoryDetach(void)
}
}
-#ifdef HAVE_RESIZABLE_SHMEM
/*
* Make sure that the memory of given size from the given address is released.
*
@@ -1024,6 +1023,11 @@ PGSharedMemoryDetach(void)
void
PGSharedMemoryEnsureFreed(void *addr, Size size)
{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
if (!AnonymousShmem)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1035,6 +1039,7 @@ PGSharedMemoryEnsureFreed(void *addr, Size size)
if (madvise(addr, size, MADV_REMOVE) == -1)
ereport(ERROR,
(errmsg("could not free shared memory: %m")));
+#endif
}
/*
@@ -1047,6 +1052,11 @@ PGSharedMemoryEnsureFreed(void *addr, Size size)
void
PGSharedMemoryEnsureAllocated(void *addr, Size size)
{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
if (!AnonymousShmem)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1058,5 +1068,5 @@ PGSharedMemoryEnsureAllocated(void *addr, Size size)
if (madvise(addr, size, MADV_POPULATE_WRITE) == -1)
ereport(ERROR,
(errmsg("could not allocate shared memory: %m")));
+#endif
}
-#endif /* HAVE_RESIZABLE_SHMEM */
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index dc2ee018845..c1f30665e66 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -671,3 +671,25 @@ GetOSPageSize(void)
return os_page_size;
}
+
+/*
+ * PGSharedMemoryEnsureFreed / PGSharedMemoryEnsureAllocated
+ *
+ * Not supported on Windows. These are only meaningful on platforms with
+ * resizable shared memory (mmap + madvise).
+ */
+void
+PGSharedMemoryEnsureFreed(void *addr, Size size)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+}
+
+void
+PGSharedMemoryEnsureAllocated(void *addr, Size size)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+}
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 03de5d88d51..4a3e8a8769e 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -185,14 +185,12 @@ typedef struct
/*
* A convenient macro to get the space required for a shmem request consistently.
* A resizable structure, requested by non-zero maximum_size, requires space for
- * its maximum size.
+ * its maximum size. Please note that on the platforms that do not support
+ * resizable shmem, the maximum_size is ensured to be 0 i.e. all the structures
+ * are treated as fixed-size structures.
*/
-#ifdef HAVE_RESIZABLE_SHMEM
#define SHMEM_REQUEST_SPACE_SIZE(request) \
((request)->options->maximum_size > 0 ? (request)->options->maximum_size : (request)->options->size)
-#else
-#define SHMEM_REQUEST_SPACE_SIZE(request) ((request)->options->size)
-#endif
static List *pending_shmem_requests;
@@ -296,10 +294,8 @@ typedef struct
void *location; /* location in shared mem */
Size size; /* # bytes requested for the structure */
Size allocated_size; /* # bytes actually allocated */
-#ifdef HAVE_RESIZABLE_SHMEM
Size maximum_size; /* the maximum size the structure can grow to */
Size reserved_space; /* the total address space reserved */
-#endif
} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
@@ -308,9 +304,7 @@ static bool firstNumaTouch = true;
static void CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks);
static void InitShmemIndexEntry(ShmemRequest *request);
static bool AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok);
-#ifdef HAVE_RESIZABLE_SHMEM
static Size EstimateAllocatedSize(ShmemIndexEnt *entry);
-#endif
Datum pg_numa_available(PG_FUNCTION_ARGS);
@@ -376,16 +370,22 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->name == NULL)
elog(ERROR, "shared memory request is missing 'name' option");
+#ifndef HAVE_RESIZABLE_SHMEM
+ if (options->maximum_size > 0)
+ elog(ERROR, "resizable shared memory is not supported on this platform");
+#else
+ if (options->maximum_size > 0 && shared_memory_type != SHMEM_TYPE_MMAP)
+ elog(ERROR, "resizable shared memory requires shared_memory_type = mmap");
+#endif
+
if (IsUnderPostmaster)
{
if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
-#ifdef HAVE_RESIZABLE_SHMEM
if (options->maximum_size < 0 && options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
options->maximum_size, options->name);
-#endif
}
else
{
@@ -394,28 +394,21 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->size <= 0)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
-#ifdef HAVE_RESIZABLE_SHMEM
if (options->maximum_size == SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
if (options->maximum_size < 0)
elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
options->maximum_size, options->name);
-#endif
}
if (options->alignment != 0 && pg_nextpower2_size_t(options->alignment) != options->alignment)
elog(ERROR, "invalid alignment %zu for shared memory request for \"%s\"",
options->alignment, options->name);
-#ifdef HAVE_RESIZABLE_SHMEM
if (options->maximum_size > 0 && options->size > options->maximum_size)
elog(ERROR, "resizable shared memory structure \"%s\" should have maximum size (%zd) greater than size (%zd)",
options->name, options->maximum_size, options->size);
- if (options->maximum_size > 0 && shared_memory_type != SHMEM_TYPE_MMAP)
- elog(ERROR, "resizable shared memory requires shared_memory_type = mmap");
-#endif
-
/* Check that we're in the right state */
if (shmem_request_state != SRS_REQUESTING)
elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
@@ -620,7 +613,6 @@ InitShmemIndexEntry(ShmemRequest *request)
index_entry->size = request->options->size;
index_entry->allocated_size = allocated_size;
index_entry->location = structPtr;
-#ifdef HAVE_RESIZABLE_SHMEM
index_entry->reserved_space = allocated_size;
index_entry->maximum_size = request->options->maximum_size;
if (request->options->maximum_size > 0)
@@ -628,7 +620,6 @@ InitShmemIndexEntry(ShmemRequest *request)
/* Adjust allocated size of a resizable structure. */
index_entry->allocated_size = EstimateAllocatedSize(index_entry);
}
-#endif
/* Initialize depending on the kind of shmem area it is */
switch (request->kind)
@@ -683,7 +674,6 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
name, index_entry->size, request->options->size)));
}
-#ifdef HAVE_RESIZABLE_SHMEM
if (index_entry->maximum_size != request->options->maximum_size &&
request->options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
{
@@ -693,7 +683,6 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
name, index_entry->maximum_size,
request->options->maximum_size)));
}
-#endif
/*
* Re-establish the caller's pointer variable, or do other actions to
@@ -716,7 +705,6 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
return true;
}
-#ifdef HAVE_RESIZABLE_SHMEM
/*
* Estimate the actual memory allocated for a resizable structure.
*
@@ -760,6 +748,11 @@ EstimateAllocatedSize(ShmemIndexEnt *entry)
void
ShmemResizeStruct(const char *name, Size new_size)
{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
ShmemIndexEnt *result;
bool found;
Size page_size = GetOSPageSize();
@@ -822,8 +815,8 @@ ShmemResizeStruct(const char *name, Size new_size)
result->allocated_size = EstimateAllocatedSize(result);
LWLockRelease(ShmemIndexLock);
+#endif
}
-#endif /* HAVE_RESIZABLE_SHMEM */
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
@@ -1300,7 +1293,6 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
values[1] = Int64GetDatum((char *) ent->location - (char *) ShmemSegHdr);
values[2] = Int64GetDatum(ent->size);
values[3] = Int64GetDatum(ent->allocated_size);
-#ifdef HAVE_RESIZABLE_SHMEM
values[4] = Int64GetDatum(ent->maximum_size);
values[5] = Int64GetDatum(ent->reserved_space);
@@ -1311,12 +1303,6 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
* the segment.
*/
named_allocated += ent->reserved_space;
-#else
- values[4] = Int64GetDatum(0);
- values[5] = Int64GetDatum(ent->allocated_size);
-
- named_allocated += ent->allocated_size;
-#endif
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
values, nulls);
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index 3d5aceba59c..f0efbf2aec1 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -89,10 +89,8 @@ extern PGShmemHeader *PGSharedMemoryCreate(Size size,
PGShmemHeader **shim);
extern bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2);
extern void PGSharedMemoryDetach(void);
-#ifdef HAVE_RESIZABLE_SHMEM
extern void PGSharedMemoryEnsureFreed(void *addr, Size size);
extern void PGSharedMemoryEnsureAllocated(void *addr, Size size);
-#endif
extern void GetHugePageSize(Size *hugepagesize, int *mmap_flags);
extern Size GetOSPageSize(void);
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index f356027e500..8140a0255ae 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -57,8 +57,6 @@ typedef struct ShmemStructOpts
*/
size_t alignment;
-#ifdef HAVE_RESIZABLE_SHMEM
-
/*
* Maximum size this structure can grow upto in future. The memory is not
* allocated right away but the corresponding address space is reserved so
@@ -68,7 +66,6 @@ typedef struct ShmemStructOpts
* structures.
*/
ssize_t maximum_size;
-#endif
/*
* When the shmem area is initialized or attached to, pointer to it is
@@ -181,9 +178,7 @@ typedef struct ShmemCallbacks
extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
extern bool ShmemAddrIsValid(const void *addr);
-#ifdef HAVE_RESIZABLE_SHMEM
extern void ShmemResizeStruct(const char *name, Size new_size);
-#endif
/*
* These macros provide syntactic sugar for calling the underlying functions
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.c b/src/test/modules/resizable_shmem/resizable_shmem.c
index 66754582b32..fb3dfd64b4b 100644
--- a/src/test/modules/resizable_shmem/resizable_shmem.c
+++ b/src/test/modules/resizable_shmem/resizable_shmem.c
@@ -135,20 +135,24 @@ resizable_shmem_request(void *arg)
{
Size initial_size = add_size(offsetof(TestResizableShmemStruct, data),
mul_size(test_initial_entries, TEST_ENTRY_SIZE));
+
+/*
+ * Create a resizable structure on platforms which support it; otherwise create
+ * a fixed-size structure. The alternative would be to conditionally include
+ * .maximum_size in the call to ShmemRequestStruct().
+ */
#ifdef HAVE_RESIZABLE_SHMEM
Size max_size = add_size(offsetof(TestResizableShmemStruct, data),
mul_size(test_max_entries, TEST_ENTRY_SIZE));
- /* A preprocessor macro to conditionally include the maximum_size field. */
-#define MAXIMUM_SIZE_ARG .maximum_size = max_size,
#else
-#define MAXIMUM_SIZE_ARG
+ Size max_size = 0;
#endif
/* Register our resizable shared memory structure */
ShmemRequestStruct(.name = "resizable_shmem",
.size = use_unknown_size ? SHMEM_ATTACH_UNKNOWN_SIZE : initial_size,
- MAXIMUM_SIZE_ARG
+ .maximum_size = max_size,
.ptr = (void **) &resizable_shmem,
);
}
@@ -172,11 +176,14 @@ resizable_shmem_shmem_init(void *arg)
/*
* Resize the shared memory structure to accommodate the specified number of
* entries.
+ *
+ * On the platforms which do not support resizable shared memory,
+ * ShmemResizeStruct() will raise an error, so this function will fail if the
+ * caller tries to resize the structure.
*/
Datum
resizable_shmem_resize(PG_FUNCTION_ARGS)
{
-#ifdef HAVE_RESIZABLE_SHMEM
int32 new_entries = PG_GETARG_INT32(0);
Size new_size;
@@ -191,11 +198,6 @@ resizable_shmem_resize(PG_FUNCTION_ARGS)
resizable_shmem->num_entries = new_entries;
PG_RETURN_VOID();
-#else
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("resizable shared memory is not supported on this platform")));
-#endif
}
/*
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
index fc2fd67887f..0dd469891ee 100644
--- a/src/test/modules/test_shmem/test_shmem.c
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -108,14 +108,8 @@ PG_FUNCTION_INFO_V1(test_shmem_resize_fixed);
Datum
test_shmem_resize_fixed(PG_FUNCTION_ARGS)
{
-#ifdef HAVE_RESIZABLE_SHMEM
int32 new_size = PG_GETARG_INT32(0);
ShmemResizeStruct("test_shmem area", new_size);
PG_RETURN_VOID();
-#else
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("resizable shared memory is not supported on this platform")));
-#endif
}
--
2.34.1
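
The pattern used by the 0004 patch above can be sketched outside PostgreSQL as follows: the struct layout and the function symbols stay identical on every platform, and the feature test moves inside the function bodies, so an extension built against one configuration still links and lays out its options the same way against another. All names here (DemoOpts, DemoStatus, demo_request, demo_resize, DEMO_HAVE_RESIZABLE) are hypothetical, and status codes stand in for ereport(ERROR); this is a rough illustration of the idea, not the patch's code.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a configure-time symbol such as
 * HAVE_RESIZABLE_SHMEM. Leave it undefined to model an unsupported platform.
 */
/* #define DEMO_HAVE_RESIZABLE 1 */

/* The layout never depends on the feature macro. */
typedef struct DemoOpts
{
	const char *name;
	ptrdiff_t	size;
	ptrdiff_t	maximum_size;	/* 0 means fixed-size */
} DemoOpts;

typedef enum DemoStatus
{
	DEMO_OK,
	DEMO_NOT_SUPPORTED,
	DEMO_INVALID
} DemoStatus;

/* Always defined; rejects resizable requests where unsupported. */
DemoStatus
demo_request(const DemoOpts *opts)
{
	if (opts->size <= 0 || opts->maximum_size < 0)
		return DEMO_INVALID;
#ifndef DEMO_HAVE_RESIZABLE
	if (opts->maximum_size > 0)
		return DEMO_NOT_SUPPORTED;
#endif
	if (opts->maximum_size > 0 && opts->size > opts->maximum_size)
		return DEMO_INVALID;
	return DEMO_OK;
}

/* Always defined; the #ifdef lives inside the body, not around the symbol. */
DemoStatus
demo_resize(const DemoOpts *opts, ptrdiff_t new_size)
{
#ifndef DEMO_HAVE_RESIZABLE
	(void) opts;
	(void) new_size;
	return DEMO_NOT_SUPPORTED;
#else
	if (opts->maximum_size == 0)
		return DEMO_INVALID;	/* fixed-size structure */
	if (new_size <= 0 || new_size > opts->maximum_size)
		return DEMO_INVALID;
	return DEMO_OK;
#endif
}
```

Callers then need no #ifdef of their own: they always pass maximum_size (0 where resizing is unavailable) and handle the "not supported" outcome at run time, which is how the test modules in the patch were simplified.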
[text/x-patch] v20260406-0003-Add-more-diagnostics-about-shared-memory-s.patch (4.3K, 5-v20260406-0003-Add-more-diagnostics-about-shared-memory-s.patch)
From 1d1d85f1c36bb86b4648c5c5c3afb4b41b0f7c2a Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Mon, 6 Apr 2026 10:58:29 +0530
Subject: [PATCH v20260406 3/6] Add more diagnostics about shared memory
segments
NOT FOR FINAL COMMIT
Log size, RSS and Swap of every shared memory segment mapped by the backend.
This will be useful to understand the failure on CFBot where we are seeing about
10MB more shared memory allocated than expected.
---
.../modules/resizable_shmem/resizable_shmem.c | 93 +++++++++++++++++--
1 file changed, 84 insertions(+), 9 deletions(-)
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.c b/src/test/modules/resizable_shmem/resizable_shmem.c
index 2063d05053f..66754582b32 100644
--- a/src/test/modules/resizable_shmem/resizable_shmem.c
+++ b/src/test/modules/resizable_shmem/resizable_shmem.c
@@ -254,8 +254,11 @@ resizable_shmem_read(PG_FUNCTION_ARGS)
* backend.
*
* The VMA containing our resizable_shmem pointer is used to determine the main
- * memory segment. RSS + Swap (in bytes) for that VMS from /proc/self/smaps is
+ * memory segment. RSS + Swap (in bytes) for that VMA from /proc/self/smaps is
* returned.
+ *
+ * As a side effect, all shared-memory VMAs are logged with their name, RSS,
+ * and Swap values for diagnostic purposes.
*/
Datum
resizable_shmem_usage(PG_FUNCTION_ARGS)
@@ -268,6 +271,14 @@ resizable_shmem_usage(PG_FUNCTION_ARGS)
bool in_target_vma = false;
size_t result;
+ /* State for logging shared VMAs */
+ bool in_shared_vma = false;
+ char vma_name[256];
+ char vma_range[64];
+ int64 vma_size_kb = -1;
+ int64 vma_rss_kb = -1;
+ int64 vma_swap_kb = -1;
+
f = fopen("/proc/self/smaps", "r");
if (f == NULL)
ereport(ERROR,
@@ -278,22 +289,86 @@ resizable_shmem_usage(PG_FUNCTION_ARGS)
{
unsigned long start;
unsigned long end;
+ char perms[5];
+ unsigned long offset;
+ char dev[12];
+ unsigned long inode;
+ char pathname[256];
+ int nfields;
- if (sscanf(line, "%lx-%lx", &start, &end) == 2)
+ nfields = sscanf(line, "%lx-%lx %4s %lx %11s %lu %255[^\n]",
+ &start, &end, perms, &offset, dev, &inode, pathname);
+
+ if (nfields >= 6)
{
+ /*
+ * We've hit a new VMA header. First, log the previous shared VMA
+ * if we were tracking one.
+ */
+ if (in_shared_vma)
+ elog(LOG, "shared VMA %s %s: Size=%ld kB, Rss=%ld kB, Swap=%ld kB",
+ vma_range, vma_name,
+ (long) (vma_size_kb >= 0 ? vma_size_kb : 0),
+ (long) (vma_rss_kb >= 0 ? vma_rss_kb : 0),
+ (long) (vma_swap_kb >= 0 ? vma_swap_kb : 0));
+
+ /* Check if this VMA is a shared mapping (has 's' in perms) */
+ in_shared_vma = (perms[3] == 's');
+ if (in_shared_vma)
+ {
+ snprintf(vma_range, sizeof(vma_range), "%lx-%lx", start, end);
+ if (nfields >= 7)
+ strlcpy(vma_name, pathname, sizeof(vma_name));
+ else
+ strlcpy(vma_name, "(anonymous)", sizeof(vma_name));
+ vma_size_kb = -1;
+ vma_rss_kb = -1;
+ vma_swap_kb = -1;
+ }
+
+ /* Track the target VMA for our return value */
in_target_vma = (target >= start && target < end);
+ if (in_target_vma)
+ {
+ rss_kb = -1;
+ swap_kb = -1;
+ }
}
- else if (in_target_vma)
+ else
{
- if (rss_kb == -1)
- sscanf(line, "Rss: %ld kB", &rss_kb);
- if (swap_kb == -1)
- sscanf(line, "Swap: %ld kB", &swap_kb);
- if (rss_kb >= 0 && swap_kb >= 0)
- break;
+ /* Parse detail lines for the current VMA */
+ int64 val;
+
+ if (sscanf(line, "Size: %ld kB", &val) == 1)
+ {
+ if (in_shared_vma && vma_size_kb == -1)
+ vma_size_kb = val;
+ }
+ else if (sscanf(line, "Rss: %ld kB", &val) == 1)
+ {
+ if (in_target_vma && rss_kb == -1)
+ rss_kb = val;
+ if (in_shared_vma && vma_rss_kb == -1)
+ vma_rss_kb = val;
+ }
+ else if (sscanf(line, "Swap: %ld kB", &val) == 1)
+ {
+ if (in_target_vma && swap_kb == -1)
+ swap_kb = val;
+ if (in_shared_vma && vma_swap_kb == -1)
+ vma_swap_kb = val;
+ }
}
}
+ /* Log the last shared VMA if any */
+ if (in_shared_vma)
+ elog(LOG, "shared VMA %s %s: Size=%ld kB, Rss=%ld kB, Swap=%ld kB",
+ vma_range, vma_name,
+ (long) (vma_size_kb >= 0 ? vma_size_kb : 0),
+ (long) (vma_rss_kb >= 0 ? vma_rss_kb : 0),
+ (long) (vma_swap_kb >= 0 ? vma_swap_kb : 0));
+
fclose(f);
result = rss_kb >= 0 ? mul_size(rss_kb, 1024) : 0;
--
2.34.1
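
The VMA header parsing in the 0003 patch above can be tried in isolation with a sketch like the one below. It uses the same sscanf format as the patch and treats a line as a header only when at least six fields convert; the fourth permission character then distinguishes shared ('s') from private ('p') mappings. DemoVmaHeader and demo_parse_vma_header are hypothetical names introduced for this illustration only.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical holder for the fields of one /proc/<pid>/smaps header line. */
typedef struct DemoVmaHeader
{
	unsigned long start;
	unsigned long end;
	bool		shared;			/* perms[3] == 's' */
	bool		has_path;		/* false for anonymous mappings */
	char		path[256];
} DemoVmaHeader;

/*
 * Parse one smaps line. Returns true if it is a VMA header, false if it is a
 * detail line such as "Rss: ... kB".
 */
bool
demo_parse_vma_header(const char *line, DemoVmaHeader *out)
{
	char		perms[5] = {0};
	unsigned long offset;
	char		dev[12];
	unsigned long inode;
	int			nfields;

	out->path[0] = '\0';
	nfields = sscanf(line, "%lx-%lx %4s %lx %11s %lu %255[^\n]",
					 &out->start, &out->end, perms, &offset, dev, &inode,
					 out->path);

	/* Detail lines convert fewer than six fields. */
	if (nfields < 6)
		return false;

	out->shared = (perms[3] == 's');
	out->has_path = (nfields >= 7);
	return true;
}
```

Requiring six fields rather than only the address range helps keep detail lines whose names start with a hexadecimal letter (for example "Anonymous:") from being mistaken for a header.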
[text/x-patch] v20260406-0002-Use-smaps-instead-of-status-in-resizable_s.patch (6.1K, 6-v20260406-0002-Use-smaps-instead-of-status-in-resizable_s.patch)
From a982fd9491c914c67d674a21e4ba0ac746807811 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Mon, 6 Apr 2026 10:43:18 +0530
Subject: [PATCH v20260406 2/6] Use smaps instead of status in
resizable_shmem_usage()
/proc/self/status reports memory usage aggregated across all the VMAs of a
process, whereas /proc/self/smaps reports memory usage for each VMA separately.
Hence use smaps to accurately estimate the memory allocated in the main shared
memory segment.
---
.../resizable_shmem/resizable_shmem--1.0.sql | 6 +-
.../modules/resizable_shmem/resizable_shmem.c | 86 ++++++++-----------
.../resizable_shmem/t/001_resizable_shmem.pl | 2 +-
3 files changed, 41 insertions(+), 53 deletions(-)
diff --git a/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
index c1bcb6117b6..b4b07336dc3 100644
--- a/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
+++ b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
@@ -25,8 +25,10 @@ RETURNS boolean
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
--- Function to report memory usage statistics of the calling backend
-CREATE FUNCTION resizable_shmem_usage(OUT rss_anon bigint, OUT rss_file bigint, OUT rss_shmem bigint, OUT vm_size bigint)
+-- Function to report memory mapped against the main shared memory segment in
+-- the backend where this function runs.
+CREATE FUNCTION resizable_shmem_usage()
+RETURNS bigint
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.c b/src/test/modules/resizable_shmem/resizable_shmem.c
index 5ae2d2e2d1d..2063d05053f 100644
--- a/src/test/modules/resizable_shmem/resizable_shmem.c
+++ b/src/test/modules/resizable_shmem/resizable_shmem.c
@@ -10,19 +10,17 @@
*/
#include "postgres.h"
+#include <limits.h>
+#include <stdio.h>
+
#include "commands/extension.h"
#include "fmgr.h"
-#include "funcapi.h"
#include "miscadmin.h"
#include "storage/shmem.h"
#include "storage/spin.h"
#include "utils/builtins.h"
#include "utils/guc.h"
#include "utils/memutils.h"
-#include "utils/timestamp.h"
-#include "access/htup_details.h"
-
-#include <stdio.h>
PG_MODULE_MAGIC;
@@ -252,68 +250,56 @@ resizable_shmem_read(PG_FUNCTION_ARGS)
}
/*
- * Report multiple memory usage statistics of the calling backend process
- * as reported by the kernel.
- * Returns RssAnon, RssFile, RssShmem, VmSize from /proc/self/status as a record.
+ * Return the memory mapped against the main shared memory segment in this
+ * backend.
*
- * The function assumes that these values will be available in
- * /proc/self/status, any system which also support madvise with MADV_REMOVE and
- * MADV_POPULATE_WRITE.
+ * The VMA containing our resizable_shmem pointer is used to determine the main
+ * memory segment. RSS + Swap (in bytes) for that VMS from /proc/self/smaps is
+ * returned.
*/
Datum
resizable_shmem_usage(PG_FUNCTION_ARGS)
{
FILE *f;
char line[256];
- int64 rss_anon_kb = -1;
- int64 rss_file_kb = -1;
- int64 rss_shmem_kb = -1;
- int64 vm_size_kb = -1;
- int found = 0;
- TupleDesc tupdesc;
- Datum values[4];
- bool nulls[4];
- HeapTuple tuple;
-
- /* Open /proc/self/status to read memory information */
- f = fopen("/proc/self/status", "r");
+ int64 rss_kb = -1;
+ int64 swap_kb = -1;
+ uintptr_t target = (uintptr_t) resizable_shmem;
+ bool in_target_vma = false;
+ size_t result;
+
+ f = fopen("/proc/self/smaps", "r");
if (f == NULL)
ereport(ERROR,
(errcode_for_file_access(),
- errmsg("could not open /proc/self/status: %m")));
+ errmsg("could not open /proc/self/smaps: %m")));
- /* Look for the memory usage lines */
- while (fgets(line, sizeof(line), f) != NULL && found < 4)
+ while (fgets(line, sizeof(line), f) != NULL)
{
- if (rss_anon_kb == -1 && sscanf(line, "RssAnon: %ld kB", &rss_anon_kb) == 1)
- found++;
- else if (rss_file_kb == -1 && sscanf(line, "RssFile: %ld kB", &rss_file_kb) == 1)
- found++;
- else if (rss_shmem_kb == -1 && sscanf(line, "RssShmem: %ld kB", &rss_shmem_kb) == 1)
- found++;
- else if (vm_size_kb == -1 && sscanf(line, "VmSize: %ld kB", &vm_size_kb) == 1)
- found++;
+ unsigned long start;
+ unsigned long end;
+
+ if (sscanf(line, "%lx-%lx", &start, &end) == 2)
+ {
+ in_target_vma = (target >= start && target < end);
+ }
+ else if (in_target_vma)
+ {
+ if (rss_kb == -1)
+ sscanf(line, "Rss: %ld kB", &rss_kb);
+ if (swap_kb == -1)
+ sscanf(line, "Swap: %ld kB", &swap_kb);
+ if (rss_kb >= 0 && swap_kb >= 0)
+ break;
+ }
}
fclose(f);
- /* Build tuple descriptor for our result type */
- if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("function returning record called in context "
- "that cannot accept a record")));
-
- /* Build the result tuple */
- values[0] = Int64GetDatum(rss_anon_kb >= 0 ? rss_anon_kb * 1024 : 0);
- values[1] = Int64GetDatum(rss_file_kb >= 0 ? rss_file_kb * 1024 : 0);
- values[2] = Int64GetDatum(rss_shmem_kb >= 0 ? rss_shmem_kb * 1024 : 0);
- values[3] = Int64GetDatum(vm_size_kb >= 0 ? vm_size_kb * 1024 : 0);
-
- nulls[0] = nulls[1] = nulls[2] = nulls[3] = false;
+ result = rss_kb >= 0 ? mul_size(rss_kb, 1024) : 0;
+ result = add_size(result, swap_kb >= 0 ? mul_size(swap_kb, 1024) : 0);
- tuple = heap_form_tuple(tupdesc, values, nulls);
- PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
+ PG_RETURN_INT64(result);
}
/*
diff --git a/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
index 6d45b1eccdc..a172cd0fd19 100644
--- a/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
+++ b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
@@ -19,7 +19,7 @@ sub check_shmem_usage
{
my ($session, $label, $node) = @_;
- my $rss_shmem = $session->query_safe('SELECT rss_shmem FROM resizable_shmem_usage();',
+ my $rss_shmem = $session->query_safe('SELECT resizable_shmem_usage();',
verbose => 0);
my $total_alloc = $node->safe_psql('postgres',
"SELECT sum(allocated_size) FROM pg_shmem_allocations;");
--
2.34.1
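
To make the smaps approach in the 0002 patch above easier to experiment with, here is a small sketch of the same scan applied to an in-memory array of lines instead of /proc/self/smaps. It finds the VMA whose address range contains a target address and returns Rss + Swap for that VMA in bytes. The function name demo_vma_usage_bytes and the array-of-lines interface are assumptions made for this example; the patch itself reads the file with fgets().

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Return Rss + Swap (in bytes) of the VMA containing 'target', or 0 if no
 * VMA in 'lines' contains it.
 */
long long
demo_vma_usage_bytes(const char *const *lines, int nlines, uintptr_t target)
{
	long		rss_kb = -1;
	long		swap_kb = -1;
	bool		in_target_vma = false;

	for (int i = 0; i < nlines; i++)
	{
		unsigned long start;
		unsigned long end;

		if (sscanf(lines[i], "%lx-%lx", &start, &end) == 2)
		{
			/* A new VMA header: check whether it covers the target. */
			in_target_vma = (target >= start && target < end);
		}
		else if (in_target_vma)
		{
			/* Take only the first Rss and Swap lines of the target VMA. */
			if (rss_kb == -1)
				sscanf(lines[i], "Rss: %ld kB", &rss_kb);
			if (swap_kb == -1)
				sscanf(lines[i], "Swap: %ld kB", &swap_kb);
			if (rss_kb >= 0 && swap_kb >= 0)
				break;
		}
	}

	return (rss_kb >= 0 ? rss_kb * 1024LL : 0) +
		(swap_kb >= 0 ? swap_kb * 1024LL : 0);
}
```

Because the counters are only filled while in_target_vma is set, values belonging to other mappings never leak into the result, which is the property that /proc/self/status could not provide.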
[text/x-patch] v20260406-0006-Add-support-to-protect-unused-resizable_sh.patch (11.8K, 7-v20260406-0006-Add-support-to-protect-unused-resizable_sh.patch)
From ef321b327e7ff0a9abc1eb34f11910d7b71d32dd Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Mon, 6 Apr 2026 19:00:14 +0530
Subject: [PATCH v20260406 6/6] Add support to protect unused resizable_shmem
structure
Add APIs to make the portion of resizable_shmem structure beyond its current
size inaccessible.
Author: Ashutosh Bapat <[email protected]>
Suggested-by: Matthias van de Meent <[email protected]>
---
doc/src/sgml/xfunc.sgml | 13 +++-
src/backend/port/sysv_shmem.c | 55 +++++++++++++++++
src/backend/port/win32_shmem.c | 8 +++
src/backend/storage/ipc/shmem.c | 60 +++++++++++++++++++
src/include/storage/pg_shmem.h | 1 +
src/include/storage/shmem.h | 1 +
.../modules/resizable_shmem/resizable_shmem.c | 51 ++++++++++++++++
7 files changed, 186 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index a6c7b8b1b22..62b33366f30 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3781,9 +3781,16 @@ my_shmem_init(void *arg)
shared structure. Also accessing the memory beyond the current size of the
structure will not cause any segmentation fault or a bus error. Memory will
be allocated during such a write access. 0s will be returned on such a read
- access if memory is not allocated yet. The additional synchronization may
- use mprotect() with PROT_NONE in every backend that may access this memory
- to ensure that such an access results in a fault.
+ access if memory is not allocated yet.
+ </para>
+
+ <para>
+ <function>ShmemProtectStruct</function> can be called when resizing the
+ structure to make the unused portion of the structure inaccessible and the
+ used portion accessible. These protections work only at the memory page
+ level, so some unused portion may still remain accessible. Please note that
+ the function modifies the protections only in the backend where it is run.
+ It needs to be called from every backend that may access the structure.
</para>
<para>
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index bb2a81417c6..14b6fa7f7e6 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -1065,8 +1065,63 @@ PGSharedMemoryEnsureAllocated(void *addr, Size size)
Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
Assert(size == TYPEALIGN(GetOSPageSize(), size));
+ /*
+ * Ensure that MADV_POPULATE_WRITE can initialize the newly allocated
+ * pages.
+ */
+ if (mprotect(addr, size, PROT_READ | PROT_WRITE) != 0)
+ ereport(ERROR,
+ (errmsg("could not protect shared memory: %m")));
+
if (madvise(addr, size, MADV_POPULATE_WRITE) == -1)
ereport(ERROR,
(errmsg("could not allocate shared memory: %m")));
#endif
}
+
+/*
+ * Set memory protection on the given region of shared memory.
+ *
+ * Makes [rw_start, rw_end) readable and writable, and [rw_end, prot_end)
+ * inaccessible.
+ *
+ * All addresses are expected to be page aligned.
+ *
+ * Only supported on platforms that support resizable shared memory.
+ */
+void
+PGSharedMemoryProtect(void *rw_start, void *rw_end, void *prot_end)
+{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
+
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only anonymous shared memory can be protected at runtime")));
+
+ Assert(rw_start == (void *) TYPEALIGN(GetOSPageSize(), rw_start));
+ Assert(rw_end == (void *) TYPEALIGN(GetOSPageSize(), rw_end));
+ Assert(prot_end == (void *) TYPEALIGN(GetOSPageSize(), prot_end));
+ Assert(rw_end >= rw_start);
+
+ if (rw_end > rw_start)
+ {
+ if (mprotect(rw_start, (char *) rw_end - (char *) rw_start,
+ PROT_READ | PROT_WRITE) != 0)
+ ereport(ERROR,
+ (errmsg("could not protect shared memory: %m")));
+ }
+
+ if (prot_end > rw_end)
+ {
+ if (mprotect(rw_end, (char *) prot_end - (char *) rw_end,
+ PROT_NONE) != 0)
+ ereport(ERROR,
+ (errmsg("could not protect shared memory: %m")));
+ }
+#endif
+}
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index c1f30665e66..b5396e4a5e8 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -693,3 +693,11 @@ PGSharedMemoryEnsureAllocated(void *addr, Size size)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("resizable shared memory is not supported on this platform")));
}
+
+void
+PGSharedMemoryProtect(void *rw_start, void *rw_end, void *prot_end)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+}
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 115c543d36a..a3ed082e4d9 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -880,6 +880,66 @@ ShmemResizeStruct(const char *name, Size new_size)
#endif
}
+/*
+ * ShmemProtectStruct() --- protect the unused portion of a resizable structure.
+ *
+ * Makes the region beyond the current size up to maximum_size inaccessible, and
+ * ensures the region up to the current size is readable and writable. Depending
+ * upon the platform, the protection honours the page boundaries. So it may be
+ * more permissive than strictly needed.
+ *
+ * Only works for resizable structures. Should be called in every backend that
+ * may access the resizable structure while resizing it.
+ */
+void
+ShmemProtectStruct(const char *name)
+{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
+ ShmemIndexEnt *result;
+ bool found;
+ Size page_size = GetOSPageSize();
+ char *rw_start;
+ char *rw_end;
+ char *prot_end;
+
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+ result = (ShmemIndexEnt *) hash_search(ShmemIndex, name, HASH_FIND, &found);
+ if (!found)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shmem struct \"%s\" is not initialized", name)));
+
+ if (result->minimum_size == result->maximum_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shared memory struct \"%s\" is not resizable", name)));
+
+ /* Resizable structures are only supported with mmap-based shared memory. */
+ Assert(shared_memory_type == SHMEM_TYPE_MMAP);
+
+ /* Make at least [location, location+size) readable and writable */
+ rw_start = (char *) TYPEALIGN_DOWN(page_size, result->location);
+ rw_end = (char *) TYPEALIGN(page_size,
+ (char *) result->location + result->size);
+
+ /*
+ * Make remaining portion inaccessible while making sure that the portion
+ * after maximum_size is not affected since it may be used by other
+ * structures.
+ */
+ prot_end = (char *) TYPEALIGN_DOWN(page_size,
+ (char *) result->location + result->maximum_size);
+
+ LWLockRelease(ShmemIndexLock);
+
+ PGSharedMemoryProtect(rw_start, rw_end, prot_end);
+#endif
+}
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index f0efbf2aec1..5165b815cc1 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -91,6 +91,7 @@ extern bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2);
extern void PGSharedMemoryDetach(void);
extern void PGSharedMemoryEnsureFreed(void *addr, Size size);
extern void PGSharedMemoryEnsureAllocated(void *addr, Size size);
+extern void PGSharedMemoryProtect(void *rw_start, void *rw_end, void *prot_end);
extern void GetHugePageSize(Size *hugepagesize, int *mmap_flags);
extern Size GetOSPageSize(void);
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 0e6d5a63f28..f8ddb0dd7c0 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -185,6 +185,7 @@ typedef struct ShmemCallbacks
extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
extern bool ShmemAddrIsValid(const void *addr);
extern void ShmemResizeStruct(const char *name, Size new_size);
+extern void ShmemProtectStruct(const char *name);
/*
* These macros provide syntactic sugar for calling the underlying functions
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.c b/src/test/modules/resizable_shmem/resizable_shmem.c
index d035d767a62..c02ba54896a 100644
--- a/src/test/modules/resizable_shmem/resizable_shmem.c
+++ b/src/test/modules/resizable_shmem/resizable_shmem.c
@@ -56,10 +56,12 @@ static bool use_unknown_size = false;
static void resizable_shmem_request(void *arg);
static void resizable_shmem_shmem_init(void *arg);
+static void resizable_shmem_shmem_attach(void *arg);
static ShmemCallbacks shmem_callbacks = {
.request_fn = resizable_shmem_request,
.init_fn = resizable_shmem_shmem_init,
+ .attach_fn = resizable_shmem_shmem_attach,
};
/* SQL-callable functions */
@@ -172,10 +174,32 @@ resizable_shmem_shmem_init(void *arg)
*/
Assert(resizable_shmem != NULL);
+#ifdef HAVE_RESIZABLE_SHMEM
+ /* Protect the shared memory structure in this backend. */
+ ShmemProtectStruct("resizable_shmem");
+#endif
+
resizable_shmem->num_entries = test_initial_entries;
memset(resizable_shmem->data, 0, mul_size(test_initial_entries, TEST_ENTRY_SIZE));
}
+/*
+ * Protect the shared memory structure after attaching.
+ */
+static void
+resizable_shmem_shmem_attach(void *arg)
+{
+ /*
+ * Shared memory structure should have been already allocated and
+ * initialized. Only the protection needs to be set up here.
+ */
+ Assert(resizable_shmem != NULL);
+
+#ifdef HAVE_RESIZABLE_SHMEM
+ ShmemProtectStruct("resizable_shmem");
+#endif
+}
+
/*
* Resize the shared memory structure to accommodate the specified number of
* entries.
@@ -198,6 +222,7 @@ resizable_shmem_resize(PG_FUNCTION_ARGS)
new_size = add_size(offsetof(TestResizableShmemStruct, data),
mul_size(new_entries, TEST_ENTRY_SIZE));
ShmemResizeStruct("resizable_shmem", new_size);
+ ShmemProtectStruct("resizable_shmem");
resizable_shmem->num_entries = new_entries;
PG_RETURN_VOID();
@@ -217,6 +242,19 @@ resizable_shmem_write(PG_FUNCTION_ARGS)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("resizable_shmem is not initialized")));
+#ifdef HAVE_RESIZABLE_SHMEM
+
+ /*
+ * Ideally the structure should be protected through a synchronization
+ * cycle across all the backends that may access the structure. But we
+ * don't implement any such synchronization in this test module to keep it
+ * simple. Given that the ProcSignalBarrier mechanism is not extensible, we
+ * may not be able to do that here anyway. Hence apply the protection just
+ * before accessing the structure.
+ */
+ ShmemProtectStruct("resizable_shmem");
+#endif
+
/* Write the value to all current entries */
for (i = 0; i < resizable_shmem->num_entries; i++)
resizable_shmem->data[i] = entry_value;
@@ -245,6 +283,19 @@ resizable_shmem_read(PG_FUNCTION_ARGS)
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("entry_count %d is out of range (0..%d)", entry_count, resizable_shmem->num_entries)));
+#ifdef HAVE_RESIZABLE_SHMEM
+
+ /*
+ * Ideally the structure should be protected through a synchronization
+ * cycle across all the backends that may access the structure. But we
+ * don't implement any such synchronization in this test module to keep it
+ * simple. Given that the ProcSignalBarrier mechanism is not extensible, we
+ * may not be able to do that here anyway. Hence apply the protection just
+ * before accessing the structure.
+ */
+ ShmemProtectStruct("resizable_shmem");
+#endif
+
for (i = 0; i < entry_count; i++)
{
if (resizable_shmem->data[i] != entry_value)
--
2.34.1
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-07 10:06 Ashutosh Bapat <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-07 10:06 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Mon, Apr 6, 2026 at 7:23 PM Ashutosh Bapat
<[email protected]> wrote:
>
> I have kept these two patches separate from the main patch so that I
> can remove them if others feel they are not worth including in the
> feature.
Here are the patches rebased on the latest HEAD. There were no conflicts;
this is just a rebase. Here are the differences from the previous patchset.
o. There are two patches in this patchset now. a. 0001 which supports
resizable shared memory and is equivalent to 0001 + 0002 + 0004 + 0005
from the previous patchset. b. 0002 which is 0006 from the previous
patchset and adds support for protecting resizable shared memory
structures. 0003, which added diagnostics to investigate CFBot
failure, from the previous patchset is not required anymore since all
tests pass with CFBot.
o. I have merged 0002 into 0001 from the previous patchset since with
that patch all platforms are green on CFBot. The resizable shared
memory test now uses /proc/self/smaps instead of /proc/self/status to
find the amount of memory allocated in the main shared memory segment
of PostgreSQL.
o. Merged 0004, which supported minimum_size, into 0001. Minimum_size
would be useful to protect against accidental shrinkage of the
resizable structures. It will help add support for minimum
sizes of GUCs like shared_buffers. It also makes it easy and intuitive
to distinguish between fixed-size and resizable structures, and will
be useful to find the minimum size of the shared memory segment.
o. Merged 0005, which allows ABI compatibility between the binaries
which support resizable shared memory and those which don't, into
0001. Apart from ABI compatibility, the code has fewer #ifdef blocks
and is thus easier to read and maintain.
I didn't find it useful to keep 0004 and 0005 separate: they were
interdependent, keeping them apart complicated review, and both are
likely to be accepted.
o. 0006 is still separate since I am not sure whether the
functionality is absolutely needed at this time. In an offlist
discussion, Andres mentioned that it is not strictly needed. The
subsystem that uses the resizable shared memory can implement its
own protection if required and integrate it into the subsystem-specific
synchronization. But Matthias thinks differently. The API to add
protection is platform dependent, so it's better to abstract it via
shmem.c. If we decide to accept this patch, we should merge it into
0001 before committing.
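
For readers following the 0002 discussion, here is a minimal standalone
sketch (not code from the patch) of the page arithmetic that
ShmemProtectStruct() does before the mprotect() calls. The names
DemoProtRanges, demo_compute_ranges() and demo_apply_ranges() are
hypothetical; the patch itself reads these values from the ShmemIndexEnt
and goes through PGSharedMemoryProtect().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical container for the three page-aligned boundaries. */
typedef struct DemoProtRanges
{
	uintptr_t	rw_start;	/* start of the readable/writable range */
	uintptr_t	rw_end;		/* end of r/w range, start of PROT_NONE range */
	uintptr_t	prot_end;	/* end of the PROT_NONE range */
} DemoProtRanges;

/* Round v down to a multiple of page. */
uintptr_t
demo_align_down(uintptr_t v, uintptr_t page)
{
	assert(page > 0);
	return v - v % page;
}

/* Round v up to a multiple of page. */
uintptr_t
demo_align_up(uintptr_t v, uintptr_t page)
{
	return demo_align_down(v + page - 1, page);
}

/*
 * Widen the used range [location, location + size) outwards to page
 * boundaries, and shrink the end of the reserved range inwards, so that
 * a page shared with a neighbouring structure is never made inaccessible.
 */
DemoProtRanges
demo_compute_ranges(uintptr_t location, size_t size, size_t maximum_size,
					uintptr_t page)
{
	DemoProtRanges r;

	r.rw_start = demo_align_down(location, page);
	r.rw_end = demo_align_up(location + size, page);
	r.prot_end = demo_align_down(location + maximum_size, page);
	return r;
}

/* Apply the ranges to the calling process. Returns 0 on success, -1 on error. */
int
demo_apply_ranges(DemoProtRanges r)
{
	if (r.rw_end > r.rw_start &&
		mprotect((void *) r.rw_start, r.rw_end - r.rw_start,
				 PROT_READ | PROT_WRITE) != 0)
		return -1;

	/* May be empty when the used size is close to maximum_size. */
	if (r.prot_end > r.rw_end &&
		mprotect((void *) r.rw_end, r.prot_end - r.rw_end, PROT_NONE) != 0)
		return -1;

	return 0;
}
```

With 4096-byte pages, a structure at address 65636 with size 5000 and
maximum_size 20000 ends up readable and writable in [65536, 73728) and
inaccessible in [73728, 81920). The tail of the reserved space that
shares a page with the next structure stays accessible, which is why the
protection can be more permissive than strictly needed.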
I also did some more cleanups and renamed the GUC
have_resizable_shmem to have_resizable_shared_memory since "shmem" is an
internal term.
I am looking at merging the resizable_shmem module into test_shmem module next.
--
Best Wishes,
Ashutosh Bapat
Attachments:
[text/x-patch] v20260407-0002-Add-support-to-protect-unused-resizable_sh.patch (11.2K, 2-v20260407-0002-Add-support-to-protect-unused-resizable_sh.patch)
From 93515761c30f84125b2e115886be73eb830d57dc Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Mon, 6 Apr 2026 19:00:14 +0530
Subject: [PATCH v20260407 2/2] Add support to protect unused resizable_shmem
structure
Add APIs to make the portion of resizable_shmem structure beyond its current
size inaccessible.
Author: Ashutosh Bapat <[email protected]>
Suggested-by: Matthias van de Meent <[email protected]>
---
doc/src/sgml/xfunc.sgml | 4 +-
src/backend/port/sysv_shmem.c | 55 +++++++++++++++++
src/backend/port/win32_shmem.c | 8 +++
src/backend/storage/ipc/shmem.c | 60 +++++++++++++++++++
src/include/storage/pg_shmem.h | 1 +
src/include/storage/shmem.h | 1 +
.../modules/resizable_shmem/resizable_shmem.c | 51 ++++++++++++++++
7 files changed, 179 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 22f953db9d7..04b312fcf94 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3780,7 +3780,9 @@ my_shmem_init(void *arg)
additional synchronization between the resizing process and the processes
using the shared structure. Also it needs to implement additional protection
to prevent access to the part of the address space beyond the size of the
- structure when resizing it.
+ structure when resizing it. <function>ShmemProtectStruct</function> can be
+ called from every backend that may access the resizable structure for
+ this purpose.
</para>
<para>
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index bb2a81417c6..14b6fa7f7e6 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -1065,8 +1065,63 @@ PGSharedMemoryEnsureAllocated(void *addr, Size size)
Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
Assert(size == TYPEALIGN(GetOSPageSize(), size));
+ /*
+ * Ensure that MADV_POPULATE_WRITE can initialize the newly allocated
+ * pages.
+ */
+ if (mprotect(addr, size, PROT_READ | PROT_WRITE) != 0)
+ ereport(ERROR,
+ (errmsg("could not protect shared memory: %m")));
+
if (madvise(addr, size, MADV_POPULATE_WRITE) == -1)
ereport(ERROR,
(errmsg("could not allocate shared memory: %m")));
#endif
}
+
+/*
+ * Set memory protection on the given region of shared memory.
+ *
+ * Makes [rw_start, rw_end) readable and writable, and [rw_end, prot_end)
+ * inaccessible.
+ *
+ * All addresses are expected to be page aligned.
+ *
+ * Only supported on platforms that support resizable shared memory.
+ */
+void
+PGSharedMemoryProtect(void *rw_start, void *rw_end, void *prot_end)
+{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
+
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only anonymous shared memory can be protected at runtime")));
+
+ Assert(rw_start == (void *) TYPEALIGN(GetOSPageSize(), rw_start));
+ Assert(rw_end == (void *) TYPEALIGN(GetOSPageSize(), rw_end));
+ Assert(prot_end == (void *) TYPEALIGN(GetOSPageSize(), prot_end));
+ Assert(rw_end >= rw_start);
+
+ if (rw_end > rw_start)
+ {
+ if (mprotect(rw_start, (char *) rw_end - (char *) rw_start,
+ PROT_READ | PROT_WRITE) != 0)
+ ereport(ERROR,
+ (errmsg("could not protect shared memory: %m")));
+ }
+
+ if (prot_end > rw_end)
+ {
+ if (mprotect(rw_end, (char *) prot_end - (char *) rw_end,
+ PROT_NONE) != 0)
+ ereport(ERROR,
+ (errmsg("could not protect shared memory: %m")));
+ }
+#endif
+}
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index c1f30665e66..b5396e4a5e8 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -693,3 +693,11 @@ PGSharedMemoryEnsureAllocated(void *addr, Size size)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("resizable shared memory is not supported on this platform")));
}
+
+void
+PGSharedMemoryProtect(void *rw_start, void *rw_end, void *prot_end)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+}
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 8f006967790..61808c7a8e5 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -875,6 +875,66 @@ ShmemResizeStruct(const char *name, Size new_size)
#endif
}
+/*
+ * ShmemProtectStruct() --- protect the unused portion of a resizable structure.
+ *
+ * Makes the region beyond the current size up to maximum_size inaccessible, and
+ * ensures the region up to the current size is readable and writable. Depending
+ * upon the platform, the protection honours the page boundaries. So it may be
+ * more permissive than strictly needed.
+ *
+ * Only works for resizable structures. Should be called in every backend that
+ * may access the resizable structure while resizing it.
+ */
+void
+ShmemProtectStruct(const char *name)
+{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
+ ShmemIndexEnt *result;
+ bool found;
+ Size page_size = GetOSPageSize();
+ char *rw_start;
+ char *rw_end;
+ char *prot_end;
+
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+ result = (ShmemIndexEnt *) hash_search(ShmemIndex, name, HASH_FIND, &found);
+ if (!found)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shmem struct \"%s\" is not initialized", name)));
+
+ if (result->minimum_size == result->maximum_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shared memory struct \"%s\" is not resizable", name)));
+
+ /* Resizable structures are only supported with mmap-based shared memory. */
+ Assert(shared_memory_type == SHMEM_TYPE_MMAP);
+
+ /* Make at least [location, location+size) readable and writable */
+ rw_start = (char *) TYPEALIGN_DOWN(page_size, result->location);
+ rw_end = (char *) TYPEALIGN(page_size,
+ (char *) result->location + result->size);
+
+ /*
+ * Make remaining portion inaccessible while making sure that the portion
+ * after maximum_size is not affected since it may be used by other
+ * structures.
+ */
+ prot_end = (char *) TYPEALIGN_DOWN(page_size,
+ (char *) result->location + result->maximum_size);
+
+ LWLockRelease(ShmemIndexLock);
+
+ PGSharedMemoryProtect(rw_start, rw_end, prot_end);
+#endif
+}
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index f0efbf2aec1..5165b815cc1 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -91,6 +91,7 @@ extern bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2);
extern void PGSharedMemoryDetach(void);
extern void PGSharedMemoryEnsureFreed(void *addr, Size size);
extern void PGSharedMemoryEnsureAllocated(void *addr, Size size);
+extern void PGSharedMemoryProtect(void *rw_start, void *rw_end, void *prot_end);
extern void GetHugePageSize(Size *hugepagesize, int *mmap_flags);
extern Size GetOSPageSize(void);
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 0e6d5a63f28..f8ddb0dd7c0 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -185,6 +185,7 @@ typedef struct ShmemCallbacks
extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
extern bool ShmemAddrIsValid(const void *addr);
extern void ShmemResizeStruct(const char *name, Size new_size);
+extern void ShmemProtectStruct(const char *name);
/*
* These macros provide syntactic sugar for calling the underlying functions
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.c b/src/test/modules/resizable_shmem/resizable_shmem.c
index 6fd9e02f7b3..2727a678c57 100644
--- a/src/test/modules/resizable_shmem/resizable_shmem.c
+++ b/src/test/modules/resizable_shmem/resizable_shmem.c
@@ -56,10 +56,12 @@ static bool use_unknown_size = false;
static void resizable_shmem_request(void *arg);
static void resizable_shmem_shmem_init(void *arg);
+static void resizable_shmem_shmem_attach(void *arg);
static ShmemCallbacks shmem_callbacks = {
.request_fn = resizable_shmem_request,
.init_fn = resizable_shmem_shmem_init,
+ .attach_fn = resizable_shmem_shmem_attach,
};
/* SQL-callable functions */
@@ -172,10 +174,32 @@ resizable_shmem_shmem_init(void *arg)
*/
Assert(resizable_shmem != NULL);
+#ifdef HAVE_RESIZABLE_SHMEM
+ /* Protect the shared memory structure in this backend. */
+ ShmemProtectStruct("resizable_shmem");
+#endif
+
resizable_shmem->num_entries = test_initial_entries;
memset(resizable_shmem->data, 0, mul_size(test_initial_entries, TEST_ENTRY_SIZE));
}
+/*
+ * Protect the shared memory structure after attaching.
+ */
+static void
+resizable_shmem_shmem_attach(void *arg)
+{
+ /*
+ * Shared memory structure should have been already allocated and
+ * initialized. Only the protection needs to be set up here.
+ */
+ Assert(resizable_shmem != NULL);
+
+#ifdef HAVE_RESIZABLE_SHMEM
+ ShmemProtectStruct("resizable_shmem");
+#endif
+}
+
/*
* Resize the shared memory structure to accommodate the specified number of
* entries.
@@ -198,6 +222,7 @@ resizable_shmem_resize(PG_FUNCTION_ARGS)
new_size = add_size(offsetof(TestResizableShmemStruct, data),
mul_size(new_entries, TEST_ENTRY_SIZE));
ShmemResizeStruct("resizable_shmem", new_size);
+ ShmemProtectStruct("resizable_shmem");
resizable_shmem->num_entries = new_entries;
PG_RETURN_VOID();
@@ -217,6 +242,19 @@ resizable_shmem_write(PG_FUNCTION_ARGS)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("resizable_shmem is not initialized")));
+#ifdef HAVE_RESIZABLE_SHMEM
+
+ /*
+ * Ideally the structure should be protected through a synchronization
+ * cycle across all the backends that may access the structure. But we
+ * don't implement any such synchronization in this test module to keep it
+ * simple. Given that the ProcSignalBarrier mechanism is not extensible, we
+ * may not be able to do that here anyway. Hence apply the protection just
+ * before accessing the structure.
+ */
+ ShmemProtectStruct("resizable_shmem");
+#endif
+
/* Write the value to all current entries */
for (i = 0; i < resizable_shmem->num_entries; i++)
resizable_shmem->data[i] = entry_value;
@@ -245,6 +283,19 @@ resizable_shmem_read(PG_FUNCTION_ARGS)
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("entry_count %d is out of range (0..%d)", entry_count, resizable_shmem->num_entries)));
+#ifdef HAVE_RESIZABLE_SHMEM
+
+ /*
+ * Ideally the structure should be protected through a synchronization
+ * cycle across all the backends that may access the structure. But we
+ * don't implement any such synchronization in this test module to keep it
+ * simple. Given that the ProcSignalBarrier mechanism is not extensible, we
+ * may not be able to do that here anyway. Hence apply the protection just
+ * before accessing the structure.
+ */
+ ShmemProtectStruct("resizable_shmem");
+#endif
+
for (i = 0; i < entry_count; i++)
{
if (resizable_shmem->data[i] != entry_value)
--
2.34.1
[text/x-patch] v20260407-0001-resizable-shared-memory-structures.patch (71.0K, 3-v20260407-0001-resizable-shared-memory-structures.patch)
From 368db3bdce7f08795ea1271ed02860366633b66e Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Tue, 17 Feb 2026 16:51:20 +0530
Subject: [PATCH v20260407 1/2] resizable shared memory structures
Resizable shared memory structures can be allocated by specifying a new
member ShmemStructOpts::maximum_size. At startup or when the
structure is created, we reserve address space worth maximum_size in the
shared memory segment. It is expected that the subsystem which creates
the structure would initialize only the initial size worth of memory
when creating it. With mmap'ed memory, this should allocate memory
worth the initial size. It should not allocate maximum_size worth of
memory initially. As the structure is resized using ShmemResizeStruct(),
memory is freed or allocated in chunks of memory pages when shrinking
and expanding the structure respectively.
The resizable shared memory feature depends upon the existence of the
function madvise() and the constants MADV_REMOVE and MADV_POPULATE_WRITE.
On platforms which do not have these, we disable this feature at
compile time. The commit introduces a compile time flag
HAVE_RESIZABLE_SHMEM which is defined if MADV_REMOVE and
MADV_POPULATE_WRITE exist. We don't check existence of madvise
separately, since existence of the constants implies existence of the
function.
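
A rough standalone illustration of the reserve-then-populate scheme
described above, assuming Linux and an anonymous shared mapping. This is
not the patch's code: DemoRegion, the demo_* functions and
DEMO_HAVE_RESIZABLE are hypothetical stand-ins, and the use of
MAP_NORESERVE is an assumption made for the sketch. The populate constant
is spelled MADV_POPULATE_WRITE, as in the configure.ac check and in
sysv_shmem.c.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for HAVE_RESIZABLE_SHMEM: both constants must exist. */
#if defined(MADV_REMOVE) && defined(MADV_POPULATE_WRITE)
#define DEMO_HAVE_RESIZABLE 1
#else
#define DEMO_HAVE_RESIZABLE 0
#endif

typedef struct DemoRegion
{
	char	   *base;		/* start of the reserved address space */
	size_t		reserved;	/* page-aligned maximum size */
	size_t		size;		/* current logical size */
	size_t		page;		/* OS page size */
} DemoRegion;

/* Round n up to a multiple of page. */
size_t
demo_align_up(size_t n, size_t page)
{
	assert(page > 0);
	return (n + page - 1) / page * page;
}

/*
 * Grow or shrink the backed part of the region in whole pages. The start
 * address never changes; contents up to min(old, new) size are preserved.
 */
int
demo_resize(DemoRegion *r, size_t new_size)
{
	size_t		old_end = demo_align_up(r->size, r->page);
	size_t		new_end = demo_align_up(new_size, r->page);

	if (new_size > r->reserved)
	{
		errno = EINVAL;
		return -1;
	}
#if DEMO_HAVE_RESIZABLE
	if (new_end > old_end &&
		madvise(r->base + old_end, new_end - old_end,
				MADV_POPULATE_WRITE) != 0)
		return -1;			/* could not allocate the new pages */
	if (new_end < old_end &&
		madvise(r->base + new_end, old_end - new_end, MADV_REMOVE) != 0)
		return -1;			/* could not free the trailing pages */
	r->size = new_size;
	return 0;
#else
	(void) old_end;
	(void) new_end;
	errno = ENOSYS;			/* feature compiled out on this platform */
	return -1;
#endif
}

/* Reserve address space for maximum bytes, but back only initial bytes. */
int
demo_reserve(DemoRegion *r, size_t initial, size_t maximum)
{
	r->page = (size_t) sysconf(_SC_PAGESIZE);
	r->reserved = demo_align_up(maximum, r->page);
	r->size = 0;
	/* MAP_NORESERVE is an assumption of this sketch. */
	r->base = mmap(NULL, r->reserved, PROT_READ | PROT_WRITE,
				   MAP_SHARED | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (r->base == MAP_FAILED)
		return -1;
	if (demo_resize(r, initial) != 0)
	{
		munmap(r->base, r->reserved);
		return -1;
	}
	return 0;
}
```

A caller would do demo_reserve(&r, initial, maximum) once and then call
demo_resize(&r, new_size) as needed. Synchronization between the resizing
process and other users of the memory is deliberately left out here.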
HAVE_RESIZABLE_SHMEM is not defined in EXEC_BACKEND builds since that's
largely used for Windows where the APIs to free and allocate memory from
and to a given address space are not known to the author right now.
Given that PostgreSQL is used widely on Linux, providing this feature on
Linux benefits most of its users. Once we figure out the required
Windows APIs, we will support this feature on Windows as well.
The feature is also not available when Sys-V shared memory is used even
on Linux since we do not know whether required Sys-V APIs exist; mostly
they don't. Since that combination is only available for development and
testing, not supporting the feature there isn't going to impact
PostgreSQL users.
Using HAVE_RESIZABLE_SHMEM we disable compiling the code related to
resizable shared memory structures on the platforms which do not support
the feature. But we also have run time checks to disable this feature
when Sys-V shared memory is used. In order to know whether a given
running server instance supports resizable structures, we have
introduced the GUC have_resizable_shared_memory.
Author: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
---
configure.ac | 4 +
doc/src/sgml/config.sgml | 15 +
doc/src/sgml/system-views.sgml | 42 ++-
doc/src/sgml/xfunc.sgml | 54 +++
meson.build | 16 +
src/backend/port/sysv_shmem.c | 79 +++++
src/backend/port/win32_shmem.c | 45 +++
src/backend/storage/ipc/ipci.c | 11 +
src/backend/storage/ipc/shmem.c | 318 ++++++++++++++++--
src/backend/utils/misc/guc_parameters.dat | 7 +
src/backend/utils/misc/guc_tables.c | 7 +
src/include/catalog/pg_proc.dat | 4 +-
src/include/pg_config.h.in | 8 +
src/include/pg_config_manual.h | 9 +
src/include/storage/pg_shmem.h | 3 +
src/include/storage/shmem.h | 17 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/resizable_shmem/Makefile | 25 ++
src/test/modules/resizable_shmem/meson.build | 36 ++
.../resizable_shmem/resizable_shmem--1.0.sql | 39 +++
.../modules/resizable_shmem/resizable_shmem.c | 317 +++++++++++++++++
.../resizable_shmem/resizable_shmem.control | 4 +
.../resizable_shmem/t/001_resizable_shmem.pl | 241 +++++++++++++
.../test_shmem/t/001_late_shmem_alloc.pl | 31 ++
.../modules/test_shmem/test_shmem--1.0.sql | 4 +
src/test/modules/test_shmem/test_shmem.c | 14 +
src/test/regress/expected/rules.out | 7 +-
src/tools/pgindent/typedefs.list | 1 +
29 files changed, 1323 insertions(+), 37 deletions(-)
create mode 100644 src/test/modules/resizable_shmem/Makefile
create mode 100644 src/test/modules/resizable_shmem/meson.build
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.c
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.control
create mode 100644 src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
diff --git a/configure.ac b/configure.ac
index ff5dd64468e..7acd844ccb2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1895,6 +1895,10 @@ AC_CHECK_DECLS([memset_s], [], [], [#define __STDC_WANT_LIB_EXT1__ 1
# This is probably only present on macOS, but may as well check always
AC_CHECK_DECLS(F_FULLFSYNC, [], [], [#include <fcntl.h>])
+# Linux-specific madvise constants needed for resizable shared memory. See similar checks in meson.build for explanation of why these checks are here.
+AC_CHECK_DECLS([MADV_POPULATE_WRITE], [], [], [#include <sys/mman.h>])
+AC_CHECK_DECLS([MADV_REMOVE], [], [], [#include <sys/mman.h>])
+
AC_REPLACE_FUNCS(m4_normalize([
explicit_bzero
getopt
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3324d2d3c49..9f630b1a074 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -12138,6 +12138,21 @@ dynamic_library_path = '/usr/local/lib/postgresql:$libdir'
</listitem>
</varlistentry>
+ <varlistentry id="guc-have-resizable-shared-memory" xreflabel="have_resizable_shared_memory">
+ <term><varname>have_resizable_shared_memory</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>have_resizable_shared_memory</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Reports whether <productname>PostgreSQL</productname> has been built
+ with <literal>HAVE_RESIZABLE_SHMEM</literal> enabled and supports
+ <link linkend="xfunc-shared-addin-resizable">Resizable shared memory structures</link>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages-status" xreflabel="huge_pages_status">
<term><varname>huge_pages_status</varname> (<type>enum</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 2ebec6928d5..9bbbfdb37c5 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4243,8 +4243,46 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
Size of the allocation in bytes including padding. For anonymous
allocations, no information about padding is available, so the
<literal>size</literal> and <literal>allocated_size</literal> columns
- will always be equal. Padding is not meaningful for free memory, so
- the columns will be equal in that case also.
+ will always be equal. Padding is not meaningful for free memory, so the
+ columns will be equal in that case also. For resizable allocations which
+ may span multiple memory pages, the padding includes the padding due to
+ page alignment.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>minimum_size</structfield> <type>int8</type>
+ </para>
+ <para>
+ Minimum size in bytes that the resizable allocation can shrink to. Equals
+ <structfield>size</structfield> for fixed-size allocations, anonymous
+ allocations, and free memory.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>maximum_size</structfield> <type>int8</type>
+ </para>
+ <para>
+ Maximum size in bytes that the resizable allocation can grow to. Equals
+ <structfield>size</structfield> for fixed-size allocations, anonymous
+ allocations, and free memory.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>reserved_space</structfield> <type>int8</type>
+ </para>
+ <para>
+ Address space reserved for the allocation in bytes. For resizable
+ structures, this is the total address space reserved to accommodate
+ growth up to <structfield>maximum_size</structfield>, and is greater
+ than or equal to <structfield>allocated_size</structfield>. For
+ fixed-size allocations, anonymous allocations, and free memory this
+ is the same as <structfield>allocated_size</structfield>.
</para></entry>
</row>
</tbody>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 789cac9fcab..22f953db9d7 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3744,6 +3744,60 @@ my_shmem_init(void *arg)
</para>
</sect3>
+ <sect3 id="xfunc-shared-addin-resizable">
+ <title>Resizable shared memory structures</title>
+
+ <para>
+ A resizable memory structure can be requested using
+ <function>ShmemRequestStruct()</function> by passing
+ <parameter>maximum_size</parameter> along with
+ <parameter>size</parameter>. <parameter>maximum_size</parameter> is the
+ maximum size up to which the structure can grow, whereas
+ <parameter>size</parameter> is the initial size of the structure.
+ Optionally, <parameter>minimum_size</parameter> can be set to the minimum
+ size that the structure can shrink to. While
+ contiguous address space worth <parameter>maximum_size</parameter> is
+ allocated to the structure, only memory worth <parameter>size</parameter>
+ bytes is allocated initially. The <function>init_fn</function> should only
+ initialize the <parameter>size</parameter> amount of memory. The actual
+ memory allocated to this structure at any point in time is given by <link
+ linkend="view-pg-shmem-allocations"><structname>pg_shmem_allocations</structname>.<structfield>allocated_size</structfield></link>
+ and the address space reserved for this structure is given by <link
+ linkend="view-pg-shmem-allocations"><structname>pg_shmem_allocations</structname>.<structfield>reserved_space</structfield></link>.
+ </para>
+
+ <para>
+ The structure can be resized using <function>ShmemResizeStruct()</function>
+ by passing it the structure's <parameter>name</parameter> and the new size
+ which can be anywhere between <parameter>minimum_size</parameter> and
+ <parameter>maximum_size</parameter>. If the new size is smaller than the
+ current size of the structure, the memory between the new size and current
+ size is freed while keeping the contents of the memory up to the new size intact.
+ If the new size is greater than the current size, memory is allocated up to the
+ new size while keeping the current contents of the structure intact. The
+ starting address of the structure does not change because of the resizing
+ operation. The subsystem using this feature needs to take care of the
+ additional synchronization between the resizing process and the processes
+ using the shared structure. Also it needs to implement additional protection
+ to prevent access to the part of the address space beyond the size of the
+ structure when resizing it.
+ </para>
+
+ <para>
+ This functionality is available only on the platforms which provide the APIs
+ necessary to reserve contiguous address space and to allocate or free memory
+ in that address space on demand. The macro <symbol>HAVE_RESIZABLE_SHMEM</symbol>
+ is defined on such platforms. It can be used to guard code related to
+ resizing a shared memory structure. The functionality is available only with
+ mmap'ed memory, so subsystems which use resizable structures may have to
+ additionally disable resizable memory usage when
+ <symbol>shared_memory_type</symbol> is not <symbol>SHMEM_TYPE_MMAP</symbol>.
+ The GUC <xref linkend="guc-have-resizable-shared-memory"/> is set to
+ <literal>on</literal> when this functionality is available in a running
+ server, <literal>off</literal> otherwise.
+ </para>
+ </sect3>
+
<sect3 id="xfunc-shared-addin-dynamic">
<title>Allocating Dynamic Shared Memory After Startup</title>
diff --git a/meson.build b/meson.build
index 43d5ffc30b1..790845762e1 100644
--- a/meson.build
+++ b/meson.build
@@ -2904,6 +2904,22 @@ decl_checks = [
['timingsafe_bcmp', 'string.h'],
]
+# Linux-specific madvise constants needed for resizable shared memory.
+# Usually decl_checks is used to check for function declarations, but in this
+# case we are using it to detect the existence of constants. These constants are
+# used to define HAVE_RESIZABLE_SHMEM which is used in storage/pg_shmem.h as
+# well as storage/shmem.h. The first abstracts the APIs to allocate shared
+# memory segments from the operating system whereas the second abstracts APIs to
+# allocate shared memory to various subsystems. Since they are related but
+# orthogonal to each other, including any one of them in the other file doesn't
+# make sense. pg_config_manual.h is the only place where HAVE_RESIZABLE_SHMEM
+# can be defined and made available to both without including sys/mman.h. But
+# for that we need constants that indicate the existence of the following defines.
+decl_checks += [
+ ['MADV_POPULATE_WRITE', 'sys/mman.h'],
+ ['MADV_REMOVE', 'sys/mman.h'],
+]
+
# Need to check for function declarations for these functions, because
# checking for library symbols wouldn't handle deployment target
# restrictions on macOS
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 2e3886cf9fe..bb2a81417c6 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -589,6 +589,27 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
return true;
}
+/*
+ * Get the page size being used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been set up.
+ */
+Size
+GetOSPageSize(void)
+{
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ os_page_size = sysconf(_SC_PAGESIZE);
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
+
/*
* Creates an anonymous mmap()ed shared memory segment.
*
@@ -991,3 +1012,61 @@ PGSharedMemoryDetach(void)
AnonymousShmem = NULL;
}
}
+
+/*
+ * Make sure that the memory of given size from the given address is released.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureFreed(void *addr, Size size)
+{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only anonymous shared memory can be freed")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+
+ if (madvise(addr, size, MADV_REMOVE) == -1)
+ ereport(ERROR,
+ (errmsg("could not free shared memory: %m")));
+#endif
+}
+
+/*
+ * Make sure that the memory of given size from the given address is allocated.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureAllocated(void *addr, Size size)
+{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only anonymous shared memory can be allocated at runtime")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+
+ if (madvise(addr, size, MADV_POPULATE_WRITE) == -1)
+ ereport(ERROR,
+ (errmsg("could not allocate shared memory: %m")));
+#endif
+}
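For readers unfamiliar with the two madvise() calls used by PGSharedMemoryEnsureFreed() and PGSharedMemoryEnsureAllocated() above, here is a minimal standalone sketch of the underlying mechanism. It is not part of the patch: demo_free_and_repopulate() is a hypothetical helper, and the four-page mapping is an arbitrary choice for illustration. It assumes a Linux kernel that supports MADV_REMOVE on shared anonymous mappings and reports -1 when that is not the case.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * demo_free_and_repopulate() is a hypothetical helper, not part of the patch.
 * Returns 1 if freed pages read back as zeros while the kept pages retain
 * their contents, 0 if not, and -1 if the platform or kernel does not
 * support the calls involved.
 */
static int
demo_free_and_repopulate(void)
{
#if defined(MAP_ANONYMOUS) && defined(MADV_REMOVE)
	long		pagesz = sysconf(_SC_PAGESIZE);
	size_t		page;
	size_t		len;
	unsigned char *base;
	int			ok;

	if (pagesz <= 0)
		return -1;
	page = (size_t) pagesz;
	len = 4 * page;

	/* Reserve and back four pages of shared anonymous memory. */
	base = mmap(NULL, len, PROT_READ | PROT_WRITE,
				MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return -1;
	memset(base, 0xAB, len);

	/* "Shrink": release the last two pages; the address range stays mapped. */
	if (madvise(base + 2 * page, 2 * page, MADV_REMOVE) == -1)
	{
		munmap(base, len);
		return -1;
	}

	/* Kept pages are intact; released pages read back as zeros. */
	ok = (base[0] == 0xAB && base[2 * page] == 0 && base[len - 1] == 0);

#ifdef MADV_POPULATE_WRITE
	/* "Grow": ask the kernel to back the released pages again (Linux 5.14+). */
	(void) madvise(base + 2 * page, 2 * page, MADV_POPULATE_WRITE);
#endif

	munmap(base, len);
	return ok ? 1 : 0;
#else
	return -1;
#endif
}
```

The sketch passes page-aligned addresses and lengths to madvise(), which is the same precondition the patch asserts with TYPEALIGN() before calling it.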
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index 794e4fcb2ad..c1f30665e66 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -648,3 +648,48 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
}
return true;
}
+
+/*
+ * Get the page size used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been set up.
+ */
+Size
+GetOSPageSize(void)
+{
+ SYSTEM_INFO sysinfo;
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ GetSystemInfo(&sysinfo);
+ os_page_size = sysinfo.dwPageSize;
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
+
+/*
+ * PGSharedMemoryEnsureFreed / PGSharedMemoryEnsureAllocated
+ *
+ * Not supported on Windows. These are only meaningful on platforms with
+ * resizable shared memory (mmap + madvise).
+ */
+void
+PGSharedMemoryEnsureFreed(void *addr, Size size)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+}
+
+void
+PGSharedMemoryEnsureAllocated(void *addr, Size size)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index bf6b81e621b..4c6ece598b1 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -192,6 +192,17 @@ InitializeShmemGUCs(void)
Size size_b;
Size size_mb;
Size hp_size;
+ bool have_resizable_shmem;
+
+ /* Does this server support resizable shared memory? */
+#ifdef HAVE_RESIZABLE_SHMEM
+ have_resizable_shmem = (shared_memory_type == SHMEM_TYPE_MMAP);
+#else
+ have_resizable_shmem = false;
+#endif
+ SetConfigOption("have_resizable_shared_memory",
+ have_resizable_shmem ? "on" : "off",
+ PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
/*
* Calculate the shared memory size and round up to the nearest megabyte.
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 1ebffe5a32a..8f006967790 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,11 +19,11 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * This module provides facilities to allocate fixed-size structures in shared
- * memory, for things like variables shared between all backend processes.
- * Each such structure has a string name to identify it, specified when it is
- * requested. shmem_hash.c provides a shared hash table implementation on top
- * of that.
+ * This module provides facilities to allocate fixed-size as well as resizable
+ * structures in shared memory, for things like variables shared between all
+ * backend processes. Each such structure has a string name to identify it,
+ * specified when it is requested. shmem_hash.c provides a shared hash table
+ * implementation on top of fixed-size structures.
*
* Shared memory areas should usually not be allocated after postmaster
* startup, although we do allow small allocations later for the benefit of
@@ -102,6 +102,24 @@
* (*options->ptr), and calls the attach_fn callback, if any, for additional
* per-backend setup.
*
+ * Resizable shared memory structures
+ * ----------------------------------
+ *
+ * In order to allocate resizable shared memory structures, set
+ * ShmemRequestStructOpts::maximum_size to the maximum size that the structure
+ * can grow to. The address space for the maximum size will be reserved at
+ * startup, but memory is allocated or freed as the structure grows or shrinks
+ * respectively. ShmemRequestStructOpts::size should be set to the initial size
+ * of the structure, which is the amount of memory allocated at startup.
+ * Optionally, ShmemRequestStructOpts::minimum_size can be set to the minimum
+ * size that the structure can shrink to. After startup, the structure can be
+ * resized by calling ShmemResizeStruct(), passing it the name of the
+ * structure and the new size. ShmemResizeStruct() enforces that the new
+ * size is within [minimum_size, maximum_size].
+ *
+ * While resizable structures can be created after startup, the memory
+ * available for them is quite limited.
+ *
* Legacy ShmemInitStruct()/ShmemInitHash() functions
* --------------------------------------------------
*
@@ -167,6 +185,16 @@ typedef struct
ShmemRequestKind kind;
} ShmemRequest;
+/*
+ * A convenient macro to get the space required for a shmem request consistently.
+ * A resizable structure, requested by non-zero maximum_size, requires space for
+ * its maximum size. Please note that on the platforms that do not support
+ * resizable shmem, the maximum_size is ensured to be 0 i.e. all the structures
+ * are treated as fixed-size structures.
+ */
+#define SHMEM_REQUEST_SPACE_SIZE(request) \
+ ((request)->options->maximum_size > 0 ? (request)->options->maximum_size : (request)->options->size)
+
static List *pending_shmem_requests;
/*
@@ -269,6 +297,10 @@ typedef struct
void *location; /* location in shared mem */
Size size; /* # bytes requested for the structure */
Size allocated_size; /* # bytes actually allocated */
+ Size minimum_size; /* the minimum size the structure can shrink
+ * to */
+ Size maximum_size; /* the maximum size the structure can grow to */
+ Size reserved_space; /* the total address space reserved */
} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
@@ -277,6 +309,7 @@ static bool firstNumaTouch = true;
static void CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks);
static void InitShmemIndexEntry(ShmemRequest *request);
static bool AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok);
+static Size EstimateAllocatedSize(ShmemIndexEnt *entry);
Datum pg_numa_available(PG_FUNCTION_ARGS);
@@ -342,11 +375,25 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->name == NULL)
elog(ERROR, "shared memory request is missing 'name' option");
+#ifndef HAVE_RESIZABLE_SHMEM
+ if (options->maximum_size > 0)
+ elog(ERROR, "resizable shared memory is not supported on this platform");
+#else
+ if (options->maximum_size > 0 && shared_memory_type != SHMEM_TYPE_MMAP)
+ elog(ERROR, "resizable shared memory requires shared_memory_type = mmap");
+#endif
+
if (IsUnderPostmaster)
{
if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
+ if (options->minimum_size < 0 && options->minimum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid minimum_size %zd for shared memory request for \"%s\"",
+ options->minimum_size, options->name);
+ if (options->maximum_size < 0 && options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
+ options->maximum_size, options->name);
}
else
{
@@ -355,12 +402,36 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->size <= 0)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
+ if (options->minimum_size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->minimum_size < 0)
+ elog(ERROR, "invalid minimum_size %zd for shared memory request for \"%s\"",
+ options->minimum_size, options->name);
+ if (options->maximum_size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->maximum_size < 0)
+ elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
+ options->maximum_size, options->name);
}
if (options->alignment != 0 && pg_nextpower2_size_t(options->alignment) != options->alignment)
elog(ERROR, "invalid alignment %zu for shared memory request for \"%s\"",
options->alignment, options->name);
+ if (options->minimum_size > 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE &&
+ options->minimum_size > options->size)
+ elog(ERROR, "resizable shared memory structure \"%s\" should have minimum size (%zd) less than or equal to size (%zd)",
+ options->name, options->minimum_size, options->size);
+
+ if (options->maximum_size > 0 && options->size > options->maximum_size)
+ elog(ERROR, "resizable shared memory structure \"%s\" should have maximum size (%zd) greater than or equal to size (%zd)",
+ options->name, options->maximum_size, options->size);
+
+ if (options->minimum_size > 0 && options->maximum_size > 0 &&
+ options->minimum_size > options->maximum_size)
+ elog(ERROR, "resizable shared memory structure \"%s\" should have minimum size (%zd) less than or equal to maximum size (%zd)",
+ options->name, options->minimum_size, options->maximum_size);
+
/* Check that we're in the right state */
if (shmem_request_state != SRS_REQUESTING)
elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
@@ -382,8 +453,8 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
}
/*
- * ShmemGetRequestedSize() --- estimate the total size of all registered shared
- * memory structures.
+ * ShmemGetRequestedSize() --- estimate the total size of all registered shared
+ * memory structures.
*
* This is called at postmaster startup, before the shared memory segment has
* been created.
@@ -408,7 +479,7 @@ ShmemGetRequestedSize(void)
alignment = PG_CACHE_LINE_SIZE;
size = TYPEALIGN(alignment, size);
- size = add_size(size, request->options->size);
+ size = add_size(size, SHMEM_REQUEST_SPACE_SIZE(request));
}
return size;
@@ -515,6 +586,7 @@ InitShmemIndexEntry(ShmemRequest *request)
ShmemIndexEnt *index_entry;
bool found;
size_t allocated_size;
+ size_t requested_size;
void *structPtr;
/* look it up in the shmem index */
@@ -532,10 +604,18 @@ InitShmemIndexEntry(ShmemRequest *request)
}
/*
- * We inserted the entry to the shared memory index. Allocate requested
- * amount of shared memory for it, and initialize the index entry.
+ * We inserted the entry to the shared memory index. Allocate requested
+ * amount of address space in the shared memory segment for it, and do
+ * basic initialization. The memory gets allocated during initialization as
+ * the corresponding memory pages are written to. Allocate enough space
+ * for a resizable structure to grow to its maximum size. It is expected
+ * that the initialization callback will use only as much memory as the
+ * initial size of the resizable structure. (Well, if it doesn't, more
+ * memory will be allocated initially than expected, no further harm is
+ * done.)
*/
- structPtr = ShmemAllocRaw(request->options->size,
+ requested_size = SHMEM_REQUEST_SPACE_SIZE(request);
+ structPtr = ShmemAllocRaw(requested_size,
request->options->alignment,
&allocated_size);
if (structPtr == NULL)
@@ -544,13 +624,27 @@ InitShmemIndexEntry(ShmemRequest *request)
hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
+ errmsg("not enough shared memory space for data structure"
" \"%s\" (%zu bytes requested)",
- name, request->options->size)));
+ name, requested_size)));
}
index_entry->size = request->options->size;
index_entry->allocated_size = allocated_size;
index_entry->location = structPtr;
+ index_entry->reserved_space = allocated_size;
+ if (request->options->maximum_size > 0)
+ {
+ index_entry->minimum_size = request->options->minimum_size;
+ index_entry->maximum_size = request->options->maximum_size;
+
+ /* Adjust allocated size of a resizable structure. */
+ index_entry->allocated_size = EstimateAllocatedSize(index_entry);
+ }
+ else
+ {
+ index_entry->minimum_size = request->options->size;
+ index_entry->maximum_size = request->options->size;
+ }
/* Initialize depending on the kind of shmem area it is */
switch (request->kind)
@@ -595,7 +689,7 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
return false;
}
- /* Check that the size in the index matches the request */
+ /* Check that the sizes in the index match the request. */
if (index_entry->size != request->options->size &&
request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
{
@@ -605,6 +699,40 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
name, index_entry->size, request->options->size)));
}
+ /*
+ * For resizable structures, also check that minimum_size and maximum_size
+ * match. For fixed-size structures, these are derived (set to size) in
+ * the index entry and not meaningful in the request.
+ */
+ if (request->options->maximum_size != 0)
+ {
+ if (index_entry->minimum_size != request->options->minimum_size &&
+ request->options->minimum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" was created with"
+ " different minimum_size: existing %zu, requested %zu",
+ name, index_entry->minimum_size,
+ request->options->minimum_size)));
+ }
+
+ if (index_entry->maximum_size != request->options->maximum_size &&
+ request->options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" was created with"
+ " different maximum_size: existing %zu, requested %zu",
+ name, index_entry->maximum_size,
+ request->options->maximum_size)));
+ }
+ }
+ else
+ {
+ if (index_entry->minimum_size != index_entry->maximum_size)
+ elog(ERROR, "shared memory struct \"%s\" was created as resizable, but requested as fixed-size",
+ name);
+ }
+
/*
* Re-establish the caller's pointer variable, or do other actions to
* attach depending on the kind of shmem area it is.
@@ -626,6 +754,127 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
return true;
}
+/*
+ * Estimate the actual memory allocated for a resizable structure.
+ *
+ * The estimate is based on the assumption that memory is allocated in pages.
+ *
+ * The memory pages covered by the current size of a resizable structure are
+ * fully allocated when the currently allocated part of the structure is written
+ * to. The memory page where the maximal structure ends also hosts the next
+ * structure, unless the maximal structure ends on a page boundary. Hence that
+ * page is allocated when the next structure is written to. The memory pages
+ * between the page where the current structure ends and the page where the next
+ * structure starts remain unallocated. Thus the memory allocated for a
+ * resizable structure can be estimated as the total address space reserved for
+ * the structure minus the unallocated memory pages between the current end and
+ * the next structure.
+ */
+static Size
+EstimateAllocatedSize(ShmemIndexEnt *entry)
+{
+ Size page_size = GetOSPageSize();
+ char *align_end = (char *) TYPEALIGN(page_size, (char *) entry->location + entry->size);
+ char *floor_max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) entry->location + entry->maximum_size);
+
+ Assert(entry->maximum_size >= entry->size);
+ Assert(entry->reserved_space >= entry->maximum_size);
+
+ if (align_end < floor_max_end)
+ return entry->reserved_space - (floor_max_end - align_end);
+
+ return entry->reserved_space;
+}
+
+/*
+ * ShmemResizeStruct() --- resize a resizable shared memory structure.
+ *
+ * The new size must be within [minimum_size, maximum_size]. If the structure
+ * is being shrunk, the memory pages that are no longer needed are freed. If
+ * the structure is being expanded, the memory pages that are needed for the
+ * new size are allocated. See EstimateAllocatedSize() for explanation of which
+ * pages are allocated for a resizable structure.
+ */
+void
+ShmemResizeStruct(const char *name, Size new_size)
+{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
+ ShmemIndexEnt *result;
+ bool found;
+ Size page_size = GetOSPageSize();
+ char *new_end;
+
+ Assert(new_size > 0);
+
+ /*
+ * Resizable shared memory structures are only supported with mmap'ed
+ * memory.
+ */
+ Assert(shared_memory_type == SHMEM_TYPE_MMAP);
+
+ /* look it up in the shmem index */
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ result = (ShmemIndexEnt *) hash_search(ShmemIndex, name, HASH_FIND, &found);
+ if (!found)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shmem struct \"%s\" is not initialized", name)));
+
+ Assert(result);
+
+ if (result->minimum_size == result->maximum_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shared memory struct \"%s\" is not resizable", name)));
+
+ if (new_size < result->minimum_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+ errmsg("cannot shrink shared memory structure \"%s\" below minimum size"
+ " (requested %zu bytes, minimum %zu bytes)",
+ name, new_size, result->minimum_size)));
+
+ if (result->maximum_size < new_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+ errmsg("not enough address space is reserved for resizing structure \"%s\""
+ " (required %zu bytes, reserved %zu bytes)",
+ name, new_size, result->maximum_size)));
+
+ /*
+ * When shrinking, the pages from the page-aligned new end to the start of
+ * the page containing the end of the reserved space are no longer required.
+ * When expanding, the pages from the start of the page containing the start
+ * of the structure to the page-aligned new end are required.
+ */
+ new_end = (char *) TYPEALIGN(page_size, (char *) result->location + new_size);
+ if (new_size < result->size)
+ {
+ char *max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location + result->maximum_size);
+
+ if (max_end > new_end)
+ PGSharedMemoryEnsureFreed(new_end, max_end - new_end);
+ }
+ else if (new_size > result->size)
+ {
+ char *struct_start = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location);
+
+ if (new_end > struct_start)
+ PGSharedMemoryEnsureAllocated(struct_start, new_end - struct_start);
+ }
+
+ /* Update shmem index entry. */
+ result->size = new_size;
+ result->allocated_size = EstimateAllocatedSize(result);
+
+ LWLockRelease(ShmemIndexLock);
+#endif
+}
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
@@ -732,6 +981,11 @@ InitShmemAllocator(PGShmemHeader *seghdr)
Assert(!found);
result->size = ShmemAllocator->index_size;
result->allocated_size = ShmemAllocator->index_size;
+#ifdef HAVE_RESIZABLE_SHMEM
+ result->minimum_size = result->size;
+ result->maximum_size = result->size;
+ result->reserved_space = result->allocated_size;
+#endif
result->location = ShmemAllocator->index;
}
}
@@ -1075,7 +1329,7 @@ mul_size(Size s1, Size s2)
Datum
pg_get_shmem_allocations(PG_FUNCTION_ARGS)
{
-#define PG_GET_SHMEM_SIZES_COLS 4
+#define PG_GET_SHMEM_SIZES_COLS 7
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
HASH_SEQ_STATUS hstat;
ShmemIndexEnt *ent;
@@ -1097,7 +1351,17 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
values[1] = Int64GetDatum((char *) ent->location - (char *) ShmemSegHdr);
values[2] = Int64GetDatum(ent->size);
values[3] = Int64GetDatum(ent->allocated_size);
- named_allocated += ent->allocated_size;
+ values[4] = Int64GetDatum(ent->minimum_size);
+ values[5] = Int64GetDatum(ent->maximum_size);
+ values[6] = Int64GetDatum(ent->reserved_space);
+
+ /*
+ * Keep track of the total reserved space for named shmem areas, to be
+ * able to calculate the amount of shared memory allocated for
+ * anonymous areas and the amount of free shared memory at the end of
+ * the segment.
+ */
+ named_allocated += ent->reserved_space;
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
values, nulls);
@@ -1108,6 +1372,9 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = true;
values[2] = Int64GetDatum(ShmemAllocator->free_offset - named_allocated);
values[3] = values[2];
+ values[4] = values[2];
+ values[5] = values[2];
+ values[6] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
/* output as-of-yet unused shared memory */
@@ -1116,6 +1383,9 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = false;
values[2] = Int64GetDatum(ShmemSegHdr->totalsize - ShmemAllocator->free_offset);
values[3] = values[2];
+ values[4] = values[2];
+ values[5] = values[2];
+ values[6] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
LWLockRelease(ShmemIndexLock);
@@ -1303,23 +1573,9 @@ pg_get_shmem_allocations_numa(PG_FUNCTION_ARGS)
Size
pg_get_shmem_pagesize(void)
{
- Size os_page_size;
-#ifdef WIN32
- SYSTEM_INFO sysinfo;
-
- GetSystemInfo(&sysinfo);
- os_page_size = sysinfo.dwPageSize;
-#else
- os_page_size = sysconf(_SC_PAGESIZE);
-#endif
-
Assert(IsUnderPostmaster);
- Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
-
- if (huge_pages_status == HUGE_PAGES_ON)
- GetHugePageSize(&os_page_size, NULL);
- return os_page_size;
+ return GetOSPageSize();
}
Datum
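The page arithmetic described in the EstimateAllocatedSize() comment above can be tried out in isolation. The following is a minimal sketch, not the patch's code: align_up(), align_down() and estimate_allocated() are hypothetical stand-ins for TYPEALIGN(), TYPEALIGN_DOWN() and the static function, they take plain integers instead of a ShmemIndexEnt, and they assume the page size is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round v up to the next multiple of a (a must be a power of two). */
static uintptr_t
align_up(uintptr_t v, uintptr_t a)
{
	return (v + a - 1) & ~(a - 1);
}

/* Round v down to the previous multiple of a (a must be a power of two). */
static uintptr_t
align_down(uintptr_t v, uintptr_t a)
{
	return v & ~(a - 1);
}

/*
 * Hypothetical standalone version of the estimate: the reserved space minus
 * the whole pages lying between the page-aligned current end and the start
 * of the page that contains the maximal end.
 */
static size_t
estimate_allocated(uintptr_t location, size_t size, size_t maximum_size,
				   size_t reserved_space, size_t page_size)
{
	uintptr_t	align_end = align_up(location + size, page_size);
	uintptr_t	floor_max_end = align_down(location + maximum_size, page_size);

	assert(maximum_size >= size);
	assert(reserved_space >= maximum_size);

	/* Pages in [align_end, floor_max_end) are not backed by memory. */
	if (align_end < floor_max_end)
		return reserved_space - (size_t) (floor_max_end - align_end);

	return reserved_space;
}
```

As a worked example with 4096-byte pages: a structure at offset 1000 with size 5000 and maximum_size 20000 (and 20000 bytes reserved) ends at 6000, which rounds up to 8192, while its maximal end 21000 rounds down to 20480. The 12288 bytes in between are unallocated, so the estimate is 20000 - 12288 = 7712 bytes.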
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index fcb6ab80583..22b7e461d3a 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -1219,6 +1219,13 @@
max => '1000.0',
},
+{ name => 'have_resizable_shared_memory', type => 'bool', context => 'PGC_INTERNAL', group => 'PRESET_OPTIONS',
+ short_desc => 'Shows whether the running server supports resizable shared memory.',
+ flags => 'GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE',
+ variable => 'have_resizable_shared_memory_enabled',
+ boot_val => 'HAVE_RESIZABLE_SHARED_MEMORY_ENABLED',
+},
+
{ name => 'hba_file', type => 'string', context => 'PGC_POSTMASTER', group => 'FILE_LOCATIONS',
short_desc => 'Sets the server\'s "hba" configuration file.',
flags => 'GUC_SUPERUSER_ONLY',
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d9ca13baff9..924f95a4a70 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -653,6 +653,13 @@ static bool assert_enabled = DEFAULT_ASSERT_ENABLED;
#endif
static bool exec_backend_enabled = EXEC_BACKEND_ENABLED;
+#ifdef HAVE_RESIZABLE_SHMEM
+#define HAVE_RESIZABLE_SHARED_MEMORY_ENABLED true
+#else
+#define HAVE_RESIZABLE_SHARED_MEMORY_ENABLED false
+#endif
+static bool have_resizable_shared_memory_enabled = HAVE_RESIZABLE_SHARED_MEMORY_ENABLED;
+
static char *recovery_target_timeline_string;
static char *recovery_target_string;
static char *recovery_target_xid_string;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 99fa9a6ede2..3a622525dfc 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8709,8 +8709,8 @@
{ oid => '5052', descr => 'allocations from the main shared memory segment',
proname => 'pg_get_shmem_allocations', prorows => '50', proretset => 't',
provolatile => 'v', prorettype => 'record', proargtypes => '',
- proallargtypes => '{text,int8,int8,int8}', proargmodes => '{o,o,o,o}',
- proargnames => '{name,off,size,allocated_size}',
+ proallargtypes => '{text,int8,int8,int8,int8,int8,int8}', proargmodes => '{o,o,o,o,o,o,o}',
+ proargnames => '{name,off,size,allocated_size,minimum_size,maximum_size,reserved_space}',
prosrc => 'pg_get_shmem_allocations',
proacl => '{POSTGRES=X,pg_read_all_stats=X}' },
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 9f6d512347e..8f2a59ec3a8 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -85,6 +85,14 @@
don't. */
#undef HAVE_DECL_F_FULLFSYNC
+/* Define to 1 if you have the declaration of `MADV_POPULATE_WRITE', and to 0
+ if you don't. */
+#undef HAVE_DECL_MADV_POPULATE_WRITE
+
+/* Define to 1 if you have the declaration of `MADV_REMOVE', and to 0 if you
+ don't. */
+#undef HAVE_DECL_MADV_REMOVE
+
/* Define to 1 if you have the declaration of `memset_s', and to 0 if you
don't. */
#undef HAVE_DECL_MEMSET_S
diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index 521b49b8888..b09d6c91324 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -131,6 +131,15 @@
#define EXEC_BACKEND
#endif
+/*
+ * HAVE_RESIZABLE_SHMEM indicates whether resizable shared memory structures are
+ * supported. The implementation requires Linux-specific madvise constants
+ * (MADV_REMOVE and MADV_POPULATE_WRITE).
+ */
+#if HAVE_DECL_MADV_REMOVE && HAVE_DECL_MADV_POPULATE_WRITE && !defined(EXEC_BACKEND)
+#define HAVE_RESIZABLE_SHMEM
+#endif
+
/*
* USE_POSIX_FADVISE controls whether Postgres will attempt to use the
* posix_fadvise() kernel call. Usually the automatic configure tests are
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index 10c7b065861..f0efbf2aec1 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -89,6 +89,9 @@ extern PGShmemHeader *PGSharedMemoryCreate(Size size,
PGShmemHeader **shim);
extern bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2);
extern void PGSharedMemoryDetach(void);
+extern void PGSharedMemoryEnsureFreed(void *addr, Size size);
+extern void PGSharedMemoryEnsureAllocated(void *addr, Size size);
extern void GetHugePageSize(Size *hugepagesize, int *mmap_flags);
+extern Size GetOSPageSize(void);
#endif /* PG_SHMEM_H */
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index af7fe893bc4..0e6d5a63f28 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -57,6 +57,22 @@ typedef struct ShmemStructOpts
*/
size_t alignment;
+ /*
+ * Minimum size this structure can shrink to. Should be set to 0 for
+ * fixed-size structures.
+ */
+ ssize_t minimum_size;
+
+ /*
+ * Maximum size this structure can grow up to in the future. The memory is not
+ * allocated right away but the corresponding address space is reserved so
+ * that memory can be mapped to it when the structure grows. Typically
+ * should be used for large resizable structures which need several pages
+ * worth of contiguous memory. Should be set to 0 for fixed-size
+ * structures.
+ */
+ ssize_t maximum_size;
+
/*
* When the shmem area is initialized or attached to, pointer to it is
* stored in *ptr. It usually points to a global variable, used to access
@@ -168,6 +184,7 @@ typedef struct ShmemCallbacks
extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
extern bool ShmemAddrIsValid(const void *addr);
+extern void ShmemResizeStruct(const char *name, Size new_size);
/*
* These macros provide syntactic sugar for calling the underlying functions
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 0a74ab5c86f..fa29f486354 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -14,6 +14,7 @@ SUBDIRS = \
libpq_pipeline \
oauth_validator \
plsample \
+ resizable_shmem \
spgist_name_ops \
test_aio \
test_autovacuum \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 4bca42bb370..d69c37d3d6a 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -13,6 +13,7 @@ subdir('libpq_pipeline')
subdir('nbtree')
subdir('oauth_validator')
subdir('plsample')
+subdir('resizable_shmem')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
diff --git a/src/test/modules/resizable_shmem/Makefile b/src/test/modules/resizable_shmem/Makefile
new file mode 100644
index 00000000000..86bf17bef4a
--- /dev/null
+++ b/src/test/modules/resizable_shmem/Makefile
@@ -0,0 +1,25 @@
+# src/test/modules/resizable_shmem/Makefile
+
+PGFILEDESC = "resizable_shmem - test module for resizable shared memory"
+
+MODULES = resizable_shmem
+
+EXTENSION = resizable_shmem
+DATA = resizable_shmem--1.0.sql
+
+TAP_TESTS = 1
+
+# This test requires library to be loaded at the server start, so disable
+# installcheck
+NO_INSTALLCHECK = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/resizable_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/resizable_shmem/meson.build b/src/test/modules/resizable_shmem/meson.build
new file mode 100644
index 00000000000..493bbbc95c3
--- /dev/null
+++ b/src/test/modules/resizable_shmem/meson.build
@@ -0,0 +1,36 @@
+# src/test/modules/resizable_shmem/meson.build
+
+resizable_shmem_sources = files(
+ 'resizable_shmem.c',
+)
+
+if host_system == 'windows'
+ resizable_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'resizable_shmem',
+ '--FILEDESC', 'resizable_shmem - test module for resizable shared memory',])
+endif
+
+resizable_shmem = shared_module('resizable_shmem',
+ resizable_shmem_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += resizable_shmem
+
+test_install_data += files(
+ 'resizable_shmem.control',
+ 'resizable_shmem--1.0.sql',
+)
+
+tests += {
+ 'name': 'resizable_shmem',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_resizable_shmem.pl',
+ ],
+ # This test requires library to be loaded at the server start, so disable
+ # installcheck
+ 'runningcheck': false,
+ },
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
new file mode 100644
index 00000000000..b4b07336dc3
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
@@ -0,0 +1,39 @@
+/* src/test/modules/resizable_shmem/resizable_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION resizable_shmem" to load this file. \quit
+
+-- Function to resize the test structure in the shared memory
+CREATE FUNCTION resizable_shmem_resize(new_entries integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to write data to all entries in the test structure in shared memory
+-- Writing all the entries makes sure that the memory is actually allocated and
+-- mapped to the process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_write(entry_value integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to verify that specified number of initial entries have expected value.
+-- Reading all the entries makes sure that the memory is actually mapped to the
+-- process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_read(entry_count integer, entry_value integer)
+RETURNS boolean
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to report memory mapped against the main shared memory segment in
+-- the backend where this function runs.
+CREATE FUNCTION resizable_shmem_usage()
+RETURNS bigint
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to get the shared memory page size
+CREATE FUNCTION resizable_shmem_pagesize()
+RETURNS integer
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.c b/src/test/modules/resizable_shmem/resizable_shmem.c
new file mode 100644
index 00000000000..6fd9e02f7b3
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.c
@@ -0,0 +1,317 @@
+/* -------------------------------------------------------------------------
+ *
+ * resizable_shmem.c
+ * Test module for PostgreSQL's resizable shared memory functionality
+ *
+ * This module demonstrates and tests the resizable shared memory API
+ * provided by shmem.c/shmem.h.
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <limits.h>
+#include <stdio.h>
+
+#include "commands/extension.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+#include "storage/spin.h"
+#include "utils/builtins.h"
+#include "utils/guc.h"
+#include "utils/memutils.h"
+
+PG_MODULE_MAGIC;
+
+/* Default values for the GUCs controlling structure size */
+#define TEST_INITIAL_ENTRIES_DEFAULT (25 * 1024 * 1024) /* ~100MB */
+#define TEST_MAX_ENTRIES_DEFAULT (100 * 1024 * 1024) /* ~400MB */
+
+#define TEST_ENTRY_SIZE sizeof(int32) /* Size of each entry */
+
+/*
+ * Resizable test data structure stored in shared memory.
+ *
+ * The test performs resizing, reads or writes, only one at a time and never
+ * concurrently. Hence, there is no need for locks in the test structure.
+ */
+typedef struct TestResizableShmemStruct
+{
+ /* Metadata */
+ int32 num_entries; /* Number of entries that can fit */
+
+ /* Data area - variable size */
+ int32 data[FLEXIBLE_ARRAY_MEMBER];
+} TestResizableShmemStruct;
+
+static TestResizableShmemStruct *resizable_shmem = NULL;
+
+/* GUC variables controlling the size of the test structure */
+static int test_initial_entries;
+static int test_max_entries;
+
+/* Whether to use SHMEM_ATTACH_UNKNOWN_SIZE when attaching to the shared memory */
+static bool use_unknown_size = false;
+
+static void resizable_shmem_request(void *arg);
+static void resizable_shmem_shmem_init(void *arg);
+
+static ShmemCallbacks shmem_callbacks = {
+ .request_fn = resizable_shmem_request,
+ .init_fn = resizable_shmem_shmem_init,
+};
+
+/* SQL-callable functions */
+PG_FUNCTION_INFO_V1(resizable_shmem_resize);
+PG_FUNCTION_INFO_V1(resizable_shmem_write);
+PG_FUNCTION_INFO_V1(resizable_shmem_read);
+PG_FUNCTION_INFO_V1(resizable_shmem_usage);
+PG_FUNCTION_INFO_V1(resizable_shmem_pagesize);
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+ int guc_context;
+
+ /*
+ * Use PGC_POSTMASTER when loaded at startup so the values are fixed once
+ * the shared memory segment is created. When loaded after startup
+ * PGC_POSTMASTER is not allowed, so we use PGC_SIGHUP instead. Although
+ * we do not intend to change these values at config reload, PGC_SIGHUP is
+ * the least permissive context that allows defining the GUC after startup
+ * and still prevents it from being changed via SET.
+ */
+ if (process_shared_preload_libraries_in_progress)
+ guc_context = PGC_POSTMASTER;
+ else
+ {
+ guc_context = PGC_SIGHUP;
+ shmem_callbacks.flags = SHMEM_CALLBACKS_ALLOW_AFTER_STARTUP;
+ }
+
+ DefineCustomIntVariable("resizable_shmem.initial_entries",
+ "Initial number of entries in the test structure.",
+ NULL,
+ &test_initial_entries,
+ TEST_INITIAL_ENTRIES_DEFAULT,
+ 1,
+ INT_MAX,
+ guc_context,
+ 0,
+ NULL, NULL, NULL);
+
+ DefineCustomIntVariable("resizable_shmem.max_entries",
+ "Maximum number of entries in the test structure.",
+ NULL,
+ &test_max_entries,
+ TEST_MAX_ENTRIES_DEFAULT,
+ 1,
+ INT_MAX,
+ guc_context,
+ 0,
+ NULL, NULL, NULL);
+
+ /*
+ * When loaded after startup by a backend that is not creating the
+ * extension, the shared memory might have been resized to a size other
+ * than the initial size. Use SHMEM_ATTACH_UNKNOWN_SIZE to attach without
+ * knowing the exact size.
+ */
+ if (!process_shared_preload_libraries_in_progress && !creating_extension)
+ use_unknown_size = true;
+
+ RegisterShmemCallbacks(&shmem_callbacks);
+}
+
+/*
+ * Request shared memory resources
+ */
+static void
+resizable_shmem_request(void *arg)
+{
+ Size initial_size = add_size(offsetof(TestResizableShmemStruct, data),
+ mul_size(test_initial_entries, TEST_ENTRY_SIZE));
+
+/*
+ * Create a resizable structure on platforms that support it; otherwise
+ * create a fixed-size structure. Another way would be to conditionally
+ * include .maximum_size in the call to ShmemRequestStruct().
+ */
+#ifdef HAVE_RESIZABLE_SHMEM
+ Size max_size = add_size(offsetof(TestResizableShmemStruct, data),
+ mul_size(test_max_entries, TEST_ENTRY_SIZE));
+ Size min_size = offsetof(TestResizableShmemStruct, data);
+
+#else
+ Size max_size = 0;
+ Size min_size = 0;
+#endif
+
+ /* Register our resizable shared memory structure */
+ ShmemRequestStruct(.name = "resizable_shmem",
+ .size = use_unknown_size ? SHMEM_ATTACH_UNKNOWN_SIZE : initial_size,
+ .minimum_size = min_size,
+ .maximum_size = max_size,
+ .ptr = (void **) &resizable_shmem,
+ );
+}
+
+/*
+ * Initialize shared memory structure
+ */
+static void
+resizable_shmem_shmem_init(void *arg)
+{
+ /*
+ * Shared memory structure should have been already allocated. Initialize
+ * it.
+ */
+ Assert(resizable_shmem != NULL);
+
+ resizable_shmem->num_entries = test_initial_entries;
+ memset(resizable_shmem->data, 0, mul_size(test_initial_entries, TEST_ENTRY_SIZE));
+}
+
+/*
+ * Resize the shared memory structure to accommodate the specified number of
+ * entries.
+ *
+ * On platforms that do not support resizable shared memory,
+ * ShmemResizeStruct() will raise an error, so this function will fail if the
+ * caller tries to resize the structure.
+ */
+Datum
+resizable_shmem_resize(PG_FUNCTION_ARGS)
+{
+ int32 new_entries = PG_GETARG_INT32(0);
+ Size new_size;
+
+ if (!resizable_shmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ new_size = add_size(offsetof(TestResizableShmemStruct, data),
+ mul_size(new_entries, TEST_ENTRY_SIZE));
+ ShmemResizeStruct("resizable_shmem", new_size);
+ resizable_shmem->num_entries = new_entries;
+
+ PG_RETURN_VOID();
+}
+
+/*
+ * Write the given integer value to all entries in the data array.
+ */
+Datum
+resizable_shmem_write(PG_FUNCTION_ARGS)
+{
+ int32 entry_value = PG_GETARG_INT32(0);
+ int32 i;
+
+ if (!resizable_shmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ /* Write the value to all current entries */
+ for (i = 0; i < resizable_shmem->num_entries; i++)
+ resizable_shmem->data[i] = entry_value;
+
+ PG_RETURN_VOID();
+}
+
+/*
+ * Check whether the first 'entry_count' entries all have the expected 'entry_value'.
+ * Returns true if all match, false otherwise.
+ */
+Datum
+resizable_shmem_read(PG_FUNCTION_ARGS)
+{
+ int32 entry_count = PG_GETARG_INT32(0);
+ int32 entry_value = PG_GETARG_INT32(1);
+ int32 i;
+
+ if (resizable_shmem == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ if (entry_count < 0 || entry_count > resizable_shmem->num_entries)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("entry_count %d is out of range (0..%d)", entry_count, resizable_shmem->num_entries)));
+
+ for (i = 0; i < entry_count; i++)
+ {
+ if (resizable_shmem->data[i] != entry_value)
+ PG_RETURN_BOOL(false);
+ }
+
+ PG_RETURN_BOOL(true);
+}
+
+/*
+ * Return the memory mapped against the main shared memory segment in this
+ * backend.
+ *
+ * The VMA containing our resizable_shmem pointer is used to determine the main
+ * memory segment. RSS + Swap (in bytes) for that VMA from /proc/self/smaps is
+ * returned.
+ */
+Datum
+resizable_shmem_usage(PG_FUNCTION_ARGS)
+{
+ FILE *f;
+ char line[256];
+ int64 rss_kb = -1;
+ int64 swap_kb = -1;
+ uintptr_t target = (uintptr_t) resizable_shmem;
+ bool in_target_vma = false;
+ size_t result;
+
+ f = fopen("/proc/self/smaps", "r");
+ if (f == NULL)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open /proc/self/smaps: %m")));
+
+ while (fgets(line, sizeof(line), f) != NULL)
+ {
+ unsigned long start;
+ unsigned long end;
+
+ if (sscanf(line, "%lx-%lx", &start, &end) == 2)
+ {
+ in_target_vma = (target >= start && target < end);
+ }
+ else if (in_target_vma)
+ {
+ if (rss_kb == -1)
+ sscanf(line, "Rss: %ld kB", &rss_kb);
+ if (swap_kb == -1)
+ sscanf(line, "Swap: %ld kB", &swap_kb);
+ if (rss_kb >= 0 && swap_kb >= 0)
+ break;
+ }
+ }
+
+ fclose(f);
+
+ result = rss_kb >= 0 ? mul_size(rss_kb, 1024) : 0;
+ result = add_size(result, swap_kb >= 0 ? mul_size(swap_kb, 1024) : 0);
+
+ PG_RETURN_INT64(result);
+}
+
+/*
+ * resizable_shmem_pagesize() - Get the shared memory page size
+ */
+Datum
+resizable_shmem_pagesize(PG_FUNCTION_ARGS)
+{
+ PG_RETURN_INT32(pg_get_shmem_pagesize());
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.control b/src/test/modules/resizable_shmem/resizable_shmem.control
new file mode 100644
index 00000000000..8031303fe0e
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.control
@@ -0,0 +1,4 @@
+# resizable_shmem extension test module
+comment = 'test module for testing resizable shared memory structure functionality'
+default_version = '1.0'
+module_pathname = '$libdir/resizable_shmem'
diff --git a/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
new file mode 100644
index 00000000000..24bea91a401
--- /dev/null
+++ b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
@@ -0,0 +1,241 @@
+# Copyright (c) 2025-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Test resizable shared memory functionality, both when loaded at startup via
+# shared_preload_libraries and when loaded after startup (late allocation).
+
+# Verify that RssShmem does not exceed the total allocated shared memory.
+# Allocated shared memory should be mostly the memory allocated to the
+# resizable_shmem structure. Any large increase in expected RssShmem should
+# reflect the unexpected increase in memory allocated to the resizable_shmem
+# structure.
+sub check_shmem_usage
+{
+ my ($session, $label, $node) = @_;
+
+ my $rss_shmem = $session->query_safe('SELECT resizable_shmem_usage();',
+ verbose => 0);
+ my $total_alloc = $node->safe_psql('postgres',
+ "SELECT sum(allocated_size) FROM pg_shmem_allocations;");
+
+ note "$label: RssShmem=$rss_shmem, sum(allocated_size)=$total_alloc";
+ ok($rss_shmem <= $total_alloc, "$label: RssShmem does not exceed total allocated size");
+}
+
+# Test a resize operation: resize, verify old data, write new data, verify
+# new data, and check shmem usage. Returns updated ($num_entries, $value).
+sub test_resize
+{
+ my ($node, $prefix, $old_num_entries, $old_value, $new_num_entries, $new_value, $label) = @_;
+
+ $label = "$prefix: $label";
+
+ my $session1 = $node->background_psql('postgres');
+ my $session2 = $node->background_psql('postgres');
+
+ $session1->query_safe("SELECT resizable_shmem_resize($new_num_entries);",
+ verbose => 0);
+
+ # Old data should still be intact in the (possibly smaller) area
+ my $readable_entries = ($new_num_entries < $old_num_entries) ? $new_num_entries : $old_num_entries;
+ is($session1->query_safe("SELECT resizable_shmem_read($readable_entries, $old_value);",
+ verbose => 0),
+ 't', "old data readable after $label");
+
+ $session2->query_safe("SELECT resizable_shmem_write($new_value);",
+ verbose => 0);
+ is($session1->query_safe("SELECT resizable_shmem_read($new_num_entries, $new_value);",
+ verbose => 0),
+ 't', "new data readable after $label");
+
+ check_shmem_usage($session1, "$label (session 1)", $node);
+ check_shmem_usage($session2, "$label (session 2)", $node);
+
+ $session1->quit;
+ $session2->quit;
+
+ return ($new_num_entries, $new_value);
+}
+
+# Run the full suite of resizable shared memory tests on the given node.
+sub run_resizable_tests
+{
+ my ($node, $initial_entries, $max_entries, $prefix) = @_;
+
+ my $have_resizable_shmem = $node->safe_psql('postgres', 'SHOW have_resizable_shared_memory;') eq 'on';
+
+ my $num_entries = $initial_entries;
+
+ # Basic read/write should work on all platforms
+ my $value = 100;
+ $node->safe_psql('postgres', "SELECT resizable_shmem_write($value);");
+ is($node->safe_psql('postgres', "SELECT resizable_shmem_read($num_entries, $value);"),
+ 't', "$prefix: data read after write successful");
+
+ if ($have_resizable_shmem)
+ {
+ # Initial structure state
+ my $session1 = $node->background_psql('postgres');
+ my $session2 = $node->background_psql('postgres');
+
+ $value = 100;
+ # Write and read the initial set of entries.
+ $session1->query_safe("SELECT resizable_shmem_write($value);", verbose => 0);
+ is($session2->query_safe("SELECT resizable_shmem_read($num_entries, $value);",
+ verbose => 0),
+ 't', "$prefix: data read after write successful");
+ check_shmem_usage($session1, "$prefix: initial write (session 1)", $node);
+ check_shmem_usage($session2, "$prefix: initial write (session 2)", $node);
+ $session1->quit;
+ $session2->quit;
+
+ # Verify no other structure is resizable
+ is($node->safe_psql('postgres', "SELECT count(*) FROM pg_shmem_allocations WHERE name <> 'resizable_shmem' AND maximum_size <> minimum_size;"),
+ '0', "$prefix: no other resizable structures");
+
+ # Resize to maximum
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $max_entries, 500, 'resize to maximum');
+
+ # Shrink to 75% of max
+ my $shrink_entries = int($max_entries * 3 / 4);
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $shrink_entries, 999, 'shrinking');
+
+ # Resize to the same size (no-op)
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $num_entries, 1999, 'no-op resize');
+
+ # Test resize failure (attempt to resize beyond max - should fail)
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', "SELECT resizable_shmem_resize(" . ($max_entries * 2) . ");");
+ ok($ret != 0 || $stderr =~ /ERROR/, "$prefix: Resize beyond maximum fails");
+ }
+ else
+ {
+ # On unsupported platforms, resizing should fail with a clear error
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', "SELECT resizable_shmem_resize($num_entries);");
+ ok($ret != 0, "$prefix: resize fails on unsupported platform");
+ like($stderr, qr/not supported/, "$prefix: resize error mentions not supported");
+ }
+}
+
+### Set up a test node.
+#
+# Configure minimal shared memory so that the resizable_shmem structure dominates
+# and any unexpected increase is easy to detect.
+#
+# Also disable huge pages so that RssShmem and allocated_size are comparable.
+# The latter is already aligned to the default page size.
+###
+my $node = PostgreSQL::Test::Cluster->new('resizable_shmem');
+$node->init;
+
+$node->append_conf('postgresql.conf', 'huge_pages = off');
+$node->append_conf('postgresql.conf', 'shared_buffers = 128kB');
+$node->append_conf('postgresql.conf', 'max_connections = 5');
+$node->append_conf('postgresql.conf', 'max_worker_processes = 0');
+$node->append_conf('postgresql.conf', 'max_wal_senders = 0');
+$node->append_conf('postgresql.conf', 'max_prepared_transactions = 0');
+$node->append_conf('postgresql.conf', 'max_locks_per_transaction = 10');
+$node->append_conf('postgresql.conf', 'max_pred_locks_per_transaction = 10');
+$node->append_conf('postgresql.conf', 'wal_buffers = 32kB');
+
+###
+# Test 1: Startup allocation via shared_preload_libraries
+###
+my $startup_initial = 25 * 1024 * 1024;
+my $startup_max = 100 * 1024 * 1024;
+
+$node->append_conf('postgresql.conf', 'shared_preload_libraries = resizable_shmem');
+$node->append_conf('postgresql.conf', "resizable_shmem.initial_entries = $startup_initial");
+$node->append_conf('postgresql.conf', "resizable_shmem.max_entries = $startup_max");
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+run_resizable_tests($node, $startup_initial, $startup_max, 'startup');
+
+###
+# Test 2: Late allocation (loaded after startup, not in shared_preload_libraries).
+# Use much smaller sizes since only ~100KB of shared memory is available for
+# structures allocated after startup.
+###
+my $late_initial = 5 * 1024;
+my $late_max = 12 * 1024;
+
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM RESET shared_preload_libraries;
+ ALTER SYSTEM SET resizable_shmem.initial_entries = $late_initial;
+ ALTER SYSTEM SET resizable_shmem.max_entries = $late_max;
+});
+$node->safe_psql('postgres', 'DROP EXTENSION resizable_shmem;');
+$node->restart;
+
+$node->safe_psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+run_resizable_tests($node, $late_initial, $late_max, 'late');
+
+###
+# Test that sysv shared memory does not support resizable shmem. Only relevant on
+# platforms that support resizable shmem (HAVE_RESIZABLE_SHMEM), since the
+# module only sets maximum_size in that case.
+###
+my $resizable_shmem_binary = $node->safe_psql('postgres', 'SHOW have_resizable_shared_memory;') eq 'on';
+if ($resizable_shmem_binary)
+{
+ ###
+ # Test 3: Verify that CREATE EXTENSION fails with sysv shared memory
+ # when loaded after startup (not in shared_preload_libraries).
+ ###
+ $node->safe_psql('postgres', 'DROP EXTENSION resizable_shmem;');
+
+ # Remove settings that would cause the library to auto-load at startup:
+ # shared_preload_libraries and module-prefixed GUCs. ALTER SYSTEM RESET
+ # only affects postgresql.auto.conf, so we must use adjust_conf to remove
+ # from postgresql.conf.
+ $node->adjust_conf('postgresql.conf', 'shared_preload_libraries', undef);
+ $node->adjust_conf('postgresql.conf', 'resizable_shmem.initial_entries', undef);
+ $node->adjust_conf('postgresql.conf', 'resizable_shmem.max_entries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'shared_preload_libraries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'resizable_shmem.initial_entries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'resizable_shmem.max_entries', undef);
+ $node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET shared_memory_type = 'sysv';
+ });
+
+ $node->restart;
+
+ is($node->safe_psql('postgres', 'SHOW have_resizable_shared_memory'), 'off',
+ 'have_resizable_shared_memory is off with sysv');
+
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+ ok($ret != 0, 'CREATE EXTENSION fails with resizable shmem on sysv');
+ like($stderr, qr/resizable shared memory requires shared_memory_type = mmap/,
+ 'CREATE EXTENSION error mentions shared_memory_type = mmap requirement');
+
+ ###
+ # Test 4: Verify that resizable structures are also rejected with sysv
+ # shared memory when loaded at startup via shared_preload_libraries.
+ ###
+ $node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET shared_preload_libraries = 'resizable_shmem';
+ ALTER SYSTEM SET resizable_shmem.initial_entries = $startup_initial;
+ ALTER SYSTEM SET resizable_shmem.max_entries = $startup_max;
+ });
+ $node->stop;
+
+ ok(!$node->start(fail_ok => 1),
+ 'server fails to start with resizable shmem on sysv');
+
+ my $log = slurp_file($node->logfile);
+ like($log, qr/resizable shared memory requires shared_memory_type = mmap/,
+ 'log mentions shared_memory_type = mmap requirement');
+}
+
+done_testing();
diff --git a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
index c154f57682a..92a8f3b4873 100644
--- a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
+++ b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
@@ -45,5 +45,36 @@ else
ok($attach_count1 == 0 && $attach_count2 == 0, "attach callback is not called when loaded via shared_preload_libraries");
}
+###
+# Test that a fixed-size shared memory structure cannot be resized.
+# Only relevant on platforms that support resizable shmem.
+###
+my $have_resizable_shmem =
+ $node->safe_psql('postgres', 'SHOW have_resizable_shared_memory;') eq 'on';
+
+if ($have_resizable_shmem)
+{
+ # Try expanding the fixed-size structure
+ my ($ret, $stdout, $stderr) =
+ $node->psql("postgres", "SELECT test_shmem_resize_fixed(1000);");
+ isnt($ret, 0, "expanding a fixed-size structure fails");
+ like($stderr, qr/is not resizable/, "expand error message mentions not resizable");
+
+ # Try shrinking the fixed-size structure
+ ($ret, $stdout, $stderr) =
+ $node->psql("postgres", "SELECT test_shmem_resize_fixed(1);");
+ isnt($ret, 0, "shrinking a fixed-size structure fails");
+ like($stderr, qr/is not resizable/, "shrink error message mentions not resizable");
+}
+
+###
+# Test that minimum_size and maximum_size equal size for a fixed-size structure
+# in pg_shmem_allocations.
+###
+is($node->safe_psql('postgres',
+ "SELECT minimum_size = size AND maximum_size = size FROM pg_shmem_allocations WHERE name = 'test_shmem area';"),
+ 't', "fixed-size structure has minimum_size = maximum_size = size");
+
$node->stop;
+
done_testing();
diff --git a/src/test/modules/test_shmem/test_shmem--1.0.sql b/src/test/modules/test_shmem/test_shmem--1.0.sql
index 2d01fd9256c..e169d0d7733 100644
--- a/src/test/modules/test_shmem/test_shmem--1.0.sql
+++ b/src/test/modules/test_shmem/test_shmem--1.0.sql
@@ -7,3 +7,7 @@
CREATE FUNCTION get_test_shmem_attach_count()
RETURNS pg_catalog.int4 STRICT
AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION test_shmem_resize_fixed(pg_catalog.int4)
+RETURNS pg_catalog.void STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
index 9bd4012b435..0dd469891ee 100644
--- a/src/test/modules/test_shmem/test_shmem.c
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -99,3 +99,17 @@ get_test_shmem_attach_count(PG_FUNCTION_ARGS)
elog(ERROR, "shmem area not yet initialized");
PG_RETURN_INT32(TestShmem->attach_count);
}
+
+/*
+ * Attempt to resize the fixed-size shared memory structure. This should
+ * fail because the structure was not allocated with a maximum_size.
+ */
+PG_FUNCTION_INFO_V1(test_shmem_resize_fixed);
+Datum
+test_shmem_resize_fixed(PG_FUNCTION_ARGS)
+{
+ int32 new_size = PG_GETARG_INT32(0);
+
+ ShmemResizeStruct("test_shmem area", new_size);
+ PG_RETURN_VOID();
+}
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index a65a5bf0c4f..a882d799133 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1770,8 +1770,11 @@ pg_shadow| SELECT pg_authid.rolname AS usename,
pg_shmem_allocations| SELECT name,
off,
size,
- allocated_size
- FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size);
+ allocated_size,
+ minimum_size,
+ maximum_size,
+ reserved_space
+ FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size, minimum_size, maximum_size, reserved_space);
pg_shmem_allocations_numa| SELECT name,
numa_node,
size
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9e6a39f5608..c5a84a2ae17 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3153,6 +3153,7 @@ TestDSMRegistryHashEntry
TestDSMRegistryStruct
TestDecodingData
TestDecodingTxnData
+TestResizableShmemStruct
TestShmemData
TestSpec
TestValueType
base-commit: b6ccd30d8ff6422ad0f79ce2fc801f2437d90664
--
2.34.1
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-07 12:24 Dagfinn Ilmari Mannsåker <[email protected]>
parent: Heikki Linnakangas <[email protected]>
2 siblings, 1 reply; 75+ messages in thread
From: Dagfinn Ilmari Mannsåker @ 2026-04-07 12:24 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
Heikki Linnakangas <[email protected]> writes:
> Those are now committed, and here's a new version rebased over those
> changes.
I noticed this bit during my habitual morning skim of new commits:
> diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
> index c06b0e9b800..9981d6e212f 100644
> --- a/src/backend/utils/misc/injection_point.c
> +++ b/src/backend/utils/misc/injection_point.c
> @@ -17,6 +17,7 @@
> */
> #include "postgres.h"
>
> +#include "storage/subsystems.h"
> #include "utils/injection_point.h"
>
> #ifdef USE_INJECTION_POINTS
> @@ -109,6 +110,11 @@ typedef struct InjectionPointCacheEntry
>
> static HTAB *InjectionPointCache = NULL;
>
> +#ifdef USE_INJECTION_POINTS
> +static void InjectionPointShmemRequest(void *arg);
> +static void InjectionPointShmemInit(void *arg);
> +#endif
> +
This is already inside an `#ifdef USE_INJECTION_POINTS` guard (in fact
visible at the end of the previous diff hunk), no need for another one.
- ilmari
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-07 13:26 Heikki Linnakangas <[email protected]>
parent: Dagfinn Ilmari Mannsåker <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-07 13:26 UTC (permalink / raw)
To: Dagfinn Ilmari Mannsåker <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 07/04/2026 15:24, Dagfinn Ilmari Mannsåker wrote:
> Heikki Linnakangas <[email protected]> writes:
>
>> Those are now committed, and here's a new version rebased over those
>> changes.
>
> I noticed this bit during my habitual morning skim of new commits:
>
>> diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
>> index c06b0e9b800..9981d6e212f 100644
>> --- a/src/backend/utils/misc/injection_point.c
>> +++ b/src/backend/utils/misc/injection_point.c
>> @@ -17,6 +17,7 @@
>> */
>> #include "postgres.h"
>>
>> +#include "storage/subsystems.h"
>> #include "utils/injection_point.h"
>>
>> #ifdef USE_INJECTION_POINTS
>> @@ -109,6 +110,11 @@ typedef struct InjectionPointCacheEntry
>>
>> static HTAB *InjectionPointCache = NULL;
>>
>> +#ifdef USE_INJECTION_POINTS
>> +static void InjectionPointShmemRequest(void *arg);
>> +static void InjectionPointShmemInit(void *arg);
>> +#endif
>> +
>
> This is already inside an `#ifdef USE_INJECTION_POINTS` guard (in fact
> visible at the end of the previous diff hunk), no need for another one.
Fixed, thanks. I also noticed that the #include "storage/subsystems.h"
can be moved inside the #ifdef block; fixed that too.
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-07 14:19 Ashutosh Bapat <[email protected]>
parent: Heikki Linnakangas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-07 14:19 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Dagfinn Ilmari Mannsåker <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
Hi Heikki,
CallShmemCallbacksAfterStartup() holds ShmemIndexLock while invoking
init_fn/attach_fn callbacks. That looks wrong. Before this commit, init
or attach code was not run with the lock held. Is there any reason to
hold the lock while calling the init and attach callbacks? Since these
functions can come from extensions, we have no control over what they
do, so holding the lock across them looks problematic. Further, it
serializes all attach_fn executions across backends, since each one
runs under the lock. In my case, the init_fn performed a ShmemIndex
lookup, which deadlocked. It's questionable whether an init function
should look up ShmemIndex, but it's not something that needs to be
prohibited either.
Here's a patch fixing it.
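To illustrate the ordering problem outside PostgreSQL, here is a minimal
sketch. It assumes a plain pthread error-checking mutex as a stand-in
for ShmemIndexLock (it is not an LWLock), and every name in it
(index_lock, lookup_callback, call_with_lock_held, call_after_release)
is hypothetical. The error-checking mutex type is used only so that the
self-deadlock shows up as EDEADLK instead of a hang.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for ShmemIndexLock; not PostgreSQL's LWLock. */
static pthread_mutex_t index_lock;

typedef int (*init_callback) (void *arg);

static void
index_lock_init(void)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	/* Report a relock by the owning thread as EDEADLK instead of hanging. */
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&index_lock, &attr);
	pthread_mutexattr_destroy(&attr);
}

/* A callback that does its own index lookup, and so takes the lock. */
static int
lookup_callback(void *arg)
{
	int			rc = pthread_mutex_lock(&index_lock);

	(void) arg;
	if (rc != 0)
		return rc;				/* EDEADLK if the caller still holds it */
	pthread_mutex_unlock(&index_lock);
	return 0;
}

/* Problematic ordering: the callback runs while the lock is held. */
static int
call_with_lock_held(init_callback fn)
{
	int			rc;

	pthread_mutex_lock(&index_lock);
	rc = fn(NULL);
	pthread_mutex_unlock(&index_lock);
	return rc;
}

/* Ordering after the fix: finish the locked bookkeeping, release, then call. */
static int
call_after_release(init_callback fn)
{
	pthread_mutex_lock(&index_lock);
	/* ... bookkeeping that really needs the lock would go here ... */
	pthread_mutex_unlock(&index_lock);
	return fn(NULL);
}
```

After index_lock_init(), call_with_lock_held(lookup_callback) reports
EDEADLK, while call_after_release(lookup_callback) succeeds. With a
non-error-checking lock the first case would simply hang, which matches
the hang described above.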
--
Best Wishes,
Ashutosh Bapat
On Tue, Apr 7, 2026 at 6:56 PM Heikki Linnakangas <[email protected]> wrote:
>
> On 07/04/2026 15:24, Dagfinn Ilmari Mannsåker wrote:
> > Heikki Linnakangas <[email protected]> writes:
> >
> >> Those are now committed, and here's a new version rebased over those
> >> changes.
> >
> > I noticed this bit during my habitual morning skim of new commits:
> >
> >> diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
> >> index c06b0e9b800..9981d6e212f 100644
> >> --- a/src/backend/utils/misc/injection_point.c
> >> +++ b/src/backend/utils/misc/injection_point.c
> >> @@ -17,6 +17,7 @@
> >> */
> >> #include "postgres.h"
> >>
> >> +#include "storage/subsystems.h"
> >> #include "utils/injection_point.h"
> >>
> >> #ifdef USE_INJECTION_POINTS
> >> @@ -109,6 +110,11 @@ typedef struct InjectionPointCacheEntry
> >>
> >> static HTAB *InjectionPointCache = NULL;
> >>
> >> +#ifdef USE_INJECTION_POINTS
> >> +static void InjectionPointShmemRequest(void *arg);
> >> +static void InjectionPointShmemInit(void *arg);
> >> +#endif
> >> +
> >
> > This is already inside an `#ifdef USE_INJECTION_POINTS` guard (in fact
> > visible at the end of the previous diff hunk), no need for another one.
>
> Fixed, thanks. I also noticed that the #include "storage/subsystems.h"
> can be moved inside the #ifdef block; fixed that too.
>
> - Heikki
>
Attachments:
[text/x-patch] v20260407-0001-Unlock-ShmemIndexLock-before-calling-init_.patch (1.7K, 2-v20260407-0001-Unlock-ShmemIndexLock-before-calling-init_.patch)
download | inline diff:
From f08bf94b4f18c7aea9c6e7d3f321c153da7a76d8 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Tue, 7 Apr 2026 19:24:15 +0530
Subject: [PATCH v20260407] Unlock ShmemIndexLock before calling init_fn
callback
CallShmemCallbacksAfterStartup() calls init_fn or attach_fn callbacks
while holding ShmemIndexLock. Those callbacks do not require the lock to
be held. Before d4885af3d65325c1fcd319e98c634fde9a200443, code
initializing the shared structures executed without holding
ShmemIndexLock.
Moreover, those callbacks may perform long-running operations such as
disk access, e.g. pgss_shmem_init. Holding a lightweight lock that long
should be avoided, if possible. It delays loading an extension in all
the backends, since all attach_fns are serialized.
Author: Ashutosh Bapat <[email protected]>
---
src/backend/storage/ipc/shmem.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 1ebffe5a32a..66b95713020 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -956,6 +956,8 @@ CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks)
list_free_deep(pending_shmem_requests);
pending_shmem_requests = NIL;
+ LWLockRelease(ShmemIndexLock);
+
/* Finish by calling the appropriate subsystem-specific callback */
if (found_any)
{
@@ -968,7 +970,6 @@ CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks)
callbacks->init_fn(callbacks->opaque_arg);
}
- LWLockRelease(ShmemIndexLock);
shmem_request_state = SRS_DONE;
}
base-commit: 29e7dbf5e4daa8fafc2b18a1551e7b31c8847340
--
2.34.1
^ permalink raw reply [nested|flat] 75+ messages in thread
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-07 14:46 Ashutosh Bapat <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 2 replies; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-07 14:46 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Tue, Apr 7, 2026 at 3:36 PM Ashutosh Bapat
<[email protected]> wrote:
>
> On Mon, Apr 6, 2026 at 7:23 PM Ashutosh Bapat
> <[email protected]> wrote:
> >
> > I have kept these two patches separate from the main patch so that I
> > can remove them if others feel they are not worth including in the
> > feature.
>
> Here are patches rebased on the latest HEAD. No conflicts just rebase.
>
> Here are differences from the previous patchset.
>
> o. There are two patches in this patchset now. a. 0001 which supports
> resizable shared memory and is equivalent to 0001 + 0002 + 0004 + 0005
> from the previous patchset. b. 0002 which is 0006 from the previous
> patchset and adds support for protecting resizable shared memory
> structures. 0003, which added diagnostics to investigate CFBot
> failure, from the previous patchset is not required anymore since all
> tests pass with CFBot.
>
> o. I have merged 0002 into 0001 from the previous patchset since with
> that patch all platforms are green on CFBot. The resizable shared
> memory test now uses /proc/self/smaps instead of /proc/self/status to
> find the amount of memory allocated in the main shared memory segment
> of PostgreSQL.
>
> o. Merged 0004, which supported minimum_size, into 0001. Minimum_size
> would be useful to protect against accidental shrinkage of the
> resizable structures. It will help add support for minimum sizes of
> GUCs like shared_buffers. It also makes it easy and intuitive to
> distinguish between fixed-size and resizable structures, and will be
> useful to find the minimum size of the shared memory segment.
>
> o. Merged 0005, which allows ABI compatibility between the binaries
> which support resizable shared memory and those which don't, into
> 0001. Apart from ABI compatibility, the code has fewer #ifdef blocks
> and is thus easier to read and maintain.
>
> I didn't find it useful to keep 0004 and 0005 separate since they were
> interdependent, which made review complicated, and both have a high
> chance of being accepted.
>
> o. 0006 is still separate since I am not sure whether the
> functionality is absolutely needed at this time. In an offlist
> discussion, Andres mentioned that it is not strictly needed. The
> subsystem that uses the resizable shared memory can implement its own
> protection if required and integrate it into the subsystem-specific
> synchronization. But Matthias thinks differently. The API to add
> protection is platform dependent, so it's better to abstract it via
> shmem.c. If we decide to accept this patch, we should merge it into
> 0001 before committing.
>
> Also did some more cleanups and changed the name of the GUC
> have_resizable_shmem to have_resizable_shared_memory since shmem is an
> internal phrase.
>
> I am looking at merging the resizable_shmem module into test_shmem module next.
Here are patches with the test modules merged.
The merged module looks a bit rough to me and so does 0006. For
example, I am not sure whether calling ShmemProtectStruct() from
init_fn is a good idea. See [1] for example. But init_fn is the last
chance for the subsystem to touch and set up the resizable structure
before it's opened to the wild. So, in the current infrastructure, I
don't see any better place to call ShmemProtectStruct() either. If you
run tests after applying patch 0006, you will need to apply the patch
attached to [1] as well; otherwise the test will hang.
[1] https://www.postgresql.org/message-id/[email protected]...
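For readers who want to see the protection arithmetic in isolation,
here is a rough standalone sketch of the range computation and the two
mprotect() calls that the attached 0002 patch performs. It is not the
patch code: the names ProtectRange, compute_protect_range and
apply_protect_range are hypothetical, and a plain anonymous mmap()
region is assumed in place of PostgreSQL's shared memory segment.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper type: page-aligned boundaries of the two regions. */
typedef struct ProtectRange
{
	uintptr_t	rw_start;		/* start of readable/writable region */
	uintptr_t	rw_end;			/* end of rw region, start of PROT_NONE */
	uintptr_t	prot_end;		/* end of the inaccessible region */
} ProtectRange;

static uintptr_t
align_down(uintptr_t value, uintptr_t page)
{
	return value - (value % page);
}

static uintptr_t
align_up(uintptr_t value, uintptr_t page)
{
	return align_down(value + page - 1, page);
}

/*
 * Round the used part outwards so that [location, location + size) stays
 * accessible, and round the protected end downwards so that memory after
 * maximum_size, which may belong to another structure, is left alone.
 */
static ProtectRange
compute_protect_range(uintptr_t location, size_t size,
					  size_t maximum_size, size_t page)
{
	ProtectRange r;

	r.rw_start = align_down(location, page);
	r.rw_end = align_up(location + size, page);
	r.prot_end = align_down(location + maximum_size, page);
	return r;
}

/* Returns 0 on success, -1 if either mprotect() call fails. */
static int
apply_protect_range(ProtectRange r)
{
	if (r.rw_end > r.rw_start &&
		mprotect((void *) r.rw_start, r.rw_end - r.rw_start,
				 PROT_READ | PROT_WRITE) != 0)
		return -1;
	if (r.prot_end > r.rw_end &&
		mprotect((void *) r.rw_end, r.prot_end - r.rw_end,
				 PROT_NONE) != 0)
		return -1;
	return 0;
}
```

Because both boundaries are rounded in the permissive direction, up to
one page on either side of the unused area can remain accessible. The
protection is also per process, which is why every backend that may
touch the structure has to apply it itself.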
--
Best Wishes,
Ashutosh Bapat
Attachments:
[text/x-patch] v20260407_2-0002-Add-support-to-protect-unused-resizable_sh.patch (10.7K, 2-v20260407_2-0002-Add-support-to-protect-unused-resizable_sh.patch)
download | inline diff:
From b4fd2ea31b670b9beac51b275cabc91e465eebe7 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Mon, 6 Apr 2026 19:00:14 +0530
Subject: [PATCH v20260407 2/3] Add support to protect unused resizable_shmem
structure
Add APIs to make the portion of resizable_shmem structure beyond its current
size inaccessible.
Author: Ashutosh Bapat <[email protected]>
Suggested-by: Matthias van de Meent <[email protected]>
---
doc/src/sgml/xfunc.sgml | 4 +-
src/backend/port/sysv_shmem.c | 55 ++++++++++++++++++++++
src/backend/port/win32_shmem.c | 8 ++++
src/backend/storage/ipc/shmem.c | 60 ++++++++++++++++++++++++
src/include/storage/pg_shmem.h | 1 +
src/include/storage/shmem.h | 1 +
src/test/modules/test_shmem/test_shmem.c | 43 +++++++++++++++++
7 files changed, 171 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 22f953db9d7..04b312fcf94 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3780,7 +3780,9 @@ my_shmem_init(void *arg)
additional synchronization between the resizing process and the processes
using the shared structure. Also it needs to implement additional protection
to prevent access to the part of the address space beyond the size of the
- structure when resizing it.
+ structure when resizing it. <function>ShmemProtectStruct</function> can be
+ called from every backend that may access the resizable structure for
+ this purpose.
</para>
<para>
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index bb2a81417c6..14b6fa7f7e6 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -1065,8 +1065,63 @@ PGSharedMemoryEnsureAllocated(void *addr, Size size)
Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
Assert(size == TYPEALIGN(GetOSPageSize(), size));
+ /*
+ * Ensure that MADV_POPULATE_WRITE can initialize the newly allocated
+ * pages.
+ */
+ if (mprotect(addr, size, PROT_READ | PROT_WRITE) != 0)
+ ereport(ERROR,
+ (errmsg("could not protect shared memory: %m")));
+
if (madvise(addr, size, MADV_POPULATE_WRITE) == -1)
ereport(ERROR,
(errmsg("could not allocate shared memory: %m")));
#endif
}
+
+/*
+ * Set memory protection on the given region of shared memory.
+ *
+ * Makes [rw_start, rw_end) readable and writable, and [rw_end, prot_end)
+ * inaccessible.
+ *
+ * All addresses are expected to be page aligned.
+ *
+ * Only supported on platforms that support resizable shared memory.
+ */
+void
+PGSharedMemoryProtect(void *rw_start, void *rw_end, void *prot_end)
+{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
+
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only anonymous shared memory can be protected at runtime")));
+
+ Assert(rw_start == (void *) TYPEALIGN(GetOSPageSize(), rw_start));
+ Assert(rw_end == (void *) TYPEALIGN(GetOSPageSize(), rw_end));
+ Assert(prot_end == (void *) TYPEALIGN(GetOSPageSize(), prot_end));
+ Assert(rw_end >= rw_start);
+
+ if (rw_end > rw_start)
+ {
+ if (mprotect(rw_start, (char *) rw_end - (char *) rw_start,
+ PROT_READ | PROT_WRITE) != 0)
+ ereport(ERROR,
+ (errmsg("could not protect shared memory: %m")));
+ }
+
+ if (prot_end > rw_end)
+ {
+ if (mprotect(rw_end, (char *) prot_end - (char *) rw_end,
+ PROT_NONE) != 0)
+ ereport(ERROR,
+ (errmsg("could not protect shared memory: %m")));
+ }
+#endif
+}
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index c1f30665e66..b5396e4a5e8 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -693,3 +693,11 @@ PGSharedMemoryEnsureAllocated(void *addr, Size size)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("resizable shared memory is not supported on this platform")));
}
+
+void
+PGSharedMemoryProtect(void *rw_start, void *rw_end, void *prot_end)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+}
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 8f006967790..61808c7a8e5 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -875,6 +875,66 @@ ShmemResizeStruct(const char *name, Size new_size)
#endif
}
+/*
+ * ShmemProtectStruct() --- protect the unused portion of a resizable structure.
+ *
+ * Makes the region beyond the current size up to maximum_size inaccessible, and
+ * ensures the region up to the current size is readable and writable. Depending
+ * upon the platform, the protection honours the page boundaries. So it may be
+ * more permissive than strictly needed.
+ *
+ * Only works for resizable structures. Should be called in every backend that
+ * may access the resizable structure while resizing it.
+ */
+void
+ShmemProtectStruct(const char *name)
+{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
+ ShmemIndexEnt *result;
+ bool found;
+ Size page_size = GetOSPageSize();
+ char *rw_start;
+ char *rw_end;
+ char *prot_end;
+
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+ result = (ShmemIndexEnt *) hash_search(ShmemIndex, name, HASH_FIND, &found);
+ if (!found)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shmem struct \"%s\" is not initialized", name)));
+
+ if (result->minimum_size == result->maximum_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shared memory struct \"%s\" is not resizable", name)));
+
+ /* Resizable structures are only supported with mmap-based shared memory. */
+ Assert(shared_memory_type == SHMEM_TYPE_MMAP);
+
+ /* Make at least [location, location+size) readable and writable */
+ rw_start = (char *) TYPEALIGN_DOWN(page_size, result->location);
+ rw_end = (char *) TYPEALIGN(page_size,
+ (char *) result->location + result->size);
+
+ /*
+ * Make remaining portion inaccessible while making sure that the portion
+ * after maximum_size is not affected since it may be used by other
+ * structures.
+ */
+ prot_end = (char *) TYPEALIGN_DOWN(page_size,
+ (char *) result->location + result->maximum_size);
+
+ LWLockRelease(ShmemIndexLock);
+
+ PGSharedMemoryProtect(rw_start, rw_end, prot_end);
+#endif
+}
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index f0efbf2aec1..5165b815cc1 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -91,6 +91,7 @@ extern bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2);
extern void PGSharedMemoryDetach(void);
extern void PGSharedMemoryEnsureFreed(void *addr, Size size);
extern void PGSharedMemoryEnsureAllocated(void *addr, Size size);
+extern void PGSharedMemoryProtect(void *rw_start, void *rw_end, void *prot_end);
extern void GetHugePageSize(Size *hugepagesize, int *mmap_flags);
extern Size GetOSPageSize(void);
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 0e6d5a63f28..f8ddb0dd7c0 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -185,6 +185,7 @@ typedef struct ShmemCallbacks
extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
extern bool ShmemAddrIsValid(const void *addr);
extern void ShmemResizeStruct(const char *name, Size new_size);
+extern void ShmemProtectStruct(const char *name);
/*
* These macros provide syntactic sugar for calling the underlying functions
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
index 72004df4083..e95225bf40a 100644
--- a/src/test/modules/test_shmem/test_shmem.c
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -133,6 +133,12 @@ test_shmem_init(void *arg)
/* Resizable structure should have been already allocated. Initialize it. */
Assert(ResizableShmem != NULL);
+
+#ifdef HAVE_RESIZABLE_SHMEM
+ /* Protect the shared memory structure in this backend. */
+ ShmemProtectStruct("resizable_shmem");
+#endif
+
ResizableShmem->num_entries = initial_entries;
memset(ResizableShmem->data, 0, mul_size(initial_entries, TEST_ENTRY_SIZE));
}
@@ -148,6 +154,16 @@ test_shmem_attach(void *arg)
if (attached_or_initialized)
elog(ERROR, "attach or initialize already called in this process");
attached_or_initialized = true;
+
+ /*
+ * Shared memory structure should have been already allocated and
+ * initialized by the time a backend attaches to it.
+ */
+ Assert(ResizableShmem != NULL);
+
+#ifdef HAVE_RESIZABLE_SHMEM
+ ShmemProtectStruct("resizable_shmem");
+#endif
}
void
@@ -259,6 +275,7 @@ resizable_shmem_resize(PG_FUNCTION_ARGS)
new_size = add_size(offsetof(TestResizableData, data),
mul_size(new_entries, TEST_ENTRY_SIZE));
ShmemResizeStruct("resizable_shmem", new_size);
+ ShmemProtectStruct("resizable_shmem");
ResizableShmem->num_entries = new_entries;
PG_RETURN_VOID();
@@ -280,6 +297,19 @@ resizable_shmem_write(PG_FUNCTION_ARGS)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("resizable_shmem is not initialized")));
+#ifdef HAVE_RESIZABLE_SHMEM
+
+ /*
+ * Ideally the structure should be protected through a synchronization
+ * cycle across all the backends that may access the structure. But we
+ * don't implement any such synchronization in this test module to keep it
+ * simple. Given that ProcSignalBarrier mechanism is not extensible, we
+ * may not be able to do that as well here. Hence add protect just before
+ * accessing the structure.
+ */
+ ShmemProtectStruct("resizable_shmem");
+#endif
+
/* Write the value to all current entries */
for (i = 0; i < ResizableShmem->num_entries; i++)
ResizableShmem->data[i] = entry_value;
@@ -309,6 +339,19 @@ resizable_shmem_read(PG_FUNCTION_ARGS)
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("entry_count %d is out of range (0..%d)", entry_count, ResizableShmem->num_entries)));
+#ifdef HAVE_RESIZABLE_SHMEM
+
+ /*
+ * Ideally the structure should be protected through a synchronization
+ * cycle across all the backends that may access the structure. But we
+ * don't implement any such synchronization in this test module to keep it
+ * simple. Given that ProcSignalBarrier mechanism is not extensible, we
+ * may not be able to do that as well here. Hence add protect just before
+ * accessing the structure.
+ */
+ ShmemProtectStruct("resizable_shmem");
+#endif
+
for (i = 0; i < entry_count; i++)
{
if (ResizableShmem->data[i] != entry_value)
--
2.34.1
[text/x-patch] v20260407_2-0001-resizable-shared-memory-structures.patch (68.3K, 3-v20260407_2-0001-resizable-shared-memory-structures.patch)
download | inline diff:
From 5cb135abc8f90dc7463d7a467729849fe19bf1dd Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Tue, 17 Feb 2026 16:51:20 +0530
Subject: [PATCH v20260407 1/3] resizable shared memory structures
Resizable shared memory structures can be allocated by specifying a new
member ShmemStructOpts::maximum_size. At the startup or when the
structure is created, we reserve address space worth maximum_size in the
shared memory segment. It is expected that the subsystem which creates
the structure would initialize only the initial size worth of memory
when creating it. In an mmap'ed memory, this should allocate memory
worth the initial size. It should not allocate maximum_size worth of
memory initially. As the structure is resized using ShmemResizeStruct()
memory is freed or allocated in chunks of memory pages when shrinking
and expanding the structure respectively.
The resizable shared memory feature depends upon the existence of the
function madvise() and the constants MADV_REMOVE and MADV_POPULATE_WRITE.
On platforms which do not have these, we disable this feature at
compile time. The commit introduces a compile time flag
HAVE_RESIZABLE_SHMEM which is defined if MADV_REMOVE and
MADV_POPULATE_WRITE exist. We don't check existence of madvise
separately, since existence of the constants implies existence of the
function.
HAVE_RESIZABLE_SHMEM is not defined in EXEC_BACKEND builds since that's
largely used for Windows where the APIs to free and allocate memory from
and to a given address space are not known to the author right now.
Given that PostgreSQL is used widely on Linux, providing this feature on
Linux benefits most of its users. Once we figure out the required
Windows APIs, we will support this feature on Windows as well.
The feature is also not available when Sys-V shared memory is used even
on Linux since we do not know whether required Sys-V APIs exist; mostly
they don't. Since that combination is only available for development and
testing, not supporting the feature there isn't going to impact
PostgreSQL users.
Using HAVE_RESIZABLE_SHMEM we disable compiling the code related to
resizable shared memory structures on the platforms which do not support
the feature. But we also have run time checks to disable this feature
when Sys-V shared memory is used. In order to know whether a given
running server instance supports resizable structures, we have
introduced the GUC have_resizable_shared_memory.
Author: Ashutosh Bapat <[email protected]>
Reviewed-by: Matthias van de Meent <[email protected]>
---
configure.ac | 4 +
doc/src/sgml/config.sgml | 15 +
doc/src/sgml/system-views.sgml | 42 ++-
doc/src/sgml/xfunc.sgml | 54 +++
meson.build | 16 +
src/backend/port/sysv_shmem.c | 79 +++++
src/backend/port/win32_shmem.c | 45 +++
src/backend/storage/ipc/ipci.c | 11 +
src/backend/storage/ipc/shmem.c | 318 ++++++++++++++++--
src/backend/utils/misc/guc_parameters.dat | 7 +
src/backend/utils/misc/guc_tables.c | 7 +
src/include/catalog/pg_proc.dat | 4 +-
src/include/pg_config.h.in | 8 +
src/include/pg_config_manual.h | 9 +
src/include/storage/pg_shmem.h | 3 +
src/include/storage/shmem.h | 17 +
src/test/modules/test_shmem/meson.build | 1 +
.../test_shmem/t/001_late_shmem_alloc.pl | 31 ++
.../test_shmem/t/002_resizable_shmem.pl | 240 +++++++++++++
.../modules/test_shmem/test_shmem--1.0.sql | 34 ++
src/test/modules/test_shmem/test_shmem.c | 310 ++++++++++++++++-
src/test/regress/expected/rules.out | 7 +-
src/tools/pgindent/typedefs.list | 3 +-
23 files changed, 1213 insertions(+), 52 deletions(-)
create mode 100644 src/test/modules/test_shmem/t/002_resizable_shmem.pl
diff --git a/configure.ac b/configure.ac
index 8d176bd3468..99fcdab04e0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1913,6 +1913,10 @@ AC_CHECK_DECLS([memset_s], [], [], [#define __STDC_WANT_LIB_EXT1__ 1
# This is probably only present on macOS, but may as well check always
AC_CHECK_DECLS(F_FULLFSYNC, [], [], [#include <fcntl.h>])
+# Linux-specific madvise constants needed for resizable shared memory. See similar checks in meson.build for explanation of why these checks are here.
+AC_CHECK_DECLS([MADV_POPULATE_WRITE], [], [], [#include <sys/mman.h>])
+AC_CHECK_DECLS([MADV_REMOVE], [], [], [#include <sys/mman.h>])
+
AC_REPLACE_FUNCS(m4_normalize([
explicit_bzero
getopt
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3324d2d3c49..9f630b1a074 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -12138,6 +12138,21 @@ dynamic_library_path = '/usr/local/lib/postgresql:$libdir'
</listitem>
</varlistentry>
+ <varlistentry id="guc-have-resizable-shared-memory" xreflabel="have_resizable_shared_memory">
+ <term><varname>have_resizable_shared_memory</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>have_resizable_shared_memory</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Reports whether <productname>PostgreSQL</productname> has been built
+ with <literal>HAVE_RESIZABLE_SHMEM</literal> enabled and supports
+ <link linkend="xfunc-shared-addin-resizable">Resizable shared memory structures</link>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-huge-pages-status" xreflabel="huge_pages_status">
<term><varname>huge_pages_status</varname> (<type>enum</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 2ebec6928d5..9bbbfdb37c5 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4243,8 +4243,46 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
Size of the allocation in bytes including padding. For anonymous
allocations, no information about padding is available, so the
<literal>size</literal> and <literal>allocated_size</literal> columns
- will always be equal. Padding is not meaningful for free memory, so
- the columns will be equal in that case also.
+ will always be equal. Padding is not meaningful for free memory, so the
+ columns will be equal in that case also. For resizable allocations which
+ may span multiple memory pages, the padding includes the padding due to
+ page alignment.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>minimum_size</structfield> <type>int8</type>
+ </para>
+ <para>
+ Minimum size in bytes that the resizable allocation can shrink to. Equals
+ <structfield>size</structfield> for fixed-size allocations, anonymous
+ allocations, and free memory.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>maximum_size</structfield> <type>int8</type>
+ </para>
+ <para>
+ Maximum size in bytes that the resizable allocation can grow to. Equals
+ <structfield>size</structfield> for fixed-size allocations, anonymous
+ allocations, and free memory.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>reserved_space</structfield> <type>int8</type>
+ </para>
+ <para>
+ Address space reserved for the allocation in bytes. For resizable
+ structures, this is the total address space reserved to accommodate
+ growth up to <structfield>maximum_size</structfield>, and is greater
+ than or equal to <structfield>allocated_size</structfield>. For
+ fixed-size allocations, anonymous allocations, and free memory this
+ is the same as <structfield>allocated_size</structfield>.
</para></entry>
</row>
</tbody>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 789cac9fcab..22f953db9d7 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3744,6 +3744,60 @@ my_shmem_init(void *arg)
</para>
</sect3>
+ <sect3 id="xfunc-shared-addin-resizable">
+ <title>Resizable shared memory structures</title>
+
+ <para>
+ A resizable memory structure can be requested using
+ <function>ShmemRequestStruct()</function> by passing
+ <parameter>maximum_size</parameter> along with
+ <parameter>size</parameter>. <parameter>maximum_size</parameter> is the
+ maximum size up to which the structure can grow, whereas
+ <parameter>size</parameter> is the initial size of the structure.
+ Optionally, <parameter>minimum_size</parameter> can be set to the minimum
+ size that the structure can shrink to. While
+ contiguous address space worth <parameter>maximum_size</parameter> is
+ allocated to the structure, only memory worth <parameter>size</parameter>
+ bytes is allocated initially. The <function>init_fn</function> should only
+ initialize the <parameter>size</parameter> amount of memory. The actual
+ memory allocated to this structure at any point in time is given by <link
+ linkend="view-pg-shmem-allocations"><structname>pg_shmem_allocations</structname>.<structfield>allocated_size</structfield></link>
+ and the address space reserved for this structure is given by <link
+ linkend="view-pg-shmem-allocations"><structname>pg_shmem_allocations</structname>.<structfield>reserved_space</structfield></link>.
+ </para>
+
+ <para>
+ The structure can be resized using <function>ShmemResizeStruct()</function>
+ by passing it the structure's <parameter>name</parameter> and the new size
+ which can be anywhere between <parameter>minimum_size</parameter> and
+ <parameter>maximum_size</parameter>. If the new size is smaller than the
+ current size of the structure, the memory between the new size and current
+ size is freed while keeping the contents of the memory up to the new size
+ intact. If the new size is greater than the current size, memory is allocated
+ up to the new size while keeping the current contents of the structure intact.
+ The starting address of the structure does not change because of the resizing
+ operation. The subsystem using this feature needs to take care of the
+ additional synchronization between the resizing process and the processes
+ using the shared structure. Also it needs to implement additional protection
+ to prevent access to the part of the address space beyond the size of the
+ structure when resizing it.
+ </para>
+
+ <para>
+ This functionality is available only on the platforms which provide the APIs
+ necessary to reserve contiguous address space and to allocate or free memory
+ in that address space on demand. Macro <symbol>HAVE_RESIZABLE_SHMEM</symbol>
+ is defined on such platforms. It can be used to guard code related to
+ resizing a shared memory structure. The functionality is available only with
+ mmap'ed memory, so subsystems which use resizable structures may have to
+ additionally disable resizable memory usage when
+ <symbol>shared_memory_type</symbol> is not <symbol>SHMEM_TYPE_MMAP</symbol>.
+ A GUC <xref linkend="guc-have-resizable-shared-memory"/> is set to
+ <literal>on</literal> when this functionality is available in a running
+ server, <literal>off</literal> otherwise.
+ </para>
+ </sect3>
+
<sect3 id="xfunc-shared-addin-dynamic">
<title>Allocating Dynamic Shared Memory After Startup</title>
diff --git a/meson.build b/meson.build
index be97e986e5d..c2b86f9104e 100644
--- a/meson.build
+++ b/meson.build
@@ -2904,6 +2904,22 @@ decl_checks = [
['timingsafe_bcmp', 'string.h'],
]
+# Linux-specific madvise constants needed for resizable shared memory.
+# Usually we use AC_CHECK_DECLS to check for function declarations, but in this
+# case we are using it to detect existence of constants. These constants are
+# used to define HAVE_RESIZABLE_SHMEM which is used in storage/pg_shmem.h as
+# well as storage/shmem.h. The first abstracts the APIs to allocate shared
+# memory segments from the operating system whereas the second abstracts APIs to
+# allocate shared memory to various subsystems. Since they are related but
+# orthogonal to each other, including any one of them in the other file doesn't
+# make sense. pg_config_manual.h is the only place where HAVE_RESIZABLE_SHMEM
+# can be defined and made available to both without including sys/mman.h. But
+# for that we need constants that indicate the existence of following defines.
+decl_checks += [
+ ['MADV_POPULATE_WRITE', 'sys/mman.h'],
+ ['MADV_REMOVE', 'sys/mman.h'],
+]
+
# Need to check for function declarations for these functions, because
# checking for library symbols wouldn't handle deployment target
# restrictions on macOS
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 2e3886cf9fe..bb2a81417c6 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -589,6 +589,27 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
return true;
}
+/*
+ * Get the page size being used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been setup.
+ */
+Size
+GetOSPageSize(void)
+{
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ os_page_size = sysconf(_SC_PAGESIZE);
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
+
/*
* Creates an anonymous mmap()ed shared memory segment.
*
@@ -991,3 +1012,61 @@ PGSharedMemoryDetach(void)
AnonymousShmem = NULL;
}
}
+
+/*
+ * Make sure that the memory of given size from the given address is released.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureFreed(void *addr, Size size)
+{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only anonymous shared memory can be freed")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+
+ if (madvise(addr, size, MADV_REMOVE) == -1)
+ ereport(ERROR,
+ (errmsg("could not free shared memory: %m")));
+#endif
+}
+
+/*
+ * Make sure that the memory of given size from the given address is allocated.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureAllocated(void *addr, Size size)
+{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only anonymous shared memory can be allocated at runtime")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+
+ if (madvise(addr, size, MADV_POPULATE_WRITE) == -1)
+ ereport(ERROR,
+ (errmsg("could not allocate shared memory: %m")));
+#endif
+}
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index 794e4fcb2ad..c1f30665e66 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -648,3 +648,48 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
}
return true;
}
+
+/*
+ * Get the page size used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been set up.
+ */
+Size
+GetOSPageSize(void)
+{
+ SYSTEM_INFO sysinfo;
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ GetSystemInfo(&sysinfo);
+ os_page_size = sysinfo.dwPageSize;
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
+
+/*
+ * PGSharedMemoryEnsureFreed / PGSharedMemoryEnsureAllocated
+ *
+ * Not supported on Windows. These are only meaningful on platforms with
+ * resizable shared memory (mmap + madvise).
+ */
+void
+PGSharedMemoryEnsureFreed(void *addr, Size size)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+}
+
+void
+PGSharedMemoryEnsureAllocated(void *addr, Size size)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index bf6b81e621b..4c6ece598b1 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -192,6 +192,17 @@ InitializeShmemGUCs(void)
Size size_b;
Size size_mb;
Size hp_size;
+ bool have_resizable_shmem;
+
+ /* Does this server support resizable shared memory? */
+#ifdef HAVE_RESIZABLE_SHMEM
+ have_resizable_shmem = (shared_memory_type == SHMEM_TYPE_MMAP);
+#else
+ have_resizable_shmem = false;
+#endif
+ SetConfigOption("have_resizable_shared_memory",
+ have_resizable_shmem ? "on" : "off",
+ PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
/*
* Calculate the shared memory size and round up to the nearest megabyte.
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 1ebffe5a32a..8f006967790 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,11 +19,11 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
- * This module provides facilities to allocate fixed-size structures in shared
- * memory, for things like variables shared between all backend processes.
- * Each such structure has a string name to identify it, specified when it is
- * requested. shmem_hash.c provides a shared hash table implementation on top
- * of that.
+ * This module provides facilities to allocate fixed-size as well as resizable
+ * structures in shared memory, for things like variables shared between all
+ * backend processes. Each such structure has a string name to identify it,
+ * specified when it is requested. shmem_hash.c provides a shared hash table
+ * implementation on top of fixed-size structures.
*
* Shared memory areas should usually not be allocated after postmaster
* startup, although we do allow small allocations later for the benefit of
@@ -102,6 +102,24 @@
* (*options->ptr), and calls the attach_fn callback, if any, for additional
* per-backend setup.
*
+ * Resizable shared memory structures
+ * ----------------------------------
+ *
+ * In order to allocate resizable shared memory structures, set
+ * ShmemStructOpts::maximum_size to the maximum size that the structure
+ * can grow to. The address space for the maximum size will be reserved at
+ * startup, but memory is allocated or freed as the structure grows or shrinks
+ * respectively. ShmemStructOpts::size should be set to the initial size
+ * of the structure, which is the amount of memory allocated at startup.
+ * Optionally, ShmemStructOpts::minimum_size can be set to the minimum
+ * size that the structure can shrink to. After startup, the structure can be
+ * resized by calling ShmemResizeStruct() with the name of the structure and
+ * the new size. ShmemResizeStruct() enforces that the new size is within
+ * [minimum_size, maximum_size].
+ *
+ * While resizable structures can be created after startup, the memory
+ * available for them is quite limited.
+ *
* Legacy ShmemInitStruct()/ShmemInitHash() functions
* --------------------------------------------------
*
@@ -167,6 +185,16 @@ typedef struct
ShmemRequestKind kind;
} ShmemRequest;
+/*
+ * A convenience macro to get the space required for a shmem request consistently.
+ * A resizable structure, requested by non-zero maximum_size, requires space for
+ * its maximum size. Note that on platforms that do not support resizable
+ * shmem, maximum_size is guaranteed to be 0, i.e. all structures are treated
+ * as fixed-size structures.
+ */
+#define SHMEM_REQUEST_SPACE_SIZE(request) \
+ ((request)->options->maximum_size > 0 ? (request)->options->maximum_size : (request)->options->size)
+
static List *pending_shmem_requests;
/*
@@ -269,6 +297,10 @@ typedef struct
void *location; /* location in shared mem */
Size size; /* # bytes requested for the structure */
Size allocated_size; /* # bytes actually allocated */
+ Size minimum_size; /* the minimum size the structure can shrink
+ * to */
+ Size maximum_size; /* the maximum size the structure can grow to */
+ Size reserved_space; /* the total address space reserved */
} ShmemIndexEnt;
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
@@ -277,6 +309,7 @@ static bool firstNumaTouch = true;
static void CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks);
static void InitShmemIndexEntry(ShmemRequest *request);
static bool AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok);
+static Size EstimateAllocatedSize(ShmemIndexEnt *entry);
Datum pg_numa_available(PG_FUNCTION_ARGS);
@@ -342,11 +375,25 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->name == NULL)
elog(ERROR, "shared memory request is missing 'name' option");
+#ifndef HAVE_RESIZABLE_SHMEM
+ if (options->maximum_size > 0)
+ elog(ERROR, "resizable shared memory is not supported on this platform");
+#else
+ if (options->maximum_size > 0 && shared_memory_type != SHMEM_TYPE_MMAP)
+ elog(ERROR, "resizable shared memory requires shared_memory_type = mmap");
+#endif
+
if (IsUnderPostmaster)
{
if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
+ if (options->minimum_size < 0 && options->minimum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid minimum_size %zd for shared memory request for \"%s\"",
+ options->minimum_size, options->name);
+ if (options->maximum_size < 0 && options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
+ options->maximum_size, options->name);
}
else
{
@@ -355,12 +402,36 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
if (options->size <= 0)
elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
options->size, options->name);
+ if (options->minimum_size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->minimum_size < 0)
+ elog(ERROR, "invalid minimum_size %zd for shared memory request for \"%s\"",
+ options->minimum_size, options->name);
+ if (options->maximum_size == SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
+ if (options->maximum_size < 0)
+ elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
+ options->maximum_size, options->name);
}
if (options->alignment != 0 && pg_nextpower2_size_t(options->alignment) != options->alignment)
elog(ERROR, "invalid alignment %zu for shared memory request for \"%s\"",
options->alignment, options->name);
+ if (options->minimum_size > 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE &&
+ options->minimum_size > options->size)
+ elog(ERROR, "resizable shared memory structure \"%s\" should have minimum size (%zd) less than or equal to size (%zd)",
+ options->name, options->minimum_size, options->size);
+
+ if (options->maximum_size > 0 && options->size > options->maximum_size)
+ elog(ERROR, "resizable shared memory structure \"%s\" should have maximum size (%zd) greater than or equal to size (%zd)",
+ options->name, options->maximum_size, options->size);
+
+ if (options->minimum_size > 0 && options->maximum_size > 0 &&
+ options->minimum_size > options->maximum_size)
+ elog(ERROR, "resizable shared memory structure \"%s\" should have minimum size (%zd) less than or equal to maximum size (%zd)",
+ options->name, options->minimum_size, options->maximum_size);
+
/* Check that we're in the right state */
if (shmem_request_state != SRS_REQUESTING)
elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
@@ -382,8 +453,8 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
}
/*
- * ShmemGetRequestedSize() --- estimate the total size of all registered shared
- * memory structures.
+ * ShmemGetRequestedSize() --- estimate the total size of all registered shared
+ * memory structures.
*
* This is called at postmaster startup, before the shared memory segment has
* been created.
@@ -408,7 +479,7 @@ ShmemGetRequestedSize(void)
alignment = PG_CACHE_LINE_SIZE;
size = TYPEALIGN(alignment, size);
- size = add_size(size, request->options->size);
+ size = add_size(size, SHMEM_REQUEST_SPACE_SIZE(request));
}
return size;
@@ -515,6 +586,7 @@ InitShmemIndexEntry(ShmemRequest *request)
ShmemIndexEnt *index_entry;
bool found;
size_t allocated_size;
+ size_t requested_size;
void *structPtr;
/* look it up in the shmem index */
@@ -532,10 +604,18 @@ InitShmemIndexEntry(ShmemRequest *request)
}
/*
- * We inserted the entry to the shared memory index. Allocate requested
- * amount of shared memory for it, and initialize the index entry.
+ * We inserted the entry to the shared memory index. Allocate requested
+ * amount of address space in the shared memory segment for it, and do
+ * basic initialization. The memory gets allocated during initialization as
+ * the corresponding memory pages are written to. Allocate enough space
+ * for a resizable structure to grow to its maximum size. It is expected
+ * that the initialization callback will use only as much memory as the
+ * initial size of the resizable structure. (Well, if it doesn't, more
+ * memory will be allocated initially than expected, but no further harm is
+ * done.)
*/
- structPtr = ShmemAllocRaw(request->options->size,
+ requested_size = SHMEM_REQUEST_SPACE_SIZE(request);
+ structPtr = ShmemAllocRaw(requested_size,
request->options->alignment,
&allocated_size);
if (structPtr == NULL)
@@ -544,13 +624,27 @@ InitShmemIndexEntry(ShmemRequest *request)
hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("not enough shared memory for data structure"
+ errmsg("not enough shared memory space for data structure"
" \"%s\" (%zu bytes requested)",
- name, request->options->size)));
+ name, requested_size)));
}
index_entry->size = request->options->size;
index_entry->allocated_size = allocated_size;
index_entry->location = structPtr;
+ index_entry->reserved_space = allocated_size;
+ if (request->options->maximum_size > 0)
+ {
+ index_entry->minimum_size = request->options->minimum_size;
+ index_entry->maximum_size = request->options->maximum_size;
+
+ /* Adjust allocated size of a resizable structure. */
+ index_entry->allocated_size = EstimateAllocatedSize(index_entry);
+ }
+ else
+ {
+ index_entry->minimum_size = request->options->size;
+ index_entry->maximum_size = request->options->size;
+ }
/* Initialize depending on the kind of shmem area it is */
switch (request->kind)
@@ -595,7 +689,7 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
return false;
}
- /* Check that the size in the index matches the request */
+ /* Check that the sizes in the index match the request. */
if (index_entry->size != request->options->size &&
request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
{
@@ -605,6 +699,40 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
name, index_entry->size, request->options->size)));
}
+ /*
+ * For resizable structures, also check that minimum_size and maximum_size
+ * match. For fixed-size structures, these are derived (set to size) in
+ * the index entry and not meaningful in the request.
+ */
+ if (request->options->maximum_size != 0)
+ {
+ if (index_entry->minimum_size != request->options->minimum_size &&
+ request->options->minimum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" was created with"
+ " different minimum_size: existing %zu, requested %zu",
+ name, index_entry->minimum_size,
+ request->options->minimum_size)));
+ }
+
+ if (index_entry->maximum_size != request->options->maximum_size &&
+ request->options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ {
+ ereport(ERROR,
+ (errmsg("shared memory struct \"%s\" was created with"
+ " different maximum_size: existing %zu, requested %zu",
+ name, index_entry->maximum_size,
+ request->options->maximum_size)));
+ }
+ }
+ else
+ {
+ if (index_entry->minimum_size != index_entry->maximum_size)
+ elog(ERROR, "shared memory struct \"%s\" was created as resizable, but requested as fixed-size",
+ name);
+ }
+
/*
* Re-establish the caller's pointer variable, or do other actions to
* attach depending on the kind of shmem area it is.
@@ -626,6 +754,127 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
return true;
}
+/*
+ * Estimate the actual memory allocated for a resizable structure.
+ *
+ * ... based on the assumption that the memory is allocated in pages.
+ *
+ * The memory pages covered by the current size of a resizable structure are
+ * fully allocated when the currently allocated part of the structure is written
+ * to. The memory page where the maximal structure ends also hosts the next
+ * structure, unless the maximal structure ends on a page boundary. Hence that
+ * page is allocated when the next structure is written to. The memory pages
+ * between the page where the current structure ends and the page where the next
+ * structure starts remain unallocated. Thus the memory allocated for a
+ * resizable structure can be estimated as the total address space reserved for
+ * the structure minus the unallocated memory pages between the current end and
+ * the next structure.
+ */
+static Size
+EstimateAllocatedSize(ShmemIndexEnt *entry)
+{
+ Size page_size = GetOSPageSize();
+ char *align_end = (char *) TYPEALIGN(page_size, (char *) entry->location + entry->size);
+ char *floor_max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) entry->location + entry->maximum_size);
+
+ Assert(entry->maximum_size >= entry->size);
+ Assert(entry->reserved_space >= entry->maximum_size);
+
+ if (align_end < floor_max_end)
+ return entry->reserved_space - (floor_max_end - align_end);
+
+ return entry->reserved_space;
+}
+
+/*
+ * ShmemResizeStruct() --- resize a resizable shared memory structure.
+ *
+ * The new size must be within [minimum_size, maximum_size]. If the structure
+ * is being shrunk, the memory pages that are no longer needed are freed. If
+ * the structure is being expanded, the memory pages that are needed for the
+ * new size are allocated. See EstimateAllocatedSize() for explanation of which
+ * pages are allocated for a resizable structure.
+ */
+void
+ShmemResizeStruct(const char *name, Size new_size)
+{
+#ifndef HAVE_RESIZABLE_SHMEM
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable shared memory is not supported on this platform")));
+#else
+ ShmemIndexEnt *result;
+ bool found;
+ Size page_size = GetOSPageSize();
+ char *new_end;
+
+ Assert(new_size > 0);
+
+ /*
+ * Resizable shared memory structures are only supported with mmap'ed
+ * memory.
+ */
+ Assert(shared_memory_type == SHMEM_TYPE_MMAP);
+
+ /* look it up in the shmem index */
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ result = (ShmemIndexEnt *) hash_search(ShmemIndex, name, HASH_FIND, &found);
+ if (!found)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shmem struct \"%s\" is not initialized", name)));
+
+ Assert(result);
+
+ if (result->minimum_size == result->maximum_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("shared memory struct \"%s\" is not resizable", name)));
+
+ if (new_size < result->minimum_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+ errmsg("cannot shrink shared memory structure \"%s\" below minimum size"
+ " (requested %zu bytes, minimum %zu bytes)",
+ name, new_size, result->minimum_size)));
+
+ if (result->maximum_size < new_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+ errmsg("not enough address space is reserved for resizing structure \"%s\""
+ " (required %zu bytes, reserved %zu bytes)",
+ name, new_size, result->maximum_size)));
+
+ /*
+ * When shrinking, the memory from the page-aligned new end up to the start
+ * of the page containing the end of the reserved space is no longer required.
+ * When expanding, the memory from the start of the page containing the
+ * start of the structure up to the page-aligned new end is required.
+ */
+ new_end = (char *) TYPEALIGN(page_size, (char *) result->location + new_size);
+ if (new_size < result->size)
+ {
+ char *max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location + result->maximum_size);
+
+ if (max_end > new_end)
+ PGSharedMemoryEnsureFreed(new_end, max_end - new_end);
+ }
+ else if (new_size > result->size)
+ {
+ char *struct_start = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location);
+
+ if (new_end > struct_start)
+ PGSharedMemoryEnsureAllocated(struct_start, new_end - struct_start);
+ }
+
+ /* Update shmem index entry. */
+ result->size = new_size;
+ result->allocated_size = EstimateAllocatedSize(result);
+
+ LWLockRelease(ShmemIndexLock);
+#endif
+}
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
@@ -732,6 +981,11 @@ InitShmemAllocator(PGShmemHeader *seghdr)
Assert(!found);
result->size = ShmemAllocator->index_size;
result->allocated_size = ShmemAllocator->index_size;
+#ifdef HAVE_RESIZABLE_SHMEM
+ result->minimum_size = result->size;
+ result->maximum_size = result->size;
+ result->reserved_space = result->allocated_size;
+#endif
result->location = ShmemAllocator->index;
}
}
@@ -1075,7 +1329,7 @@ mul_size(Size s1, Size s2)
Datum
pg_get_shmem_allocations(PG_FUNCTION_ARGS)
{
-#define PG_GET_SHMEM_SIZES_COLS 4
+#define PG_GET_SHMEM_SIZES_COLS 7
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
HASH_SEQ_STATUS hstat;
ShmemIndexEnt *ent;
@@ -1097,7 +1351,17 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
values[1] = Int64GetDatum((char *) ent->location - (char *) ShmemSegHdr);
values[2] = Int64GetDatum(ent->size);
values[3] = Int64GetDatum(ent->allocated_size);
- named_allocated += ent->allocated_size;
+ values[4] = Int64GetDatum(ent->minimum_size);
+ values[5] = Int64GetDatum(ent->maximum_size);
+ values[6] = Int64GetDatum(ent->reserved_space);
+
+ /*
+ * Keep track of the total reserved space for named shmem areas, to be
+ * able to calculate the amount of shared memory allocated for
+ * anonymous areas and the amount of free shared memory at the end of
+ * the segment.
+ */
+ named_allocated += ent->reserved_space;
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
values, nulls);
@@ -1108,6 +1372,9 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = true;
values[2] = Int64GetDatum(ShmemAllocator->free_offset - named_allocated);
values[3] = values[2];
+ values[4] = values[2];
+ values[5] = values[2];
+ values[6] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
/* output as-of-yet unused shared memory */
@@ -1116,6 +1383,9 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = false;
values[2] = Int64GetDatum(ShmemSegHdr->totalsize - ShmemAllocator->free_offset);
values[3] = values[2];
+ values[4] = values[2];
+ values[5] = values[2];
+ values[6] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
LWLockRelease(ShmemIndexLock);
@@ -1303,23 +1573,9 @@ pg_get_shmem_allocations_numa(PG_FUNCTION_ARGS)
Size
pg_get_shmem_pagesize(void)
{
- Size os_page_size;
-#ifdef WIN32
- SYSTEM_INFO sysinfo;
-
- GetSystemInfo(&sysinfo);
- os_page_size = sysinfo.dwPageSize;
-#else
- os_page_size = sysconf(_SC_PAGESIZE);
-#endif
-
Assert(IsUnderPostmaster);
- Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
-
- if (huge_pages_status == HUGE_PAGES_ON)
- GetHugePageSize(&os_page_size, NULL);
- return os_page_size;
+ return GetOSPageSize();
}
Datum
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index fcb6ab80583..22b7e461d3a 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -1219,6 +1219,13 @@
max => '1000.0',
},
+{ name => 'have_resizable_shared_memory', type => 'bool', context => 'PGC_INTERNAL', group => 'PRESET_OPTIONS',
+ short_desc => 'Shows whether the running server supports resizable shared memory.',
+ flags => 'GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE',
+ variable => 'have_resizable_shared_memory_enabled',
+ boot_val => 'HAVE_RESIZABLE_SHARED_MEMORY_ENABLED',
+},
+
{ name => 'hba_file', type => 'string', context => 'PGC_POSTMASTER', group => 'FILE_LOCATIONS',
short_desc => 'Sets the server\'s "hba" configuration file.',
flags => 'GUC_SUPERUSER_ONLY',
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d9ca13baff9..924f95a4a70 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -653,6 +653,13 @@ static bool assert_enabled = DEFAULT_ASSERT_ENABLED;
#endif
static bool exec_backend_enabled = EXEC_BACKEND_ENABLED;
+#ifdef HAVE_RESIZABLE_SHMEM
+#define HAVE_RESIZABLE_SHARED_MEMORY_ENABLED true
+#else
+#define HAVE_RESIZABLE_SHARED_MEMORY_ENABLED false
+#endif
+static bool have_resizable_shared_memory_enabled = HAVE_RESIZABLE_SHARED_MEMORY_ENABLED;
+
static char *recovery_target_timeline_string;
static char *recovery_target_string;
static char *recovery_target_xid_string;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 99fa9a6ede2..3a622525dfc 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8709,8 +8709,8 @@
{ oid => '5052', descr => 'allocations from the main shared memory segment',
proname => 'pg_get_shmem_allocations', prorows => '50', proretset => 't',
provolatile => 'v', prorettype => 'record', proargtypes => '',
- proallargtypes => '{text,int8,int8,int8}', proargmodes => '{o,o,o,o}',
- proargnames => '{name,off,size,allocated_size}',
+ proallargtypes => '{text,int8,int8,int8,int8,int8,int8}', proargmodes => '{o,o,o,o,o,o,o}',
+ proargnames => '{name,off,size,allocated_size,minimum_size,maximum_size,reserved_space}',
prosrc => 'pg_get_shmem_allocations',
proacl => '{POSTGRES=X,pg_read_all_stats=X}' },
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 4f8113c144b..89c4871532e 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -85,6 +85,14 @@
don't. */
#undef HAVE_DECL_F_FULLFSYNC
+/* Define to 1 if you have the declaration of `MADV_POPULATE_WRITE', and to 0
+ if you don't. */
+#undef HAVE_DECL_MADV_POPULATE_WRITE
+
+/* Define to 1 if you have the declaration of `MADV_REMOVE', and to 0 if you
+ don't. */
+#undef HAVE_DECL_MADV_REMOVE
+
/* Define to 1 if you have the declaration of `memset_s', and to 0 if you
don't. */
#undef HAVE_DECL_MEMSET_S
diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index 521b49b8888..b09d6c91324 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -131,6 +131,15 @@
#define EXEC_BACKEND
#endif
+/*
+ * HAVE_RESIZABLE_SHMEM indicates whether resizable shared memory structures are
+ * supported. The implementation requires Linux-specific madvise constants
+ * (MADV_REMOVE and MADV_POPULATE_WRITE).
+ */
+#if HAVE_DECL_MADV_REMOVE && HAVE_DECL_MADV_POPULATE_WRITE && !defined(EXEC_BACKEND)
+#define HAVE_RESIZABLE_SHMEM
+#endif
+
/*
* USE_POSIX_FADVISE controls whether Postgres will attempt to use the
* posix_fadvise() kernel call. Usually the automatic configure tests are
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index 10c7b065861..f0efbf2aec1 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -89,6 +89,9 @@ extern PGShmemHeader *PGSharedMemoryCreate(Size size,
PGShmemHeader **shim);
extern bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2);
extern void PGSharedMemoryDetach(void);
+extern void PGSharedMemoryEnsureFreed(void *addr, Size size);
+extern void PGSharedMemoryEnsureAllocated(void *addr, Size size);
extern void GetHugePageSize(Size *hugepagesize, int *mmap_flags);
+extern Size GetOSPageSize(void);
#endif /* PG_SHMEM_H */
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index af7fe893bc4..0e6d5a63f28 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -57,6 +57,22 @@ typedef struct ShmemStructOpts
*/
size_t alignment;
+ /*
+ * Minimum size this structure can shrink to. Should be set to 0 for
+ * fixed-size structures.
+ */
+ ssize_t minimum_size;
+
+ /*
+ * Maximum size this structure can grow up to in the future. The memory is not
+ * allocated right away but the corresponding address space is reserved so
+ * that memory can be mapped to it when the structure grows. Typically
+ * should be used for large resizable structures which need several pages
+ * worth of contiguous memory. Should be set to 0 for fixed-size
+ * structures.
+ */
+ ssize_t maximum_size;
+
/*
* When the shmem area is initialized or attached to, pointer to it is
* stored in *ptr. It usually points to a global variable, used to access
@@ -168,6 +184,7 @@ typedef struct ShmemCallbacks
extern void RegisterShmemCallbacks(const ShmemCallbacks *callbacks);
extern bool ShmemAddrIsValid(const void *addr);
+extern void ShmemResizeStruct(const char *name, Size new_size);
/*
* These macros provide syntactic sugar for calling the underlying functions
diff --git a/src/test/modules/test_shmem/meson.build b/src/test/modules/test_shmem/meson.build
index fb4bf328b8f..bf70b32aa1b 100644
--- a/src/test/modules/test_shmem/meson.build
+++ b/src/test/modules/test_shmem/meson.build
@@ -28,6 +28,7 @@ tests += {
'tap': {
'tests': [
't/001_late_shmem_alloc.pl',
+ 't/002_resizable_shmem.pl',
],
},
}
diff --git a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
index c154f57682a..92a8f3b4873 100644
--- a/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
+++ b/src/test/modules/test_shmem/t/001_late_shmem_alloc.pl
@@ -45,5 +45,36 @@ else
ok($attach_count1 == 0 && $attach_count2 == 0, "attach callback is not called when loaded via shared_preload_libraries");
}
+###
+# Test that a fixed-size shared memory structure cannot be resized.
+# Only relevant on platforms that support resizable shmem.
+###
+my $have_resizable_shmem =
+ $node->safe_psql('postgres', 'SHOW have_resizable_shared_memory;') eq 'on';
+
+if ($have_resizable_shmem)
+{
+ # Try expanding the fixed-size structure
+ my ($ret, $stdout, $stderr) =
+ $node->psql("postgres", "SELECT test_shmem_resize_fixed(1000);");
+ isnt($ret, 0, "expanding a fixed-size structure fails");
+ like($stderr, qr/is not resizable/, "expand error message mentions not resizable");
+
+ # Try shrinking the fixed-size structure
+ ($ret, $stdout, $stderr) =
+ $node->psql("postgres", "SELECT test_shmem_resize_fixed(1);");
+ isnt($ret, 0, "shrinking a fixed-size structure fails");
+ like($stderr, qr/is not resizable/, "shrink error message mentions not resizable");
+}
+
+###
+# Test that minimum_size and maximum_size equal size for a fixed-size structure
+# in pg_shmem_allocations.
+###
+is($node->safe_psql('postgres',
+ "SELECT minimum_size = size AND maximum_size = size FROM pg_shmem_allocations WHERE name = 'test_shmem area';"),
+ 't', "fixed-size structure has minimum_size = maximum_size = size");
+
$node->stop;
+
done_testing();
diff --git a/src/test/modules/test_shmem/t/002_resizable_shmem.pl b/src/test/modules/test_shmem/t/002_resizable_shmem.pl
new file mode 100644
index 00000000000..1c7a60407d8
--- /dev/null
+++ b/src/test/modules/test_shmem/t/002_resizable_shmem.pl
@@ -0,0 +1,240 @@
+# Copyright (c) 2025-2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Test resizable shared memory functionality, both when loaded at startup via
+# shared_preload_libraries and when loaded after startup (late allocation).
+
+# Verify that RssShmem does not exceed the total allocated shared memory.
+# Allocated shared memory should be mostly the memory allocated to the resizable
+# structure. Any large unexpected increase in RssShmem would indicate an
+# unexpected increase in the memory allocated to the resizable structure.
+sub check_shmem_usage
+{
+ my ($session, $label, $node) = @_;
+
+ my $rss_shmem = $session->query_safe('SELECT resizable_shmem_usage();',
+ verbose => 0);
+ my $total_alloc = $node->safe_psql('postgres',
+ "SELECT sum(allocated_size) FROM pg_shmem_allocations;");
+
+ note "$label: RssShmem=$rss_shmem, sum(allocated_size)=$total_alloc";
+ ok($rss_shmem <= $total_alloc, "$label: RssShmem does not exceed total allocated size");
+}
+
+# Test a resize operation: resize, verify old data, write new data, verify
+# new data, and check shmem usage. Returns updated ($num_entries, $value).
+sub test_resize
+{
+ my ($node, $prefix, $old_num_entries, $old_value, $new_num_entries, $new_value, $label) = @_;
+
+ $label = "$prefix: $label";
+
+ my $session1 = $node->background_psql('postgres');
+ my $session2 = $node->background_psql('postgres');
+
+ $session1->query_safe("SELECT resizable_shmem_resize($new_num_entries);",
+ verbose => 0);
+
+ # Old data should still be intact in the (possibly smaller) area
+ my $readable_entries = ($new_num_entries < $old_num_entries) ? $new_num_entries : $old_num_entries;
+ is($session1->query_safe("SELECT resizable_shmem_read($readable_entries, $old_value);",
+ verbose => 0),
+ 't', "old data readable after $label");
+
+ $session2->query_safe("SELECT resizable_shmem_write($new_value);",
+ verbose => 0);
+ is($session1->query_safe("SELECT resizable_shmem_read($new_num_entries, $new_value);",
+ verbose => 0),
+ 't', "new data readable after $label");
+
+ check_shmem_usage($session1, "$label (session 1)", $node);
+ check_shmem_usage($session2, "$label (session 2)", $node);
+
+ $session1->quit;
+ $session2->quit;
+
+ return ($new_num_entries, $new_value);
+}
+
+# Run the full suite of resizable shared memory tests on the given node.
+sub run_resizable_tests
+{
+ my ($node, $initial_entries, $max_entries, $prefix) = @_;
+
+ my $have_resizable_shmem = $node->safe_psql('postgres', 'SHOW have_resizable_shared_memory;') eq 'on';
+
+ my $num_entries = $initial_entries;
+
+ # Basic read/write should work on all platforms
+ my $value = 100;
+ $node->safe_psql('postgres', "SELECT resizable_shmem_write($value);");
+ is($node->safe_psql('postgres', "SELECT resizable_shmem_read($num_entries, $value);"),
+ 't', "$prefix: data read after write successful");
+
+ if ($have_resizable_shmem)
+ {
+ # Initial structure state
+ my $session1 = $node->background_psql('postgres');
+ my $session2 = $node->background_psql('postgres');
+
+ $value = 100;
+ # Write and read the initial set of entries.
+ $session1->query_safe("SELECT resizable_shmem_write($value);", verbose => 0);
+ is($session2->query_safe("SELECT resizable_shmem_read($num_entries, $value);",
+ verbose => 0),
+ 't', "$prefix: data read after write successful");
+ check_shmem_usage($session1, "$prefix: initial write (session 1)", $node);
+ check_shmem_usage($session2, "$prefix: initial write (session 2)", $node);
+ $session1->quit;
+ $session2->quit;
+
+ # Verify no other structure is resizable
+ is($node->safe_psql('postgres', "SELECT count(*) FROM pg_shmem_allocations WHERE name <> 'resizable_shmem' AND maximum_size <> minimum_size;"),
+ '0', "$prefix: no other resizable structures");
+
+ # Resize to maximum
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $max_entries, 500, 'resize to maximum');
+
+ # Shrink to 75% of max
+ my $shrink_entries = int($max_entries * 3 / 4);
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $shrink_entries, 999, 'shrinking');
+
+ # Resize to the same size (no-op)
+ ($num_entries, $value) = test_resize($node, $prefix, $num_entries, $value,
+ $num_entries, 1999, 'no-op resize');
+
+ # Test resize failure (attempt to resize beyond max - should fail)
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', "SELECT resizable_shmem_resize(" . ($max_entries * 2) . ");");
+ ok($ret != 0 || $stderr =~ /ERROR/, "$prefix: Resize beyond maximum fails");
+ }
+ else
+ {
+ # On unsupported platforms, resizing should fail with a clear error
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', "SELECT resizable_shmem_resize($num_entries);");
+ ok($ret != 0, "$prefix: resize fails on unsupported platform");
+ like($stderr, qr/not supported/, "$prefix: resize error mentions not supported");
+ }
+}
+
+### Set up a test node.
+#
+# Configure minimal shared memory so that the resizable_shmem structure dominates
+# and any unexpected increase is easy to detect.
+#
+# Also disable huge pages so that RssShmem and allocated_size are comparable.
+# The latter is already aligned to the default page size.
+###
+my $node = PostgreSQL::Test::Cluster->new('resizable_shmem');
+$node->init;
+
+$node->append_conf('postgresql.conf', 'huge_pages = off');
+$node->append_conf('postgresql.conf', 'shared_buffers = 128kB');
+$node->append_conf('postgresql.conf', 'max_connections = 5');
+$node->append_conf('postgresql.conf', 'max_worker_processes = 0');
+$node->append_conf('postgresql.conf', 'max_wal_senders = 0');
+$node->append_conf('postgresql.conf', 'max_prepared_transactions = 0');
+$node->append_conf('postgresql.conf', 'max_locks_per_transaction = 10');
+$node->append_conf('postgresql.conf', 'max_pred_locks_per_transaction = 10');
+$node->append_conf('postgresql.conf', 'wal_buffers = 32kB');
+
+###
+# Test 1: Startup allocation via shared_preload_libraries
+###
+my $startup_initial = 25 * 1024 * 1024;
+my $startup_max = 100 * 1024 * 1024;
+
+$node->append_conf('postgresql.conf', 'shared_preload_libraries = test_shmem');
+$node->append_conf('postgresql.conf', "test_shmem.initial_entries = $startup_initial");
+$node->append_conf('postgresql.conf', "test_shmem.max_entries = $startup_max");
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION test_shmem;');
+run_resizable_tests($node, $startup_initial, $startup_max, 'startup');
+
+###
+# Test 2: Late allocation (loaded after startup, not in shared_preload_libraries).
+# Use much smaller sizes since only ~100KB of shared memory is available for
+# structures allocated after startup.
+###
+my $late_initial = 5 * 1024;
+my $late_max = 12 * 1024;
+
+$node->safe_psql('postgres', qq{
+ ALTER SYSTEM RESET shared_preload_libraries;
+ ALTER SYSTEM SET test_shmem.initial_entries = $late_initial;
+ ALTER SYSTEM SET test_shmem.max_entries = $late_max;
+});
+$node->safe_psql('postgres', 'DROP EXTENSION test_shmem;');
+$node->restart;
+
+$node->safe_psql('postgres', 'CREATE EXTENSION test_shmem;');
+run_resizable_tests($node, $late_initial, $late_max, 'late');
+
+###
+# Test sysv shared memory does not support resizable shmem. Only relevant on
+# platforms that support resizable shmem (HAVE_RESIZABLE_SHMEM), since the
+# module only sets maximum_size in that case.
+###
+my $resizable_shmem_binary = $node->safe_psql('postgres', 'SHOW have_resizable_shared_memory;') eq 'on';
+if ($resizable_shmem_binary)
+{
+ ###
+ # Test 3: Verify that CREATE EXTENSION fails with sysv shared memory
+ # when loaded after startup (not in shared_preload_libraries).
+ ###
+ $node->safe_psql('postgres', 'DROP EXTENSION test_shmem;');
+
+ # Remove settings that would cause the library to auto-load at startup:
+ # shared_preload_libraries and module-prefixed GUCs. ALTER SYSTEM RESET
+ # only affects postgresql.auto.conf, so we must use adjust_conf to remove
+ # from postgresql.conf.
+ $node->adjust_conf('postgresql.conf', 'shared_preload_libraries', undef);
+ $node->adjust_conf('postgresql.conf', 'test_shmem.initial_entries', undef);
+ $node->adjust_conf('postgresql.conf', 'test_shmem.max_entries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'shared_preload_libraries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'test_shmem.initial_entries', undef);
+ $node->adjust_conf('postgresql.auto.conf', 'test_shmem.max_entries', undef);
+ $node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET shared_memory_type = 'sysv';
+ });
+
+ $node->restart;
+
+ is($node->safe_psql('postgres', 'SHOW have_resizable_shared_memory'), 'off',
+ 'have_resizable_shared_memory is off with sysv');
+
+ my ($ret, $stdout, $stderr) =
+ $node->psql('postgres', 'CREATE EXTENSION test_shmem;');
+ ok($ret != 0, 'CREATE EXTENSION fails with resizable shmem on sysv');
+ like($stderr, qr/resizable shared memory requires shared_memory_type = mmap/,
+ 'CREATE EXTENSION error mentions shared_memory_type = mmap requirement');
+
+ ###
+ # Test 4: Verify that resizable structures are also rejected with sysv
+ # shared memory when loaded at startup via shared_preload_libraries.
+ ###
+ $node->safe_psql('postgres', qq{
+ ALTER SYSTEM SET shared_preload_libraries = 'test_shmem';
+ ALTER SYSTEM SET test_shmem.initial_entries = $startup_initial;
+ ALTER SYSTEM SET test_shmem.max_entries = $startup_max;
+ });
+ $node->stop;
+
+ ok(!$node->start(fail_ok => 1),
+ 'server fails to start with resizable shmem on sysv');
+
+ my $log = slurp_file($node->logfile);
+ like($log, qr/resizable shared memory requires shared_memory_type = mmap/,
+ 'log mentions shared_memory_type = mmap requirement');
+}
+
+done_testing();
diff --git a/src/test/modules/test_shmem/test_shmem--1.0.sql b/src/test/modules/test_shmem/test_shmem--1.0.sql
index 2d01fd9256c..a03c90e025b 100644
--- a/src/test/modules/test_shmem/test_shmem--1.0.sql
+++ b/src/test/modules/test_shmem/test_shmem--1.0.sql
@@ -7,3 +7,37 @@
CREATE FUNCTION get_test_shmem_attach_count()
RETURNS pg_catalog.int4 STRICT
AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION test_shmem_resize_fixed(pg_catalog.int4)
+RETURNS pg_catalog.void STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+-- Function to resize the resizable test structure in the shared memory
+CREATE FUNCTION resizable_shmem_resize(new_entries pg_catalog.int4)
+RETURNS pg_catalog.void STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+-- Function to write data to all entries in the test structure in shared memory
+-- Writing all the entries makes sure that the memory is actually allocated and
+-- mapped to the process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_write(entry_value pg_catalog.int4)
+RETURNS pg_catalog.void STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+-- Function to verify that specified number of initial entries have expected value.
+-- Reading all the entries makes sure that the memory is actually mapped to the
+-- process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_read(entry_count pg_catalog.int4, entry_value pg_catalog.int4)
+RETURNS pg_catalog.bool STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+-- Function to report memory mapped against the main shared memory segment in
+-- the backend where this function runs.
+CREATE FUNCTION resizable_shmem_usage()
+RETURNS pg_catalog.int8 STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+-- Function to get the shared memory page size
+CREATE FUNCTION resizable_shmem_pagesize()
+RETURNS pg_catalog.int4 STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_shmem/test_shmem.c b/src/test/modules/test_shmem/test_shmem.c
index 9bd4012b435..72004df4083 100644
--- a/src/test/modules/test_shmem/test_shmem.c
+++ b/src/test/modules/test_shmem/test_shmem.c
@@ -3,9 +3,10 @@
* test_shmem.c
* Helpers to test shmem allocation routines
*
- * Test basic memory allocation in an extension module. One notable feature
- * that is not exercised by any other module in the repository is the
- * allocating (non-DSM) shared memory after postmaster startup.
+ * Test shared memory allocation in an extension module. Notably the module
+ * tests allocating (non-DSM) shared memory after postmaster startup and
+ * resizable shared memory. These two aspects of shared memory are not tested
+ * anywhere else.
*
* Copyright (c) 2020-2026, PostgreSQL Global Development Group
*
@@ -17,24 +18,58 @@
#include "postgres.h"
+#include <limits.h>
+#include <stdio.h>
+
+#include "commands/extension.h"
#include "fmgr.h"
#include "miscadmin.h"
#include "storage/shmem.h"
+#include "utils/builtins.h"
+#include "utils/guc.h"
+#include "utils/memutils.h"
PG_MODULE_MAGIC;
-typedef struct TestShmemData
+#define TEST_ENTRY_SIZE sizeof(int32) /* Size of each entry */
+
+typedef struct TestFixedData
+
{
int value;
bool initialized;
int attach_count;
-} TestShmemData;
+} TestFixedData;
+
+static TestFixedData *FixedShmem;
+
+/*
+ * Resizable shared structure
+ *
+ * The test performs resizing, reads or writes, only one at a time and never
+ * concurrently. Hence, there is no need for locks in the test structure.
+ */
+typedef struct TestResizableData
+{
+ /* Metadata */
+ int32 num_entries; /* Number of entries that can fit */
-static TestShmemData *TestShmem;
+ /* Data area - variable size */
+ int32 data[FLEXIBLE_ARRAY_MEMBER];
+} TestResizableData;
+
+static TestResizableData *ResizableShmem = NULL;
static bool attached_or_initialized = false;
+/* GUC variables controlling the size of the resizable structure */
+static int initial_entries;
+static int max_entries;
+
+/* Whether to use SHMEM_ATTACH_UNKNOWN_SIZE when attaching resizable structure. */
+static bool use_unknown_size = false;
+
static void test_shmem_request(void *arg);
static void test_shmem_init(void *arg);
static void test_shmem_attach(void *arg);
@@ -49,33 +84,66 @@ static const ShmemCallbacks TestShmemCallbacks = {
static void
test_shmem_request(void *arg)
{
+ Size initial_size = add_size(offsetof(TestResizableData, data),
+ mul_size(initial_entries, TEST_ENTRY_SIZE));
+
+/*
+ * Create a resizable structure on platforms that support it; otherwise create
+ * it as a fixed-size structure. An alternative would be to conditionally
+ * include .maximum_size in the call to ShmemRequestStruct().
+ */
+#ifdef HAVE_RESIZABLE_SHMEM
+ Size max_size = add_size(offsetof(TestResizableData, data),
+ mul_size(max_entries, TEST_ENTRY_SIZE));
+ Size min_size = offsetof(TestResizableData, data);
+
+#else
+ Size max_size = 0;
+ Size min_size = 0;
+#endif
+
elog(LOG, "test_shmem_request callback called");
+ /* Fixed-size structure */
ShmemRequestStruct(.name = "test_shmem area",
- .size = sizeof(TestShmemData),
- .ptr = (void **) &TestShmem);
+ .size = sizeof(TestFixedData),
+ .ptr = (void **) &FixedShmem);
+
+ /* Resizable structure */
+ ShmemRequestStruct(.name = "resizable_shmem",
+ .size = use_unknown_size ? SHMEM_ATTACH_UNKNOWN_SIZE : initial_size,
+ .minimum_size = min_size,
+ .maximum_size = max_size,
+ .ptr = (void **) &ResizableShmem);
}
static void
test_shmem_init(void *arg)
{
elog(LOG, "init callback called");
- if (TestShmem->initialized)
+
+ if (FixedShmem->initialized)
elog(ERROR, "shmem area already initialized");
- TestShmem->initialized = true;
+ FixedShmem->initialized = true;
if (attached_or_initialized)
elog(ERROR, "attach or initialize already called in this process");
attached_or_initialized = true;
+
+ /* Resizable structure should have been already allocated. Initialize it. */
+ Assert(ResizableShmem != NULL);
+
+ ResizableShmem->num_entries = initial_entries;
+ memset(ResizableShmem->data, 0, mul_size(initial_entries, TEST_ENTRY_SIZE));
}
static void
test_shmem_attach(void *arg)
{
elog(LOG, "test_shmem_attach callback called");
- if (!TestShmem->initialized)
+ if (!FixedShmem->initialized)
elog(ERROR, "shmem area not yet initialized");
- TestShmem->attach_count++;
+ FixedShmem->attach_count++;
if (attached_or_initialized)
elog(ERROR, "attach or initialize already called in this process");
@@ -85,17 +153,231 @@ test_shmem_attach(void *arg)
void
_PG_init(void)
{
+ int guc_context;
+
elog(LOG, "test_shmem module's _PG_init called");
+
+ /*
+ * Use PGC_POSTMASTER when loaded at startup so the values are fixed once
+ * the shared memory segment is created. When loaded after startup
+ * PGC_POSTMASTER is not allowed, so we use PGC_SIGHUP instead. Although
+ * we do not intend to change these values at config reload, PGC_SIGHUP is
+ * the least permissive context that allows defining the GUC after startup
+ * and still prevents it from being changed via SET.
+ */
+ if (process_shared_preload_libraries_in_progress)
+ guc_context = PGC_POSTMASTER;
+ else
+ guc_context = PGC_SIGHUP;
+
+ /*
+ * Set defaults very low so that the structure can also be loaded after
+ * startup, when only about 100KB of extra memory is available.
+ */
+ DefineCustomIntVariable("test_shmem.initial_entries",
+ "Initial number of entries in the test structure.",
+ NULL,
+ &initial_entries,
+ 100, /* ~ 400 bytes */
+ 1,
+ INT_MAX,
+ guc_context,
+ 0,
+ NULL, NULL, NULL);
+
+ DefineCustomIntVariable("test_shmem.max_entries",
+ "Maximum number of entries in the test structure.",
+ NULL,
+ &max_entries,
+ 200, /* ~ 800 bytes */
+ 1,
+ INT_MAX,
+ guc_context,
+ 0,
+ NULL, NULL, NULL);
+
+ /*
+ * When loaded after startup by a backend that is not creating the
+ * extension, the shared memory might have been resized to a size other
+ * than the initial size. Use SHMEM_ATTACH_UNKNOWN_SIZE to attach without
+ * knowing the exact size.
+ */
+ if (!process_shared_preload_libraries_in_progress && !creating_extension)
+ use_unknown_size = true;
+
RegisterShmemCallbacks(&TestShmemCallbacks);
}
+/* Fixed-size structure APIs */
PG_FUNCTION_INFO_V1(get_test_shmem_attach_count);
Datum
get_test_shmem_attach_count(PG_FUNCTION_ARGS)
{
if (!attached_or_initialized)
elog(ERROR, "shmem area not attached or initialized in this process");
- if (!TestShmem->initialized)
+ if (!FixedShmem->initialized)
elog(ERROR, "shmem area not yet initialized");
- PG_RETURN_INT32(TestShmem->attach_count);
+ PG_RETURN_INT32(FixedShmem->attach_count);
+}
+
+/*
+ * Attempt to resize the fixed-size shared memory structure. This should
+ * fail because the structure was not allocated with a maximum_size.
+ */
+PG_FUNCTION_INFO_V1(test_shmem_resize_fixed);
+Datum
+test_shmem_resize_fixed(PG_FUNCTION_ARGS)
+{
+ int32 new_size = PG_GETARG_INT32(0);
+
+ ShmemResizeStruct("test_shmem area", new_size);
+ PG_RETURN_VOID();
+}
+
+/* Resizable structure APIs */
+
+/*
+ * Resize the shared memory structure to accommodate the specified number of
+ * entries.
+ *
+ * On the platforms which do not support resizable shared memory,
+ * ShmemResizeStruct() will raise an error, so this function will fail if the
+ * caller tries to resize the structure.
+ */
+PG_FUNCTION_INFO_V1(resizable_shmem_resize);
+Datum
+resizable_shmem_resize(PG_FUNCTION_ARGS)
+{
+ int32 new_entries = PG_GETARG_INT32(0);
+ Size new_size;
+
+ if (!ResizableShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ new_size = add_size(offsetof(TestResizableData, data),
+ mul_size(new_entries, TEST_ENTRY_SIZE));
+ ShmemResizeStruct("resizable_shmem", new_size);
+ ResizableShmem->num_entries = new_entries;
+
+ PG_RETURN_VOID();
+}
+
+
+/*
+ * Write the given integer value to all entries in the data array.
+ */
+PG_FUNCTION_INFO_V1(resizable_shmem_write);
+Datum
+resizable_shmem_write(PG_FUNCTION_ARGS)
+{
+ int32 entry_value = PG_GETARG_INT32(0);
+ int32 i;
+
+ if (!ResizableShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ /* Write the value to all current entries */
+ for (i = 0; i < ResizableShmem->num_entries; i++)
+ ResizableShmem->data[i] = entry_value;
+
+ PG_RETURN_VOID();
+}
+
+/*
+ * Check whether the first 'entry_count' entries all have the expected 'entry_value'.
+ * Returns true if all match, false otherwise.
+ */
+PG_FUNCTION_INFO_V1(resizable_shmem_read);
+Datum
+resizable_shmem_read(PG_FUNCTION_ARGS)
+{
+ int32 entry_count = PG_GETARG_INT32(0);
+ int32 entry_value = PG_GETARG_INT32(1);
+ int32 i;
+
+ if (ResizableShmem == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ if (entry_count < 0 || entry_count > ResizableShmem->num_entries)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("entry_count %d is out of range (0..%d)", entry_count, ResizableShmem->num_entries)));
+
+ for (i = 0; i < entry_count; i++)
+ {
+ if (ResizableShmem->data[i] != entry_value)
+ PG_RETURN_BOOL(false);
+ }
+
+ PG_RETURN_BOOL(true);
+}
+
+/*
+ * Return the memory mapped against the main shared memory segment in this
+ * backend.
+ *
+ * The VMA containing our resizable_shmem pointer is used to determine the main
+ * memory segment. RSS + Swap (in bytes) for that VMA from /proc/self/smaps is
+ * returned.
+ */
+PG_FUNCTION_INFO_V1(resizable_shmem_usage);
+Datum
+resizable_shmem_usage(PG_FUNCTION_ARGS)
+{
+ FILE *f;
+ char line[256];
+ int64 rss_kb = -1;
+ int64 swap_kb = -1;
+ uintptr_t target = (uintptr_t) ResizableShmem;
+ bool in_target_vma = false;
+ size_t result;
+
+ f = fopen("/proc/self/smaps", "r");
+ if (f == NULL)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open /proc/self/smaps: %m")));
+
+ while (fgets(line, sizeof(line), f) != NULL)
+ {
+ unsigned long start;
+ unsigned long end;
+
+ if (sscanf(line, "%lx-%lx", &start, &end) == 2)
+ {
+ in_target_vma = (target >= start && target < end);
+ }
+ else if (in_target_vma)
+ {
+ if (rss_kb == -1)
+ sscanf(line, "Rss: %ld kB", &rss_kb);
+ if (swap_kb == -1)
+ sscanf(line, "Swap: %ld kB", &swap_kb);
+ if (rss_kb >= 0 && swap_kb >= 0)
+ break;
+ }
+ }
+
+ fclose(f);
+
+ result = rss_kb >= 0 ? mul_size(rss_kb, 1024) : 0;
+ result = add_size(result, swap_kb >= 0 ? mul_size(swap_kb, 1024) : 0);
+
+ PG_RETURN_INT64(result);
+}
+
+/*
+ * resizable_shmem_pagesize() - Get the shared memory page size
+ */
+PG_FUNCTION_INFO_V1(resizable_shmem_pagesize);
+Datum
+resizable_shmem_pagesize(PG_FUNCTION_ARGS)
+{
+ PG_RETURN_INT32(pg_get_shmem_pagesize());
}
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index a65a5bf0c4f..a882d799133 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1770,8 +1770,11 @@ pg_shadow| SELECT pg_authid.rolname AS usename,
pg_shmem_allocations| SELECT name,
off,
size,
- allocated_size
- FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size);
+ allocated_size,
+ minimum_size,
+ maximum_size,
+ reserved_space
+ FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size, minimum_size, maximum_size, reserved_space);
pg_shmem_allocations_numa| SELECT name,
numa_node,
size
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 637c669a146..c70bee8d837 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3153,7 +3153,8 @@ TestDSMRegistryHashEntry
TestDSMRegistryStruct
TestDecodingData
TestDecodingTxnData
-TestShmemData
+TestFixedData
+TestResizableData
TestSpec
TestValueType
TextFreq
base-commit: 55890a919454a2165031a04b175ca92e3ed70e69
--
2.34.1
^ permalink raw reply [nested|flat] 75+ messages in thread
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-07 19:38 Matthias van de Meent <[email protected]>
parent: Ashutosh Bapat <[email protected]>
1 sibling, 1 reply; 75+ messages in thread
From: Matthias van de Meent @ 2026-04-07 19:38 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Tue, 7 Apr 2026 at 16:47, Ashutosh Bapat
<[email protected]> wrote:
>
> On Tue, Apr 7, 2026 at 3:36 PM Ashutosh Bapat
> <[email protected]> wrote:
> >
> > On Mon, Apr 6, 2026 at 7:23 PM Ashutosh Bapat
> > <[email protected]> wrote:
> > >
> > > I have kept these two patches separate from the main patch so that I
> > > can remove them if others feel they are not worth including in the
> > > feature.
> >
> > Here are patches rebased on the latest HEAD. No conflicts just rebase.
> >
> > Here are differences from the previous patchset.
> >
> > o. There are two patches in this patchset now. a. 0001 which supports
> > resizable shared memory and is equivalent to 0001 + 0002 + 0004 + 0005
> > from the previous patchset. b. 0002 which is 0006 from the previous
> > patchset and adds support for protecting resizable shared memory
> > structures. 0003, which added diagnostics to investigate CFBot
> > failure, from the previous patchset is not required anymore since all
> > tests pass with CFBot.
> >
> > o. I have merged 0002 into 0001 from the previous patchset since with
> > that patch all platforms are green on CFBot. The resizable shared
> > memory test now uses /proc/self/smaps instead of /proc/self/status to
> > find the amount of memory allocated in the main shared memory segment
> > of PostgreSQL.
> >
> > o. Merged 0004, which supported minimum_size, into 0001. Minimum_size
> > would be useful to protect against accidental shrinkage of the
> > resizable structures. It will help additional support for minimum
> > sizes of GUCs like shared_buffers. It also makes it easy and intuitive
> > to distinguish between fixed-size and resizable structures, and will
> > be useful to find the minimum size of the shared memory segment.
I was thinking more along the lines of the attached (incremental) patch
0003 for min/max sizing. I'd say it has a slightly more natural API,
but YMMV.
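
To illustrate the defaulting rule that 0003 applies to the min/max
fields, here is a minimal standalone sketch. SketchOpts,
sketch_normalize and SKETCH_RESIZE_TO_ZERO are names made up for this
illustration; the sentinel's value and the unsigned field types are
assumptions, not what the patch defines in shmem.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for SHMEM_RESIZE_TO_ZERO. The real value is not
 * shown in this thread; SIZE_MAX is only an assumption for the sketch.
 */
#define SKETCH_RESIZE_TO_ZERO SIZE_MAX

/* Hypothetical, trimmed-down copy of the request options. */
typedef struct SketchOpts
{
	size_t		size;			/* initial size */
	size_t		minimum_size;	/* 0 means "not specified" */
	size_t		maximum_size;	/* 0 means "not specified" */
} SketchOpts;

/*
 * Replace unspecified bounds with .size and translate the sentinel into
 * a real zero. Returns true if the structure ends up resizable, i.e. if
 * the two bounds differ after normalization.
 */
static bool
sketch_normalize(SketchOpts *opts)
{
	if (opts->maximum_size == 0)
		opts->maximum_size = opts->size;

	if (opts->minimum_size == 0)
		opts->minimum_size = opts->size;
	else if (opts->minimum_size == SKETCH_RESIZE_TO_ZERO)
		opts->minimum_size = 0;

	return opts->maximum_size != opts->minimum_size;
}
```

With this shape a request that sets only .size normalizes to
minimum_size = maximum_size = size and is treated as fixed-size, so
later bound checks can read the two fields without falling back to
.size.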
-----
I also noticed that it's probably not correct to "just" check and
complain about the size of a resizable shmem segment when you attach.
Without coordination about which startup size the shmem segment should
have, how could an attaching backend know the current size? And
cross-process coordination of size information before shmem is
attached is not really possible, especially when you may have to deal
with a backend that is very slow to start.
----
Also attached is 0004, which makes some small adjustments to shmem.c's
resize checks.
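
The deduplication in 0004 relies on the preprocessor substituting a
macro parameter wherever that token appears in the body, including
directly after "->". A minimal standalone sketch of the same trick
follows; SketchSizes, SKETCH_UNKNOWN_SIZE, SKETCH_FIELD_MATCHES and
sketch_sizes_match are hypothetical names, and the sentinel value is an
assumption rather than the real SHMEM_ATTACH_UNKNOWN_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the three size fields being compared. */
typedef struct SketchSizes
{
	size_t		size;
	size_t		minimum_size;
	size_t		maximum_size;
} SketchSizes;

/* Assumed sentinel meaning "caller does not know this value". */
#define SKETCH_UNKNOWN_SIZE ((size_t) -1)

/*
 * 'field' is a macro parameter, so every 'field' token in the body is
 * replaced by the argument, including the member name after '->'.
 * One macro therefore serves all three members.
 */
#define SKETCH_FIELD_MATCHES(existing, requested, field) \
	((requested)->field == SKETCH_UNKNOWN_SIZE || \
	 (existing)->field == (requested)->field)

/* True if every field the requester knows agrees with the existing entry. */
static bool
sketch_sizes_match(const SketchSizes *existing, const SketchSizes *requested)
{
	return SKETCH_FIELD_MATCHES(existing, requested, size) &&
		SKETCH_FIELD_MATCHES(existing, requested, minimum_size) &&
		SKETCH_FIELD_MATCHES(existing, requested, maximum_size);
}
```

The sketch returns a boolean instead of raising an error per field, which
keeps it runnable outside the backend; the patch itself reports the
mismatching field name via CppAsString().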
Kind regards,
Matthias van de Meent
Databricks (https://www.databricks.com)
Attachments:
[application/octet-stream] v20260407-0004-Some-check-simplification-deduplication.nocfbot.patch (2.8K, 2-v20260407-0004-Some-check-simplification-deduplication.nocfbot.patch)
From 5668cae79da67fee2eab6c5ce480fed521446680 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <[email protected]>
Date: Tue, 7 Apr 2026 21:21:57 +0200
Subject: [PATCH v20260407 4/4] Some check simplification/deduplication.
---
src/backend/storage/ipc/shmem.c | 60 ++++++++++-----------------------
1 file changed, 17 insertions(+), 43 deletions(-)
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 12de6344cac..540a51b50e5 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -704,49 +704,23 @@ AttachShmemIndexEntry(ShmemRequest *request, bool missing_ok)
return false;
}
- /* Check that the sizes in the index match the request. */
- if (index_entry->size != request->options->size &&
- request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
- {
- ereport(ERROR,
- (errmsg("shared memory struct \"%s\" was created with"
- " different size: existing %zu, requested %zu",
- name, index_entry->size, request->options->size)));
- }
-
- /*
- * For resizable structures, also check that minimum_size and maximum_size
- * match. For fixed-size structures, these are derived (set to size) in
- * the index entry and not meaningful in the request.
- */
- if (request->options->maximum_size != 0)
- {
- if (index_entry->minimum_size != request->options->minimum_size &&
- request->options->minimum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
- {
- ereport(ERROR,
- (errmsg("shared memory struct \"%s\" was created with"
- " different minimum_size: existing %zu, requested %zu",
- name, index_entry->minimum_size,
- request->options->minimum_size)));
- }
-
- if (index_entry->maximum_size != request->options->maximum_size &&
- request->options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
- {
- ereport(ERROR,
- (errmsg("shared memory struct \"%s\" was created with"
- " different maximum_size: existing %zu, requested %zu",
- name, index_entry->maximum_size,
- request->options->maximum_size)));
- }
- }
- else
- {
- if (index_entry->minimum_size != index_entry->maximum_size)
- elog(ERROR, "shared memory struct \"%s\" was created as resizable, but requested as fixed-size",
- name);
- }
+#define CHECK_SIZE(size) \
+do { \
+ /* Check that the sizes in the index match the request. */ \
+ if (request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE && \
+ index_entry->size != request->options->size) \
+ { \
+ ereport(ERROR, \
+ (errmsg("shared memory struct \"%s\" was created with" \
+ " different %s: existing %zu, requested %zu", \
+ name, CppAsString(size), index_entry->size, \
+ request->options->size))); \
+ } \
+} while (false)
+
+ CHECK_SIZE(size);
+ CHECK_SIZE(minimum_size);
+ CHECK_SIZE(maximum_size);
/*
* Re-establish the caller's pointer variable, or do other actions to
--
2.50.1 (Apple Git-155)
[application/octet-stream] v20260407-0003-Various-adjustments-to-resizable-shmem-acc.nocfbot.patch (10.8K, 3-v20260407-0003-Various-adjustments-to-resizable-shmem-acc.nocfbot.patch)
From 62d7a2a841494c12d1814eae5ab5788dfeb9d5c1 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <[email protected]>
Date: Tue, 7 Apr 2026 20:59:36 +0200
Subject: [PATCH v20260407 3/4] Various adjustments to resizable shmem
accounting
Instead of continuously falling back onto .size when .maximum_size
is 0, update the copied(!) options' .maximum_size with the value
of .size. This saves a compare operation per SHMEM_REQUEST_SPACE_SIZE()
evaluation, and simplifies which field to access for bound checks.
Similarly, a .minimum_size of 0 is taken to mean "default to .size"
and is updated accordingly. A sentinel value, SHMEM_RESIZE_TO_ZERO,
stands in for a literal zero so that users can still shrink the
segment's memory usage to zero bytes.
It also reorders and deduplicates some error condition checks.
---
src/backend/storage/ipc/shmem.c | 111 ++++++++++++++++++--------------
src/include/storage/shmem.h | 20 +++---
2 files changed, 75 insertions(+), 56 deletions(-)
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 61808c7a8e5..12de6344cac 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -108,14 +108,19 @@
* In order to allocate resizable shared memory structures, set
* ShmemRequestStructOpts::maximum_size to the maximum size that the structure
* can grow to. The address space for the maximum size will be reserved at
- * startup, but memory is allocated or freed as the structure grows or shrinks
- * respectively. ShmemRequestStructOpts::size should be set to the initial size
- * of the structure, which is the amount of memory allocated at the startup.
- * Optionally, ShmemRequestStructOpts::minimum_size can be set to the minimum
- * size that the structure can shrink to. After startup, the structure can be
- * resized by calling ShmemResizeStruct() by passing it the ShmemStructDesc for
- * the structure and the new size. ShmemResizeStruct() enforces that the new
- * size is within [minimum_size, maximum_size].
+ * startup, whilst the backing memory is allocated or freed as the structure
+ * grows or shrinks respectively. ShmemRequestStructOpts::size should be set
+ * to the initial size of the structure at startup. Optionally,
+ * ShmemRequestStructOpts::minimum_size can be set to the minimum size that
+ * the structure can shrink to. A sentinel value of SHMEM_RESIZE_TO_ZERO can
+ * be used to indicate the struct can scale its memory usage down to 0
+ * bytes; the natural 0 is assumed to be uninitialized and so will cause the
+ * minimum size to default to the value of ShmemRequestStructOpts::size, like
+ * ShmemRequestStructOpts::maximum_size.
+ * After startup, the structure can be resized by calling ShmemResizeStruct()
+ * by passing it the ShmemStructDesc for the structure and the new size.
+ * ShmemResizeStruct() enforces that the new size is within
+ * [minimum_size, maximum_size].
*
* While resizable structures can be created after the startup, the memory
* available for them is quite limited.
@@ -192,8 +197,7 @@ typedef struct
* resizable shmem, the maximum_size is ensured to be 0 i.e. all the structures
* are treated as fixed-size structures.
*/
-#define SHMEM_REQUEST_SPACE_SIZE(request) \
- ((request)->options->maximum_size > 0 ? (request)->options->maximum_size : (request)->options->size)
+#define SHMEM_REQUEST_SPACE_SIZE(request) ((request)->options->maximum_size)
static List *pending_shmem_requests;
@@ -371,70 +375,81 @@ ShmemRequestInternal(ShmemStructOpts *options, ShmemRequestKind kind)
{
ShmemRequest *request;
+ /* Check that we're in the right state */
+ if (shmem_request_state != SRS_REQUESTING)
+ elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
+
/* Check the options */
if (options->name == NULL)
elog(ERROR, "shared memory request is missing 'name' option");
+ /*
+ * Sanitize the options input by populating min/max with their actual values.
+ *
+ * Note that minimum_size's zero-initialized value "not specified" conflicts
+ * with a natural value for "resize to zero", so "resize to zero" has its own
+ * sentinel value with SHMEM_RESIZE_TO_ZERO.
+ */
+ if (options->maximum_size == 0)
+ {
+ options->maximum_size = options->size;
+ Assert(options->minimum_size == 0);
+ }
+
+ if (options->minimum_size == 0)
+ options->minimum_size = options->size;
+ else if (options->minimum_size == SHMEM_RESIZE_TO_ZERO)
+ options->minimum_size = 0;
+
+ /* resizing shmem segment */
+ if (options->maximum_size != options->minimum_size)
+ {
#ifndef HAVE_RESIZABLE_SHMEM
- if (options->maximum_size > 0)
elog(ERROR, "resizable shared memory is not supported on this platform");
#else
- if (options->maximum_size > 0 && shared_memory_type != SHMEM_TYPE_MMAP)
- elog(ERROR, "resizable shared memory requires shared_memory_type = mmap");
+ if (shared_memory_type != SHMEM_TYPE_MMAP)
+ elog(ERROR, "resizable shared memory requires shared_memory_type = mmap");
#endif
-
- if (IsUnderPostmaster)
- {
- if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
- elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
- options->size, options->name);
- if (options->minimum_size < 0 && options->minimum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
- elog(ERROR, "invalid minimum_size %zd for shared memory request for \"%s\"",
- options->minimum_size, options->name);
- if (options->maximum_size < 0 && options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
- elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
- options->maximum_size, options->name);
}
- else
+
+ if (!IsUnderPostmaster)
{
if (options->size == SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
- if (options->size <= 0)
- elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
- options->size, options->name);
if (options->minimum_size == SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
- if (options->minimum_size < 0)
- elog(ERROR, "invalid minimum_size %zd for shared memory request for \"%s\"",
- options->minimum_size, options->name);
if (options->maximum_size == SHMEM_ATTACH_UNKNOWN_SIZE)
elog(ERROR, "SHMEM_ATTACH_UNKNOWN_SIZE cannot be used during startup");
- if (options->maximum_size < 0)
- elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
- options->maximum_size, options->name);
}
+ if (options->size <= 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid size %zd for shared memory request for \"%s\"",
+ options->size, options->name);
+ if (options->minimum_size < 0 && options->minimum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid minimum_size %zd for shared memory request for \"%s\"",
+ options->minimum_size, options->name);
+ if (options->maximum_size < 0 && options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE)
+ elog(ERROR, "invalid maximum_size %zd for shared memory request for \"%s\"",
+ options->maximum_size, options->name);
+
if (options->alignment != 0 && pg_nextpower2_size_t(options->alignment) != options->alignment)
elog(ERROR, "invalid alignment %zu for shared memory request for \"%s\"",
options->alignment, options->name);
- if (options->minimum_size > 0 && options->size != SHMEM_ATTACH_UNKNOWN_SIZE &&
- options->minimum_size > options->size)
- elog(ERROR, "resizable shared memory structure \"%s\" should have minimum size (%zd) less than or equal to size (%zd)",
- options->name, options->minimum_size, options->size);
-
- if (options->maximum_size > 0 && options->size > options->maximum_size)
- elog(ERROR, "resizable shared memory structure \"%s\" should have maximum size (%zd) greater than size (%zd)",
- options->name, options->maximum_size, options->size);
-
- if (options->minimum_size > 0 && options->maximum_size > 0 &&
+ if (options->minimum_size != SHMEM_ATTACH_UNKNOWN_SIZE && options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE &&
options->minimum_size > options->maximum_size)
elog(ERROR, "resizable shared memory structure \"%s\" should have minimum size (%zd) less than or equal to maximum size (%zd)",
options->name, options->minimum_size, options->maximum_size);
- /* Check that we're in the right state */
- if (shmem_request_state != SRS_REQUESTING)
- elog(ERROR, "ShmemRequestStruct can only be called from a shmem_request callback");
+ if (options->minimum_size != SHMEM_ATTACH_UNKNOWN_SIZE && options->size != SHMEM_ATTACH_UNKNOWN_SIZE &&
+ options->minimum_size > options->size)
+ elog(ERROR, "resizable shared memory structure \"%s\" should have minimum size (%zd) less than or equal to size (%zd)",
+ options->name, options->minimum_size, options->size);
+
+ if (options->maximum_size != SHMEM_ATTACH_UNKNOWN_SIZE && options->size != SHMEM_ATTACH_UNKNOWN_SIZE &&
+ options->size > options->maximum_size)
+ elog(ERROR, "resizable shared memory structure \"%s\" should have size (%zd) less than or equal to maximum size (%zd)",
+ options->name, options->size, options->maximum_size);
/* Check that it's not already registered in this process */
foreach_ptr(ShmemRequest, existing, pending_shmem_requests)
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index f8ddb0dd7c0..2682704b3ef 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -58,18 +58,21 @@ typedef struct ShmemStructOpts
size_t alignment;
/*
- * Minimum size this structure can shrink to. Should be set to 0 for
- * fixed-size structures.
+ * Minimum size this structure can shrink to.
+ * When initialized to 0, it defaults to the value of the .size field; use
+ * SHMEM_RESIZE_TO_ZERO instead if you want your shmem allocation to be
+ * able to shrink to 'no memory usage'.
*/
ssize_t minimum_size;
/*
- * Maximum size this structure can grow upto in future. The memory is not
- * allocated right away but the corresponding address space is reserved so
- * that memory can be mapped to it when the structure grows. Typically
- * should be used for large resizable structures which need several pages
- * worth of contiguous memory. Should be set to 0 for fixed-size
- * structures.
+ * Maximum size this structure can grow up to in the future. The memory not
+ * required for .size is not allocated right away, but the corresponding
+ * address space is reserved so that memory can be mapped to it when the
+ * structure grows. Typically, this should be used for large resizable
+ * structures which need several pages worth of contiguous memory.
+ *
+ * When set to 0, it defaults to the value of the .size field.
*/
ssize_t maximum_size;
@@ -83,6 +86,7 @@ typedef struct ShmemStructOpts
} ShmemStructOpts;
#define SHMEM_ATTACH_UNKNOWN_SIZE (-1)
+#define SHMEM_RESIZE_TO_ZERO (-2)
/*
* Options for ShmemRequestHash()
--
2.50.1 (Apple Git-155)
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-07 19:48 Heikki Linnakangas <[email protected]>
parent: Ashutosh Bapat <[email protected]>
1 sibling, 1 reply; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-07 19:48 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 07/04/2026 17:46, Ashutosh Bapat wrote:
> Here are patches with the test modules merged.
>
> The merged module looks a bit rough to me and so does 0006. For
> example, I am not sure whether calling ShmemStructProtect() from
> init_fn is a good idea. See [1] for example. But init_fn is the last
> chance for the subsystem to touch and setup the resizable structure
> before it's opened to the wild. So, in the current infrastructure, I
> don't see any better place to call ShmemStructProtect() either. If you
> run tests after applying patch 0006, you will need to apply patch
> attached to [1] as well; otherwise the test will hang.
I haven't really looked at these resizeable patches before, except for
how they would fit with the new shmem allocation API, so I have some
very basic, high-level design questions:
> +/*
> + * ShmemResizeStruct() --- resize a resizable shared memory structure.
> + *
> + * The new size must be within [minimum_size, maximum_size]. If the structure
> + * is being shrunk, the memory pages that are no longer needed are freed. If
> + * the structure is being expanded, the memory pages that are needed for the
> + * new size are allocated. See EstimateAllocatedSize() for explanation of which
> + * pages are allocated for a resizable structure.
> + */
> +void
> +ShmemResizeStruct(const char *name, Size new_size)
This interface only allows shrinking and growing the allocated region at
the end, but the underlying mechanism is madvise(MADV_REMOVE) and
madvise(MADV_POPULATE_WRITE), which also supports "punching holes", i.e.
freeing memory in the middle of a region. Do we gain anything by
restricting ourselves to changing the size at the end? It seems to me
that it could be handy to punch holes for some use cases.
What's the portability story? I understand that this is Linux-only at
the moment, but what platforms can we support in the future, and what's
the effort? I think BSD's have similar capabilities with plain mmap()
and MADV_FREE if I read the man pages right. What about macOS and
Windows? This doesn't necessarily need to be fully portable, if some
OS's don't have the capabilities we need, but would be nice to know
what's possible.
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-07 20:09 Andres Freund <[email protected]>
parent: Heikki Linnakangas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Andres Freund @ 2026-04-07 20:09 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Matthias van de Meent <[email protected]>; Robert Haas <[email protected]>; pgsql-hackers; [email protected]
Hi,
On 2026-04-07 22:48:17 +0300, Heikki Linnakangas wrote:
> > +/*
> > + * ShmemResizeStruct() --- resize a resizable shared memory structure.
> > + *
> > + * The new size must be within [minimum_size, maximum_size]. If the structure
> > + * is being shrunk, the memory pages that are no longer needed are freed. If
> > + * the structure is being expanded, the memory pages that are needed for the
> > + * new size are allocated. See EstimateAllocatedSize() for explanation of which
> > + * pages are allocated for a resizable structure.
> > + */
> > +void
> > +ShmemResizeStruct(const char *name, Size new_size)
>
> This interface only allows shrinking and growing the allocated region at the
> end, but the underlying mechanism is madvise(MADV_REMOVE) and
> madvise(MADV_POPULATE_WRITE), which also supports "punching holes", i.e.
> freeing memory in the middle of a region. Do we gain anything by restricting
> ourselves to changing the size at the end? It seems to me that it could be
> handy to punch holes for some use cases.
Agreed. The hard part may be the "communication" with the user about how
granular the punches can be. Because that will depend on things like
huge_pages, huge_page_size and may depend on what alignment you happened to
get.
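To illustrate the granularity concern, here is a minimal sketch (not taken from the patch) of how the releasable part of a byte range shrinks once it has to be rounded to page boundaries, as madvise() requires. The function name demo_punchable_range() and its parameters are hypothetical; page_size stands for whatever the effective page size turns out to be (regular or huge).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given a byte range [start, start + len) and a
 * power-of-two page size, compute the largest page-aligned subrange that
 * lies entirely inside it. Returns false if the range does not cover even
 * one whole page, in which case nothing can be released.
 */
static bool
demo_punchable_range(uintptr_t start, size_t len, size_t page_size,
                     uintptr_t *aligned_start, size_t *aligned_len)
{
    uintptr_t mask;
    uintptr_t first;
    uintptr_t last;

    /* page_size must be a non-zero power of two */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    mask = ~((uintptr_t) page_size - 1);
    first = (start + page_size - 1) & mask; /* round start up */
    last = (start + len) & mask;            /* round end down */

    if (last <= first)
        return false;

    *aligned_start = first;
    *aligned_len = (size_t) (last - first);
    return true;
}
```

With 4 kB pages a 10000-byte range starting at offset 1000 yields one releasable page; with 2 MB huge pages the same range yields nothing, which is the kind of surprise a caller would need to be told about.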
> What's the portability story? I understand that this is Linux-only at the
> moment, but what platforms can we support in the future, and what's the
> effort? I think BSD's have similar capabilities with plain mmap() and
> MADV_FREE if I read the man pages right.
At least linux' MADV_FREE is only for private mappings. It's not clear in at
least freebsd's man page, but the described use case makes me suspect it may
be similar there.
> What about macOS and Windows? This doesn't necessarily need to be fully
> portable, if some OS's don't have the capabilities we need, but would be
> nice to know what's possible.
Looks like windows has OfferVirtualMemory
https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-offervirtualmemory
but it's not clear to me if it actually does what we need when multiple
processes are attached.
I suspect it's going to be a lot easier once we're threaded... The reason I
am ok with doing resizing this way before threading is because it's
architecturally pretty similar to what you'd want to do once threaded, so it's
not a huge dead end. But I'm doubtful we'll find facilities that allow this
across processes in all operating systems...
Greetings,
Andres Freund
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-08 03:49 Ashutosh Bapat <[email protected]>
parent: Matthias van de Meent <[email protected]>
0 siblings, 0 replies; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-08 03:49 UTC (permalink / raw)
To: Matthias van de Meent <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Wed, Apr 8, 2026 at 1:08 AM Matthias van de Meent
<[email protected]> wrote:
>
> On Tue, 7 Apr 2026 at 16:47, Ashutosh Bapat
> <[email protected]> wrote:
> >
> > On Tue, Apr 7, 2026 at 3:36 PM Ashutosh Bapat
> > <[email protected]> wrote:
> > >
> > > On Mon, Apr 6, 2026 at 7:23 PM Ashutosh Bapat
> > > <[email protected]> wrote:
> > > >
> > > > I have kept these two patches separate from the main patch so that I
> > > > can remove them if others feel they are not worth including in the
> > > > feature.
> > >
> > > Here are patches rebased on the latest HEAD. No conflicts just rebase.
> > >
> > > Here are differences from the previous patchset.
> > >
> > > o. There are two patches in this patchset now. a. 0001 which supports
> > > resizable shared memory and is equivalent to 0001 + 0002 + 0004 + 0005
> > > from the previous patchset. b. 0002 which is 0006 from the previous
> > > patchset and adds support for protecting resizable shared memory
> > > structures. 0003, which added diagnostics to investigate CFBot
> > > failure, from the previous patchset is not required anymore since all
> > > tests pass with CFBot.
> > >
> > > o. I have merged 0002 into 0001 from the previous patchset since with
> > > that patch all platforms are green on CFBot. The resizable shared
> > > memory test now uses /proc/self/smaps instead of /proc/self/status to
> > > find the amount of memory allocated in the main shared memory segment
> > > of PostgreSQL.
> > >
> > > o. Merged 0004, which supported minimum_size, into 0001. Minimum_size
> > > would be useful to protect against accidental shrinkage of the
> > > resizable structures. It will help additional support for minimum
> > > sizes of GUCs like shared_buffers. It also makes it easy and intuitive
> > > to distinguish between fixed-size and resizable structures, and will
> > > be useful to find the minimum size of the shared memory segment.
>
> I was thinking more along the lines of attached (incremental) patch
> 0003 for min/max sizing. I'd say it has a slightly more natural API,
> but YMMV.
>
Thanks for the proposal. There are some advantages and disadvantages
of that approach. Let me explain.
minimum_size = 0 seems more straightforward to me compared to the
introduction of SHMEM_RESIZE_TO_ZERO - a value other than 0 to mean 0.
That's confusing.
The thought of modifying the options in place did cross my mind and I
started going that route. But I soon realized that a. options is a
caller-owned structure which is not designed to be scribbled upon, b. it
is saved as a request and used later. By scribbling upon it, we lose the
intent of the original request, thus the saved request may be susceptible
to a different interpretation in the future. I would like to avoid
scribbling as much as possible. The code after scribbling doesn't look
materially better than before.
I like the error handling refactoring, but need to pay close attention
to the details. I tried something similar that didn't work in all the
cases. I will try your changes in the next version.
0004 actually changes the error message we throw when the request is
opposite of the existing structure. Is that intentional? But I guess
some of it can be absorbed to simplify the code here. The macro
definition is confusing.
+#define CHECK_SIZE(size) \
+do { \
+ /* Check that the sizes in the index match the request. */ \
+ if (request->options->size != SHMEM_ATTACH_UNKNOWN_SIZE && \
+ index_entry->size != request->options->size) \
+ { \
+ ereport(ERROR, \
+ (errmsg("shared memory struct \"%s\" was created with" \
+ " different %s: existing %zu, requested %zu", \
+ name, CppAsString(size), index_entry->size, \
+ request->options->size))); \
+ } \
+} while (false)
Ideally size here should be in parentheses. It's easy to misread
request->options->size as the literal .size field when it actually
means request->options->{maximum/minimum}_size. Is that right?
Possibly a static inline function where we pass corresponding members
of request->options and index_entry?
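Something along these lines is what I have in mind; this is only a sketch with hypothetical names. demo_check_size() takes the two values to compare plus the field name for the message, and writes the message into a buffer as a stand-in for ereport(), so that it compiles outside the backend. DEMO_ATTACH_UNKNOWN_SIZE mirrors SHMEM_ATTACH_UNKNOWN_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>          /* ssize_t */

#define DEMO_ATTACH_UNKNOWN_SIZE (-1)   /* mirrors SHMEM_ATTACH_UNKNOWN_SIZE */

/*
 * Hypothetical replacement for the CHECK_SIZE macro. The caller passes the
 * matching pair of fields explicitly, so there is no token substitution to
 * misread. Returns true if the request is compatible with the existing
 * entry; otherwise fills errbuf with a message and returns false.
 */
static inline bool
demo_check_size(const char *field, ssize_t requested, size_t existing,
                char *errbuf, size_t errlen)
{
    assert(field != NULL && errbuf != NULL && errlen > 0);

    /* The attacher does not know the size: nothing to compare. */
    if (requested == DEMO_ATTACH_UNKNOWN_SIZE)
        return true;

    if (requested >= 0 && (size_t) requested == existing)
        return true;

    snprintf(errbuf, errlen,
             "created with different %s: existing %zu, requested %zd",
             field, existing, requested);
    return false;
}
```

A call site would then read like demo_check_size("maximum_size", request->options->maximum_size, index_entry->maximum_size, ...), assuming the index entry carries a field of that name.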
> -----
>
> I also noticed that it's probably not correct to "just" check and
> complain about the size of a resizable shmem segment when you attach:
> Without coordination about which startup size the shmem segment should
> have, how could you get the current size state correct? And
> cross-process coordination of size information before shmem is
> attached is not really possible, not when you may have to deal with a
> very slow to start backend.
SHMEM_ATTACH_UNKNOWN_SIZE can be used there. test_shmem module already
uses it that way.
--
Best Wishes,
Ashutosh Bapat
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-08 05:20 Ashutosh Bapat <[email protected]>
parent: Andres Freund <[email protected]>
0 siblings, 0 replies; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-08 05:20 UTC (permalink / raw)
To: Andres Freund <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; Matthias van de Meent <[email protected]>; Robert Haas <[email protected]>; pgsql-hackers; [email protected]
On Wed, Apr 8, 2026 at 1:39 AM Andres Freund <[email protected]> wrote:
>
> Hi,
>
> On 2026-04-07 22:48:17 +0300, Heikki Linnakangas wrote:
> > > +/*
> > > + * ShmemResizeStruct() --- resize a resizable shared memory structure.
> > > + *
> > > + * The new size must be within [minimum_size, maximum_size]. If the structure
> > > + * is being shrunk, the memory pages that are no longer needed are freed. If
> > > + * the structure is being expanded, the memory pages that are needed for the
> > > + * new size are allocated. See EstimateAllocatedSize() for explanation of which
> > > + * pages are allocated for a resizable structure.
> > > + */
> > > +void
> > > +ShmemResizeStruct(const char *name, Size new_size)
> >
> > This interface only allows shrinking and growing the allocated region at the
> > end, but the underlying mechanism is madvise(MADV_REMOVE) and
> > madvise(MADV_POPULATE_WRITE), which also supports "punching holes", i.e.
> > freeing memory in the middle of a region. Do we gain anything by restricting
> > ourselves to changing the size at the end? It seems to me that it could be
> > handy to punch holes for some use cases.
>
> Agreed. The hard part may be the "communication" with the user about how
> granular the punches can be. Because that will depend on things like
> huge_pages, huge_page_size and may depend on what alignment you happened to
> get.
>
We can extend it that way if there is a valid usecase. For now I kept
it simple for two reasons:
1. Buffer manager structures shrink and expand only at the end right
now. Longer note on the buffer lookup table later. This effort started
with buffer resizing and I didn't want to expand the scope beyond what's
needed.
2. Not all the approaches we tried to implement resizable shared
memory have the facility to free memory in the middle. Usually they
have a facility to shrink or expand at the end. If we offer the ability
to free memory in the middle based on facilities on one platform, we will
face big hurdles when supporting other platforms. I think it's better
to avoid it when it's not needed.
The buffer lookup table is fixed. It may benefit from punching holes in
the middle if we can somehow get pages' worth of free entries together
somewhere in the middle. First, it's not easy to perform such
compaction. But even if we implement compaction, we can collect those
entries at the end instead of in the middle; the current API will
still be useful.
Is there any other usecase you are envisioning? I also think that it
will be better to introduce a new
ShmemFreeStructPart()/ShmemAllocStructPart() instead of the current
ShmemResizeStruct().
>
> > What's the portability story? I understand that this is Linux-only at the
> > moment, but what platforms can we support in the future, and what's the
> > effort? I think BSD's have similar capabilities with plain mmap() and
> > MADV_FREE if I read the man pages right.
>
> At least linux' MADV_FREE is only for private mappings. It's not clear in at
> least freebsd's man page, but the described use case makes me suspect it may
> be similar there.
>
Looks so. FreeBSD also has fallocate with PUNCH_HOLES. We could use it
with an fd created using memfd_create(), so it will need memfd_create().
I haven't checked whether that works.
>
> > What about macOS and Windows? This doesn't necessarily need to be fully
> > portable, if some OS's don't have the capabilities we need, but would be
> > nice to know what's possible.
>
> Looks like windows has OfferVirtualMemory
> https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-offervirtualmemory
> but it's not clear to me if it actually does what we need when multiple
> processes are attached.
>
Those APIs look similar to madvise() with MADV_REMOVE/MADV_POPULATE_WRITE,
but with a more specific and cleaner interface. At least worth a try.
> I suspect it's going to be a lot easier once we're threaded... The reason I
> am ok with doing resizing this way before threading is because it's
> architecturally pretty similar to what you'd want to do once threaded, so it's
> not a huge dead end. But I'm doubtful we'll find facilities that allow this
> across processes in all operating systems...
check
--
Best Wishes,
Ashutosh Bapat
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-21 07:40 Heikki Linnakangas <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Heikki Linnakangas @ 2026-04-21 07:40 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: Dagfinn Ilmari Mannsåker <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 07/04/2026 17:19, Ashutosh Bapat wrote:
> Hi Heikki,
> CallShmemCallbacksAfterStartup() holds ShmemIndexLock while invoking
> init_fn/attach_fn callbacks. That looks wrong. Before this commit,
> init or attach code was not run with the lock held. Any reason the
> lock is held while calling init and attach callbacks. Since these
> function can come from extensions, we don't have control on what goes
> in those functions, and thus looks problematic. Further, it will
> serialize all the attach_fn executions across backends, since each
> will be run under the lock.
This was intentional, I added a note in the docs about it:
When <function>RegisterShmemCallbacks()</function> is called after
startup, it will immediately call the appropriate callbacks,
depending
on whether the requested memory areas were already initialized by
another backend. The callbacks will be called while holding an
internal
lock, which prevents concurrent two backends from initializing the
memory area concurrently.
That "internal lock" is ShmemIndexLock. I piggybacked on that since the
code needs to acquire it anyway for the hash table lookups.
With the old ShmemInitStruct() interface, extensions needed to do the
locking themselves, usually by holding AddinShmemInitLock.
(Now that I read that again, the grammar on the last sentence sounds
awkward...)
> In my case, the init_fn was performing ShmemIndex lookup which
> deadlocked. It's questionable whether init function should lookup
> ShmemIndex but, it's not something that needs to be prohibited
> either.
Yeah I'm curious what the use case is. We could easily introduce another
lock or reuse AddinShmemInitLock for this.
- Heikki
* Re: Better shared data structure management and resizable shared data structures
@ 2026-04-21 16:05 Ashutosh Bapat <[email protected]>
parent: Heikki Linnakangas <[email protected]>
0 siblings, 1 reply; 75+ messages in thread
From: Ashutosh Bapat @ 2026-04-21 16:05 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Dagfinn Ilmari Mannsåker <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Tue, Apr 21, 2026 at 1:10 PM Heikki Linnakangas <[email protected]> wrote:
>
> On 07/04/2026 17:19, Ashutosh Bapat wrote:
> > Hi Heikki,
> > CallShmemCallbacksAfterStartup() holds ShmemIndexLock while invoking
> > init_fn/attach_fn callbacks. That looks wrong. Before this commit,
> > init or attach code was not run with the lock held. Any reason the
> > lock is held while calling init and attach callbacks. Since these
> > function can come from extensions, we don't have control on what goes
> > in those functions, and thus looks problematic. Further, it will
> > serialize all the attach_fn executions across backends, since each
> > will be run under the lock.
>
> This was intentional, I added a note in the docs about it:
>
> When <function>RegisterShmemCallbacks()</function> is called after
> startup, it will immediately call the appropriate callbacks,
> depending
> on whether the requested memory areas were already initialized by
> another backend. The callbacks will be called while holding an
> internal
> lock, which prevents concurrent two backends from initializing the
> memory area concurrently.
>
> That "internal lock" is ShmemIndexLock. I piggybacked on that since the
> code needs to acquire it anyway for the hash table lookups.
>
I had read this part, but didn't realize it's ShmemIndexLock. The
document and the code are placed far apart and the comments in the
code do not help connecting these two. The comment before
LWLockAcquire() call doesn't say anything about init functions.
/* Hold ShmemIndexLock while we allocate all the shmem entries */
> With the old ShmemInitStruct() interface, extensions needed to do the
> locking themselves, usually by holding AddinShmemInitLock.
>
> (Now that I read that again, the grammar on the last sentence sounds
> awkward...)
>
Given that the init_fn is called in only one backend which requests
the structures first, do we need a lock?
> > In my case, the init_fn was performing ShmemIndex lookup which
> > deadlocked. It's questionable whether init function should lookup
> > ShmemIndex but, it's not something that needs to be prohibited
> > either.
> Yeah I'm curious what the use case is. We could easily introduce another
> lock or reuse AddinShmemInitLock for this.
>
In case of resizable shared memory structures, I was adding mprotect
to make sure that the part of the shared address space which is
reserved but not used is protected from inadvertent access. The
mprotect is wrapped in a shmem API which fetches the ShmemIndex entry
of the shared structure, figures out the part of the address space to
protect using maximum_size and current size and calls mprotect
appropriately. To fetch the ShmemIndex entry it acquires a ShmemIndex
lock. The shmem API was supposed to be called from init_fn() and
attach_fn() to protect the address spaces as soon as the structure is
attached to. See patches attached to [1] for code.
[1] https://www.postgresql.org/message-id/[email protected]...
--
Best Wishes,
Ashutosh Bapat
* Re: Better shared data structure management and resizable shared data structures
@ 2026-06-12 15:37 Heikki Linnakangas <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 2 replies; 75+ messages in thread
From: Heikki Linnakangas @ 2026-06-12 15:37 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: Dagfinn Ilmari Mannsåker <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On 21/04/2026 19:05, Ashutosh Bapat wrote:
> On Tue, Apr 21, 2026 at 1:10 PM Heikki Linnakangas <[email protected]> wrote:
>>
>> On 07/04/2026 17:19, Ashutosh Bapat wrote:
>>> Hi Heikki,
>>> CallShmemCallbacksAfterStartup() holds ShmemIndexLock while invoking
>>> init_fn/attach_fn callbacks. That looks wrong. Before this commit,
>>> init or attach code was not run with the lock held. Any reason the
>>> lock is held while calling init and attach callbacks. Since these
>>> function can come from extensions, we don't have control on what goes
>>> in those functions, and thus looks problematic. Further, it will
>>> serialize all the attach_fn executions across backends, since each
>>> will be run under the lock.
>>
>> This was intentional, I added a note in the docs about it:
>>
>> When <function>RegisterShmemCallbacks()</function> is called after
>> startup, it will immediately call the appropriate callbacks,
>> depending
>> on whether the requested memory areas were already initialized by
>> another backend. The callbacks will be called while holding an
>> internal
>> lock, which prevents concurrent two backends from initializing the
>> memory area concurrently.
>>
>> That "internal lock" is ShmemIndexLock. I piggybacked on that since the
>> code needs to acquire it anyway for the hash table lookups.
>
> I had read this part, but didn't realize it's ShmemIndexLock. The
> document and the code are placed far apart and the comments in the
> code do not help connecting these two. The comment before
> LWLockAcquire() call doesn't say anything about init functions.
> /* Hold ShmemIndexLock while we allocate all the shmem entries */
>
>> With the old ShmemInitStruct() interface, extensions needed to do the
>> locking themselves, usually by holding AddinShmemInitLock.
>>
>> (Now that I read that again, the grammar on the last sentence sounds
>> awkward...)
>
> Given that the init_fn is called in only one backend which requests
> the structures first, do we need a lock?
If two backends request the same structure concurrently, which one is
"first"? That's what the lock determines.
It's not safe to release the lock before the init callback has finished.
Otherwise, another backend might attach to the struct before it's fully
initialized and read uninitialized values.
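A minimal sketch of that rule, using a pthread mutex as a stand-in for ShmemIndexLock. Everything here (DemoArea, demo_init_fn(), demo_attach_or_init()) is hypothetical and much simpler than the real code; it only shows that the "is it already there?" check and the initializer run in the same critical section.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shmem area plus its index entry. */
typedef struct DemoArea
{
    bool        in_index;       /* entry exists in the pretend index */
    int         payload;        /* set by the init callback */
    int         init_calls;     /* how many times init ran */
} DemoArea;

static pthread_mutex_t demo_index_lock = PTHREAD_MUTEX_INITIALIZER;
static DemoArea demo_area;

/* Hypothetical init callback: must run exactly once, under the lock. */
static void
demo_init_fn(DemoArea *area)
{
    area->payload = 42;
    area->init_calls++;
}

/*
 * Create-or-attach. Whoever takes the lock first finds no entry and runs
 * the initializer before releasing the lock; anyone arriving later only
 * gets the lock after initialization is complete.
 */
static int
demo_attach_or_init(void)
{
    int         seen;

    pthread_mutex_lock(&demo_index_lock);
    if (!demo_area.in_index)
    {
        demo_area.in_index = true;  /* we are "first" */
        demo_init_fn(&demo_area);   /* still holding the lock */
    }
    seen = demo_area.payload;
    pthread_mutex_unlock(&demo_index_lock);

    return seen;
}
```

If the lock were released between setting in_index and calling demo_init_fn(), a second caller could see the entry and read payload before it was set.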
>>> In my case, the init_fn was performing ShmemIndex lookup which
>>> deadlocked. It's questionable whether init function should lookup
>>> ShmemIndex but, it's not something that needs to be prohibited
>>> either.
>> Yeah I'm curious what the use case is. We could easily introduce another
>> lock or reuse AddinShmemInitLock for this.
>
> In case of resizable shared memory structures, I was adding mprotect
> to make sure that the part of the shared address space which is
> reserved but not used is protected from inadvertent access. The
> mprotect is wrapped in a shmem API which fetches the ShmemIndex entry
> of the shared structure, figures out the part of the address space to
> protect using maximum_size and current size and calls mprotect
> appropriately. To fetch the ShmemIndex entry it acquires a ShmemIndex
> lock. The shmem API was supposed to be called from init_fn() and
> attach_fn() to protect the address spaces as soon as the structure is
> attached to. See patches attached to [1] for code.
>
> [1] https://www.postgresql.org/message-id/[email protected]...
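
For readers following along, the mprotect step described in the quoted text can be sketched roughly as follows. This is an illustration only, not the code from the referenced patches: reserve_area, round_up_to_page and protect_unused_tail are hypothetical names, the ShmemIndex lookup is left out entirely, and maximum_size is assumed to be a multiple of the OS page size.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Round sz up to the next multiple of page (page must be a power of two). */
static size_t
round_up_to_page(size_t sz, size_t page)
{
	return (sz + page - 1) & ~(page - 1);
}

/* Reserve maximum_size bytes of shared anonymous address space. */
static char *
reserve_area(size_t maximum_size)
{
	void	   *p = mmap(NULL, maximum_size, PROT_READ | PROT_WRITE,
						 MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	return (p == MAP_FAILED) ? NULL : (char *) p;
}

/*
 * Make the reserved-but-unused tail of the area inaccessible.
 *
 * mprotect() needs a page-aligned start address, so the boundary is the
 * current size rounded up to a page. Returns the number of bytes
 * protected, or -1 if mprotect() failed.
 */
static ssize_t
protect_unused_tail(char *base, size_t current_size, size_t maximum_size)
{
	size_t		page = (size_t) sysconf(_SC_PAGESIZE);
	size_t		used = round_up_to_page(current_size, page);

	assert(current_size <= maximum_size);

	if (used >= maximum_size)
		return 0;				/* nothing left to protect */
	if (mprotect(base + used, maximum_size - used, PROT_NONE) != 0)
		return -1;
	return (ssize_t) (maximum_size - used);
}
```

With a four-page reservation and a current size of one page plus one byte, the first two pages stay writable and the last two are protected.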
Ok. So if I understand correctly, holding ShmemIndexLock is not an actual
problem per se, you just didn't expect it. Right?
I propose the attached to improve the wording a little on the docs,
comments, and error message.
- Heikki
Attachments:
[text/x-patch] improve-shmem-init-callback-locking-docs.patch (1.6K, 2-improve-shmem-init-callback-locking-docs.patch)
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index bae16d7fb53..cb3cc09f16d 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3739,8 +3739,8 @@ my_shmem_init(void *arg)
startup, it will immediately call the appropriate callbacks, depending
on whether the requested memory areas were already initialized by
another backend. The callbacks will be called while holding an internal
- lock, which prevents concurrent two backends from initializing the
- memory area concurrently.
+ lock (ShmemIndexLock), which prevents the race condition of two backends
+ trying to initialize the memory area at the same time.
</para>
</sect3>
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index f1f7cd3a4ff..1fbba9c3a4c 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -918,7 +918,10 @@ CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks)
return;
}
- /* Hold ShmemIndexLock while we allocate all the shmem entries */
+ /*
+ * Hold ShmemIndexLock while we allocate all the shmem entries and run all
+ * the initializers.
+ */
LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
/*
@@ -937,7 +940,7 @@ CallShmemCallbacksAfterStartup(const ShmemCallbacks *callbacks)
notfound_any = true;
}
if (found_any && notfound_any)
- elog(ERROR, "found some but not all");
+ elog(ERROR, "some of the requested shmem areas have already been initialized");
/*
* Allocate or attach all the shmem areas requested by the request_fn
* Re: Better shared data structure management and resizable shared data structures
@ 2026-06-12 15:51 Haoyu Huang <[email protected]>
parent: Heikki Linnakangas <[email protected]>
1 sibling, 0 replies; 75+ messages in thread
From: Haoyu Huang @ 2026-06-12 15:51 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Ashutosh Bapat <[email protected]>; Dagfinn Ilmari Mannsåker <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
Hi all,
Wanted to introduce myself on this thread and share some related work —
with no intention of forking or redirecting what Ashutosh is driving here.
It was great catching up with Ashutosh, David Wein, and Heikki at PGConf
Vancouver. We had a working session on resizable shared buffers. It was
productive and a lot of fun. The outcome of the session is to surface our
work at Databricks on the same topic here.
At Databricks, we have a patch merged in our internal Postgres that enables
resizing shared_buffers without restart. It was inspired by Ashutosh's
earlier patch on this topic. Heikki and David reviewed it on our side. I'd
like to contribute the ideas (and, where useful, the code) back upstream.
I want to be explicit that the current series is the path forward. I'd much
rather plug into that than propose a competing patchset. Happy to help with
review, testing, or specific pieces wherever it's most useful.
Please read the patch as input to the existing effort, not a
counter-proposal. Thanks for all the work on this so far.
Here are the major changes we made on top of Ashutosh's earlier patch:
1. Keep one mmap anonymous segment + madvise(MADV_POPULATE_WRITE /
MADV_REMOVE) to allocate/free physical pages.
2. Renamed the shared-buffer size variables to lowNBuffers, highNBuffers, and
maxNBuffers. See the README for more details. We think that this simplifies
the code significantly.
3. Only shrinking needs a ProcSignal barrier to coordinate with all other
backends. The other cases are covered by a new AccessNBuffersLock. The
coordinator acquires AccessNBuffersLock in exclusive mode when it publishes
new buffers. Other backends acquire it in shared mode when they loop through
the buffer array based on the NBuffers value.
4. The API to resize the shared buffer is `SELECT
pg_resize_shared_buffers('new_size')`.
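
For illustration, item 1 boils down to something like the sketch below. It is a simplified standalone example, not an excerpt from the attached patch, and it assumes Linux. pool_reserve, pool_populate and pool_release are hypothetical names, offsets and lengths are assumed to be page-aligned, and the fallback define for MADV_POPULATE_WRITE only covers build hosts with older headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23	/* Linux UAPI value; kernel support since 5.14 */
#endif

/* Reserve one shared anonymous segment sized for the maximum pool. */
static char *
pool_reserve(size_t max_bytes)
{
	void	   *p = mmap(NULL, max_bytes, PROT_READ | PROT_WRITE,
						 MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	return (p == MAP_FAILED) ? NULL : (char *) p;
}

/*
 * Expand: back [off, off + len) with physical pages.
 * Returns 0 on success, otherwise an errno value.
 */
static int
pool_populate(char *base, size_t off, size_t len)
{
	size_t		page = (size_t) sysconf(_SC_PAGESIZE);

	assert(off % page == 0);

	if (madvise(base + off, len, MADV_POPULATE_WRITE) == 0)
		return 0;
	if (errno != EINVAL)
		return errno;

	/*
	 * The kernel does not know MADV_POPULATE_WRITE. Fault each page in by
	 * writing back the byte that is already there, which leaves the
	 * contents unchanged.
	 */
	for (size_t i = 0; i < len; i += page)
	{
		volatile char *c = base + off + i;

		*c = *c;
	}
	return 0;
}

/*
 * Shrink: give the physical pages behind [off, off + len) back to the
 * kernel. The address range stays mapped and reads as zeroes afterwards.
 * Returns 0 on success, otherwise an errno value.
 */
static int
pool_release(char *base, size_t off, size_t len)
{
	assert(off % (size_t) sysconf(_SC_PAGESIZE) == 0);

	return (madvise(base + off, len, MADV_REMOVE) == 0) ? 0 : errno;
}
```

Because the mapping itself never moves or changes size, pointers into the segment stay valid across a resize. Only the physical backing of the tail comes and goes.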
Thanks,
Haoyu
On Fri, Jun 12, 2026 at 8:37 AM Heikki Linnakangas <[email protected]> wrote:
> On 21/04/2026 19:05, Ashutosh Bapat wrote:
> > On Tue, Apr 21, 2026 at 1:10 PM Heikki Linnakangas <[email protected]>
> wrote:
> >>
> >> On 07/04/2026 17:19, Ashutosh Bapat wrote:
> >>> Hi Heikki,
> >>> CallShmemCallbacksAfterStartup() holds ShmemIndexLock while invoking
> >>> init_fn/attach_fn callbacks. That looks wrong. Before this commit,
> >>> init or attach code was not run with the lock held. Any reason the
> >>> lock is held while calling init and attach callbacks? Since these
> >>> functions can come from extensions, we don't have control over what goes
> >>> in those functions, and thus looks problematic. Further, it will
> >>> serialize all the attach_fn executions across backends, since each
> >>> will be run under the lock.
> >>
> >> This was intentional, I added a note in the docs about it:
> >>
> >> When <function>RegisterShmemCallbacks()</function> is called
> after
> >> startup, it will immediately call the appropriate callbacks,
> >> depending
> >> on whether the requested memory areas were already initialized
> by
> >> another backend. The callbacks will be called while holding an
> >> internal
> >> lock, which prevents concurrent two backends from initializing
> the
> >> memory area concurrently.
> >>
> >> That "internal lock" is ShmemIndexLock. I piggybacked on that since the
> >> code needs to acquire it anyway for the hash table lookups.
> >
> > I had read this part, but didn't realize it's ShmemIndexLock. The
> > document and the code are placed far apart and the comments in the
> > code do not help connecting these two. The comment before
> > LWLockAcquire() call doesn't say anything about init functions.
> > /* Hold ShmemIndexLock while we allocate all the shmem entries */
> >
> >> With the old ShmemInitStruct() interface, extensions needed to do the
> >> locking themselves, usually by holding AddinShmemInitLock.
> >>
> >> (Now that I read that again, the grammar on the last sentence sounds
> >> awkward...)
> >
> > Given that the init_fn is called in only one backend which requests
> > the structures first, do we need a lock?
>
> If two backends request the same structure concurrently, which one is
> "first"? That's what the lock determines.
>
> It's not safe to release the lock before the init callback has finished.
> Otherwise, another backend might attach to the struct before it's fully
> initialized and read uninitialized values.
>
> >>> In my case, the init_fn was performing ShmemIndex lookup which
> >>> deadlocked. It's questionable whether init function should lookup
> >>> ShmemIndex but, it's not something that needs to be prohibited
> >>> either.
> >> Yeah I'm curious what the use case is. We could easily introduce another
> >> lock or reuse AddinShmemInitLock for this.
> >
> > In case of resizable shared memory structures, I was adding mprotect
> > to make sure that the part of the shared address space which is
> > reserved but not used is protected from inadvertent access. The
> > mprotect is wrapped in a shmem API which fetches the ShmemIndex entry
> > of the shared structure, figures out the part of the address space to
> > protect using maximum_size and current size and calls mprotect
> > appropriately. To fetch the ShmemIndex entry it acquires a ShmemIndex
> > lock. The shmem API was supposed to be called from init_fn() and
> > attach_fn() to protect the address spaces as soon as the structure is
> > attached to. See patches attached to [1] for code.
> >
> > [1]
> https://www.postgresql.org/message-id/[email protected]...
>
> Ok. So if I understand correctly, holding ShmemIndexLock is not an actual
> problem per se, you just didn't expect it. Right?
>
> I propose the attached to improve the wording a little on the docs,
> comments, and error message.
>
> - Heikki
>
Attachments:
[application/octet-stream] 0001-Online-resize-of-the-shared-buffer-pool.patch (101.7K, 3-0001-Online-resize-of-the-shared-buffer-pool.patch)
From 5f439f6047df8ae9a94b27b5a94c73699ea15cfb Mon Sep 17 00:00:00 2001
From: Haoyu Huang <[email protected]>
Date: Wed, 20 May 2026 21:37:55 +0000
Subject: [PATCH] Online resize of the shared buffer pool
This patch adds pg_resize_shared_buffers(), a SQL-callable coordinator
that resizes the shared buffer pool while the server is running. The
work is composed of two halves:
1. Buffer-manager infrastructure: a two-water-mark scheme
(lowNBuffers/highNBuffers) protected by AccessNBuffersLock, a new
PROCSIGNAL_BARRIER_SHBUF_RESIZE barrier, the BEGIN/END_NBUFFERS_ACCESS
macros for safe iteration over the buffer array, and three new
primitives -- BufferManagerShmemShrink/Expand/InitBuffers -- that
either madvise(MADV_REMOVE) memory away or madvise(MADV_POPULATE_WRITE)
it back in. These changes live across buf_init.c, bufmgr.c, freelist.c
and their headers.
2. The coordinator (buf_resize.c) implements the SQL-callable function
pg_resize_shared_buffers(text) returning a (key, value, unit) record
set of timing/byte metrics. Shrink lowers lowNBuffers, runs the
PROCSIGNAL barrier, evicts the doomed range, then frees memory and
advances highNBuffers. Expand allocates and initializes new
descriptors before atomically advancing both water marks under
AccessNBuffersLock.
Two new GUCs (defined in guc_parameters.dat):
* max_shared_buffers (PGC_POSTMASTER): upper bound on highNBuffers,
sized once at startup. NBuffersGUC backs the existing
shared_buffers GUC and captures the starting pool size.
* enable_dynamic_shared_buffers (PGC_POSTMASTER): off by default; when
off, all of the new code paths are no-ops and the server behaves as
before.
Patch rebased onto upstream master from the v18-based development
branch.
---
contrib/pg_buffercache/pg_buffercache_pages.c | 158 +++--
contrib/pg_prewarm/autoprewarm.c | 6 +-
src/backend/access/hash/hash.c | 2 +-
src/backend/access/heap/heapam.c | 2 +-
src/backend/access/table/tableam.c | 2 +-
src/backend/access/transam/slru.c | 2 +-
src/backend/access/transam/xlog.c | 11 +-
src/backend/postmaster/checkpointer.c | 84 ++-
src/backend/storage/aio/aio_init.c | 2 +-
src/backend/storage/buffer/Makefile | 2 +
src/backend/storage/buffer/buf_init.c | 457 +++++++++++++--
src/backend/storage/buffer/buf_resize.c | 544 ++++++++++++++++++
src/backend/storage/buffer/bufmgr.c | 210 ++++++-
.../storage/buffer/dynamic_shared_buffers.c | 125 ++++
src/backend/storage/buffer/freelist.c | 71 ++-
src/backend/storage/buffer/meson.build | 2 +
src/backend/storage/ipc/procsignal.c | 17 +
.../utils/activity/wait_event_names.txt | 1 +
src/backend/utils/init/globals.c | 4 +-
src/backend/utils/misc/guc.c | 8 +
src/backend/utils/misc/guc_parameters.dat | 20 +-
src/include/catalog/pg_proc.dat | 9 +
src/include/miscadmin.h | 18 +-
src/include/storage/buf_internals.h | 5 +-
src/include/storage/bufmgr.h | 13 +-
src/include/storage/dynamic_shared_buffers.h | 103 ++++
src/include/storage/ipc.h | 41 ++
src/include/storage/lwlocklist.h | 1 +
src/include/storage/procsignal.h | 2 +
src/test/regress/expected/sysviews.out | 2 +-
src/test/regress/sql/sysviews.sql | 2 +-
src/tools/pgindent/typedefs.list | 1 +
32 files changed, 1763 insertions(+), 164 deletions(-)
create mode 100644 src/backend/storage/buffer/buf_resize.c
create mode 100644 src/backend/storage/buffer/dynamic_shared_buffers.c
create mode 100644 src/include/storage/dynamic_shared_buffers.h
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index bf2e6c97220..905f14e0a04 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -15,6 +15,7 @@
#include "port/pg_numa.h"
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
+#include "storage/ipc.h"
#include "utils/rel.h"
#include "utils/tuplestore.h"
@@ -88,16 +89,84 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
TupleDesc expected_tupledesc;
int i;
- /*
- * To smoothly support upgrades from version 1.0 of this extension
- * transparently handle the (non-)existence of the pinning_backends
- * column. We unfortunately have to get the result type for that... - we
- * can't use the result type determined by the function definition without
- * potentially crashing when somebody uses the old (or even wrong)
- * function definition though.
- */
- if (get_call_result_type(fcinfo, NULL, &expected_tupledesc) != TYPEFUNC_COMPOSITE)
- elog(ERROR, "return type must be a row type");
+ if (SRF_IS_FIRSTCALL())
+ {
+ int i;
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
+
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* Switch context when allocating stuff to be used in later calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ /* Create a user function context for cross-call persistence */
+ fctx = (BufferCachePagesContext *) palloc(sizeof(BufferCachePagesContext));
+
+ /*
+ * To smoothly support upgrades from version 1.0 of this extension
+ * transparently handle the (non-)existence of the pinning_backends
+ * column. We unfortunately have to get the result type for that... -
+ * we can't use the result type determined by the function definition
+ * without potentially crashing when somebody uses the old (or even
+ * wrong) function definition though.
+ */
+ if (get_call_result_type(fcinfo, NULL, &expected_tupledesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ if (expected_tupledesc->natts < NUM_BUFFERCACHE_PAGES_MIN_ELEM ||
+ expected_tupledesc->natts > NUM_BUFFERCACHE_PAGES_ELEM)
+ elog(ERROR, "incorrect number of output arguments");
+
+ /* Construct a tuple descriptor for the result rows. */
+ tupledesc = CreateTemplateTupleDesc(expected_tupledesc->natts);
+ TupleDescInitEntry(tupledesc, (AttrNumber) 1, "bufferid",
+ INT4OID, -1, 0);
+ TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupledesc, (AttrNumber) 5, "relforknumber",
+ INT2OID, -1, 0);
+ TupleDescInitEntry(tupledesc, (AttrNumber) 6, "relblocknumber",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupledesc, (AttrNumber) 7, "isdirty",
+ BOOLOID, -1, 0);
+ TupleDescInitEntry(tupledesc, (AttrNumber) 8, "usage_count",
+ INT2OID, -1, 0);
+
+ if (expected_tupledesc->natts == NUM_BUFFERCACHE_PAGES_ELEM)
+ TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
+ INT4OID, -1, 0);
+
+ fctx->tupdesc = BlessTupleDesc(tupledesc);
+
+
+ /* Allocate NBuffers worth of BufferCachePagesRec records. */
+ fctx->record = (BufferCachePagesRec *)
+ MemoryContextAllocHuge(CurrentMemoryContext,
+ sizeof(BufferCachePagesRec) * localNBuffers);
+
+ /* Set max calls and remember the user function context. */
+ funcctx->max_calls = localNBuffers;
+ funcctx->user_fctx = fctx;
+
+ /* Return to original context when allocating transient memory */
+ MemoryContextSwitchTo(oldcontext);
+
+ /*
+ * Scan through all the buffers, saving the relevant fields in the
+ * fctx->record structure.
+ *
+ * We don't hold the partition locks, so we don't get a consistent
+ * snapshot across all buffers, but we do grab the buffer header
+ * locks, so the information of each buffer is self-consistent.
+ */
+ for (i = 0; i < localNBuffers; i++)
+ {
+ BufferDesc *bufHdr;
+ uint32 buf_state;
if (expected_tupledesc->natts < NUM_BUFFERCACHE_PAGES_MIN_ELEM ||
expected_tupledesc->natts > NUM_BUFFERCACHE_PAGES_ELEM)
@@ -132,18 +201,10 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
CHECK_FOR_INTERRUPTS();
- bufHdr = GetBufferDescriptor(i);
- /* Lock each buffer header before inspecting. */
- buf_state = LockBufHdr(bufHdr);
-
- bufferid = BufferDescriptorGetBuffer(bufHdr);
- relfilenumber = BufTagGetRelNumber(&bufHdr->tag);
- reltablespace = bufHdr->tag.spcOid;
- reldatabase = bufHdr->tag.dbOid;
- forknum = BufTagGetForkNum(&bufHdr->tag);
- blocknum = bufHdr->tag.blockNum;
- usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
- pinning_backends = BUF_STATE_GET_REFCOUNT(buf_state);
+ UnlockBufHdr(bufHdr, buf_state);
+ }
+ END_NBUFFERS_ACCESS(localNBuffers);
+ }
if (buf_state & BM_DIRTY)
isdirty = true;
@@ -248,6 +309,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
int max_entries;
char *startptr,
*endptr;
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
/* If NUMA information is requested, initialize NUMA support. */
if (include_numa && pg_numa_init() == -1)
@@ -278,7 +340,24 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
*/
Assert((os_page_size % BLCKSZ == 0) || (BLCKSZ % os_page_size == 0));
- if (include_numa)
+ /*
+ * How many addresses we are going to query? Simply get the page for
+ * the first buffer, and first page after the last buffer, and count
+ * the pages from that.
+ */
+ startptr = (char *) TYPEALIGN_DOWN(os_page_size,
+ BufferGetBlock(1));
+ endptr = (char *) TYPEALIGN(os_page_size,
+ (char *) BufferGetBlock(localNBuffers) + BLCKSZ);
+ os_page_count = (endptr - startptr) / os_page_size;
+
+ /* Used to determine the NUMA node for all OS pages at once */
+ os_page_ptrs = palloc0(sizeof(void *) * os_page_count);
+ os_page_status = palloc(sizeof(int) * os_page_count);
+
+ /* Fill pointers for all the memory pages. */
+ idx = 0;
+ for (char *ptr = startptr; ptr < endptr; ptr += os_page_size)
{
void **os_page_ptrs = NULL;
@@ -315,8 +394,8 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
Assert(idx == os_page_count);
- elog(DEBUG1, "NUMA: NBuffers=%d os_page_count=" UINT64_FORMAT " "
- "os_page_size=%zu", NBuffers, os_page_count, os_page_size);
+ elog(DEBUG1, "NUMA: NBuffers=%d os_page_count=" UINT64_FORMAT " "
+ "os_page_size=%zu", localNBuffers, os_page_count, os_page_size);
/*
* If we ever get 0xff back from kernel inquiry, then we probably
@@ -366,7 +445,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
* without reallocating memory.
*/
pages_per_buffer = Max(1, BLCKSZ / os_page_size) + 1;
- max_entries = NBuffers * pages_per_buffer;
+ max_entries = localNBuffers * pages_per_buffer;
/* Allocate entries for BufferCacheOsPagesRec records. */
fctx->record = (BufferCacheOsPagesRec *)
@@ -386,10 +465,14 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
* We don't hold the partition locks, so we don't get a consistent
* snapshot across all buffers, but we do grab the buffer header
* locks, so the information of each buffer is self-consistent.
+ *
+ * This loop touches and stores addresses into os_page_ptrs[] as input
+ * to one big move_pages(2) inquiry system call. Basically we ask for
+ * all memory pages for localNBuffers.
*/
startptr = (char *) TYPEALIGN_DOWN(os_page_size, (char *) BufferGetBlock(1));
idx = 0;
- for (i = 0; i < NBuffers; i++)
+ for (i = 0; i < localNBuffers; i++)
{
char *buffptr = (char *) BufferGetBlock(i + 1);
BufferDesc *bufHdr;
@@ -440,9 +523,10 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
funcctx->max_calls = idx;
funcctx->user_fctx = fctx;
- /* Remember this backend touched the pages (only relevant for NUMA) */
- if (include_numa)
- firstNumaTouch = false;
+ /* Remember this backend touched the pages */
+ firstNumaTouch = false;
+
+ END_NBUFFERS_ACCESS(localNBuffers);
}
funcctx = SRF_PERCALL_SETUP();
@@ -531,11 +615,12 @@ pg_buffercache_summary(PG_FUNCTION_ARGS)
int32 buffers_dirty = 0;
int32 buffers_pinned = 0;
int64 usagecount_total = 0;
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
if (get_call_result_type(fcinfo, NULL, &tupledesc) != TYPEFUNC_COMPOSITE)
elog(ERROR, "return type must be a row type");
- for (int i = 0; i < NBuffers; i++)
+ for (int i = 0; i < localNBuffers; i++)
{
BufferDesc *bufHdr;
uint64 buf_state;
@@ -565,6 +650,7 @@ pg_buffercache_summary(PG_FUNCTION_ARGS)
if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
buffers_pinned++;
}
+ END_NBUFFERS_ACCESS(localNBuffers);
memset(nulls, 0, sizeof(nulls));
values[0] = Int32GetDatum(buffers_used);
@@ -593,10 +679,11 @@ pg_buffercache_usage_counts(PG_FUNCTION_ARGS)
int pinned[BM_MAX_USAGE_COUNT + 1] = {0};
Datum values[NUM_BUFFERCACHE_USAGE_COUNTS_ELEM];
bool nulls[NUM_BUFFERCACHE_USAGE_COUNTS_ELEM] = {0};
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
InitMaterializedSRF(fcinfo, 0);
- for (int i = 0; i < NBuffers; i++)
+ for (int i = 0; i < localNBuffers; i++)
{
BufferDesc *bufHdr = GetBufferDescriptor(i);
uint64 buf_state = pg_atomic_read_u64(&bufHdr->state);
@@ -613,6 +700,7 @@ pg_buffercache_usage_counts(PG_FUNCTION_ARGS)
if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
pinned[usage_count]++;
}
+ END_NBUFFERS_ACCESS(localNBuffers);
for (int i = 0; i < BM_MAX_USAGE_COUNT + 1; i++)
{
@@ -654,13 +742,15 @@ pg_buffercache_evict(PG_FUNCTION_ARGS)
Buffer buf = PG_GETARG_INT32(0);
bool buffer_flushed;
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
+ (void) localNBuffers;
if (get_call_result_type(fcinfo, NULL, &tupledesc) != TYPEFUNC_COMPOSITE)
elog(ERROR, "return type must be a row type");
pg_buffercache_superuser_check("pg_buffercache_evict");
- if (buf < 1 || buf > NBuffers)
+ if (buf < 1 || buf > GetLowNBuffers())
elog(ERROR, "bad buffer ID: %d", buf);
values[0] = BoolGetDatum(EvictUnpinnedBuffer(buf, &buffer_flushed));
@@ -669,6 +759,8 @@ pg_buffercache_evict(PG_FUNCTION_ARGS)
tuple = heap_form_tuple(tupledesc, values, nulls);
result = HeapTupleGetDatum(tuple);
+ END_NBUFFERS_ACCESS(localNBuffers);
+
PG_RETURN_DATUM(result);
}
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index ba0bc8e6d4a..b620c053dc6 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -672,6 +672,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
FILE *file;
char transient_dump_file_path[MAXPGPATH];
pid_t pid;
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
LWLockAcquire(&apw_state->lock, LW_EXCLUSIVE);
pid = apw_state->pid_using_dumpfile;
@@ -700,9 +701,9 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
* memory-efficient data structure.)
*/
block_info_array = (BlockInfoRecord *)
- palloc_extended((sizeof(BlockInfoRecord) * NBuffers), MCXT_ALLOC_HUGE);
+ palloc_extended((sizeof(BlockInfoRecord) * localNBuffers), MCXT_ALLOC_HUGE);
- for (num_blocks = 0, i = 0; i < NBuffers; i++)
+ for (num_blocks = 0, i = 0; i < localNBuffers; i++)
{
uint64 buf_state;
@@ -733,6 +734,7 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged)
UnlockBufHdr(bufHdr);
}
+ END_NBUFFERS_ACCESS(localNBuffers);
snprintf(transient_dump_file_path, MAXPGPATH, "%s.tmp", AUTOPREWARM_FILE);
file = AllocateFile(transient_dump_file_path, "w");
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 8d8cd30dc38..e43eca2bce1 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -176,7 +176,7 @@ hashbuild(Relation heap, Relation index, IndexInfo *indexInfo)
*/
sort_threshold = (maintenance_work_mem * (Size) 1024) / BLCKSZ;
if (index->rd_rel->relpersistence != RELPERSISTENCE_TEMP)
- sort_threshold = Min(sort_threshold, NBuffers);
+ sort_threshold = Min(sort_threshold, GetHighNBuffers());
else
sort_threshold = Min(sort_threshold, NLocBuffer);
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index abfd8e8970a..4d27d2b1817 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -394,7 +394,7 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock)
* if you change this, consider changing that one, too.
*/
if (!RelationUsesLocalBuffers(scan->rs_base.rs_rd) &&
- scan->rs_nblocks > NBuffers / 4)
+ scan->rs_nblocks > GetHighNBuffers() / 4)
{
allow_strat = (scan->rs_base.rs_flags & SO_ALLOW_STRAT) != 0;
allow_sync = (scan->rs_base.rs_flags & SO_ALLOW_SYNC) != 0;
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index 68ff0966f1c..6afa9176174 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -420,7 +420,7 @@ table_block_parallelscan_initialize(Relation rel, ParallelTableScanDesc pscan)
/* compare phs_syncscan initialization to similar logic in initscan */
bpscan->base.phs_syncscan = synchronize_seqscans &&
!RelationUsesLocalBuffers(rel) &&
- bpscan->phs_nblocks > NBuffers / 4;
+ bpscan->phs_nblocks > GetHighNBuffers() / 4;
SpinLockInit(&bpscan->phs_mutex);
bpscan->phs_startblock = InvalidBlockNumber;
bpscan->phs_numblock = InvalidBlockNumber;
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index 47dd52d6749..a1fc6a4f7a0 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -236,7 +236,7 @@ SimpleLruAutotuneBuffers(int divisor, int max)
{
return Min(max - (max % SLRU_BANK_SIZE),
Max(SLRU_BANK_SIZE,
- NBuffers / divisor - (NBuffers / divisor) % SLRU_BANK_SIZE));
+ GetMaxNBuffers() / divisor - (GetMaxNBuffers() / divisor) % SLRU_BANK_SIZE));
}
/*
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d34e34a56c5..cc67b27a780 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5026,7 +5026,10 @@ XLOGChooseNumBuffers(void)
{
int xbuffers;
- xbuffers = NBuffers / 32;
+ /*
+ * Use the maximum buffer pool size.
+ */
+ xbuffers = GetMaxNBuffers() / 32;
if (xbuffers > (wal_segment_size / XLOG_BLCKSZ))
xbuffers = (wal_segment_size / XLOG_BLCKSZ);
if (xbuffers < 8)
@@ -7242,7 +7245,7 @@ LogCheckpointEnd(bool restartpoint, int flags)
"estimate=%d kB; lsn=%X/%08X, redo lsn=%X/%08X",
CheckpointFlagsString(flags),
CheckpointStats.ckpt_bufs_written,
- (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
+ (double) CheckpointStats.ckpt_bufs_written * 100 / GetHighNBuffers(),
CheckpointStats.ckpt_slru_written,
CheckpointStats.ckpt_segs_added,
CheckpointStats.ckpt_segs_removed,
@@ -7267,7 +7270,7 @@ LogCheckpointEnd(bool restartpoint, int flags)
"estimate=%d kB; lsn=%X/%08X, redo lsn=%X/%08X",
CheckpointFlagsString(flags),
CheckpointStats.ckpt_bufs_written,
- (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
+ (double) CheckpointStats.ckpt_bufs_written * 100 / GetHighNBuffers(),
CheckpointStats.ckpt_slru_written,
CheckpointStats.ckpt_segs_added,
CheckpointStats.ckpt_segs_removed,
@@ -7885,7 +7888,7 @@ CreateCheckPoint(int flags)
update_checkpoint_display(flags, false, true);
TRACE_POSTGRESQL_CHECKPOINT_DONE(CheckpointStats.ckpt_bufs_written,
- NBuffers,
+ GetHighNBuffers(),
CheckpointStats.ckpt_segs_added,
CheckpointStats.ckpt_segs_removed,
CheckpointStats.ckpt_segs_recycled);
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 087120db090..bbbc6c2f95d 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -161,6 +161,15 @@ const ShmemCallbacks CheckpointerShmemCallbacks = {
/* Max number of requests the checkpointer request queue can hold */
#define MAX_CHECKPOINT_REQUESTS 10000000
+/*
+ * Queue size used under dynamic_shared_buffers. Local-file fsyncs are
+ * bypassed in ForwardSyncRequest under DSB (Neon's durability is the WAL
+ * stream), so the queue does not need to scale with the buffer pool. But
+ * we still need a real queue so SYNC_UNLINK_REQUEST is not silently
+ * dropped.
+ */
+#define DSB_CHECKPOINT_REQUESTS 4096
+
/*
* GUC parameters
*/
@@ -967,19 +976,29 @@ CheckpointerShmemRequest(void *arg)
{
Size size;
+ size = offsetof(CheckpointerShmemStruct, requests);
+
/*
- * The size of the requests[] array is arbitrarily set equal to NBuffers.
- * But there is a cap of MAX_CHECKPOINT_REQUESTS to prevent accumulating
- * too many checkpoint requests in the ring buffer.
+ * The size of the requests[] array is arbitrarily set equal to the
+ * initial size of buffer pool. But there is a cap of
+ * MAX_CHECKPOINT_REQUESTS to prevent accumulating too many checkpoint
+ * requests in the ring buffer.
+ *
+ * Under dynamic_shared_buffers we use a small fixed cap instead --
+ * sizing the queue on MaxNBuffers would waste a lot of shmem under
+ * auto-scale, but a real (non-zero) queue is still required so that
+ * SYNC_UNLINK_REQUEST can be forwarded to the checkpointer for delayed
+ * unlink processing.
*/
- size = offsetof(CheckpointerShmemStruct, requests);
- size = add_size(size, mul_size(Min(NBuffers,
- MAX_CHECKPOINT_REQUESTS),
- sizeof(CheckpointerRequest)));
- ShmemRequestStruct(.name = "Checkpointer Data",
- .size = size,
- .ptr = (void **) &CheckpointerShmem,
- );
+ if (enable_dynamic_shared_buffers)
+ size = add_size(size, mul_size(DSB_CHECKPOINT_REQUESTS,
+ sizeof(CheckpointerRequest)));
+ else
+ size = add_size(size, mul_size(Min(NBuffersGUC,
+ MAX_CHECKPOINT_REQUESTS),
+ sizeof(CheckpointerRequest)));
+
+ return size;
}
/*
@@ -1010,27 +1029,21 @@ ExecCheckpoint(ParseState *pstate, CheckPointStmt *stmt)
foreach_ptr(DefElem, opt, stmt->options)
{
- if (strcmp(opt->defname, "mode") == 0)
- {
- char *mode = defGetString(opt);
-
- if (strcmp(mode, "spread") == 0)
- fast = false;
- else if (strcmp(mode, "fast") != 0)
- ereport(ERROR,
- (errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("unrecognized value for %s option \"%s\": \"%s\"",
- "CHECKPOINT", "mode", mode),
- parser_errposition(pstate, opt->location)));
- }
- else if (strcmp(opt->defname, "flush_unlogged") == 0)
- unlogged = defGetBoolean(opt);
+ /*
+ * First time through, so initialize. Note that we zero the whole
+ * requests array; this is so that CompactCheckpointerRequestQueue can
+ * assume that any pad bytes in the request structs are zeroes.
+ */
+ MemSet(CheckpointerShmem, 0, size);
+ SpinLockInit(&CheckpointerShmem->ckpt_lck);
+
+ if (enable_dynamic_shared_buffers)
+ CheckpointerShmem->max_requests = DSB_CHECKPOINT_REQUESTS;
else
- ereport(ERROR,
- (errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("unrecognized %s option \"%s\"",
- "CHECKPOINT", opt->defname),
- parser_errposition(pstate, opt->location)));
+ CheckpointerShmem->max_requests = Min(NBuffersGUC,
+ MAX_CHECKPOINT_REQUESTS);
+ ConditionVariableInit(&CheckpointerShmem->start_cv);
+ ConditionVariableInit(&CheckpointerShmem->done_cv);
}
if (!has_privs_of_role(GetUserId(), ROLE_PG_CHECKPOINT))
@@ -1228,6 +1241,15 @@ ForwardSyncRequest(const FileTag *ftag, SyncRequestType type)
if (AmCheckpointerProcess())
elog(ERROR, "ForwardSyncRequest must not be called in checkpointer");
+ /*
+ * Queue unlinks and let the checkpointer drain them.
+ *
+ * Neon durability is provided by the WAL stream.
+ * SYNC_FORGET_REQUEST/SYNC_FILTER_REQUEST/SYNC_REQUEST can be dropped.
+ */
+ if (enable_dynamic_shared_buffers && type != SYNC_UNLINK_REQUEST)
+ return true;
+
LWLockAcquire(CheckpointerCommLock, LW_EXCLUSIVE);
/*
diff --git a/src/backend/storage/aio/aio_init.c b/src/backend/storage/aio/aio_init.c
index da30d792a88..c6b4bf975e3 100644
--- a/src/backend/storage/aio/aio_init.c
+++ b/src/backend/storage/aio/aio_init.c
@@ -109,7 +109,7 @@ AioChooseMaxConcurrency(void)
/* Similar logic to LimitAdditionalPins() */
max_backends = MaxBackends + NUM_AUXILIARY_PROCS;
- max_proportional_pins = NBuffers / max_backends;
+ max_proportional_pins = GetMaxNBuffers() / max_backends;
max_proportional_pins = Max(max_proportional_pins, 1);
diff --git a/src/backend/storage/buffer/Makefile b/src/backend/storage/buffer/Makefile
index fd7c40dcb08..2567f32e131 100644
--- a/src/backend/storage/buffer/Makefile
+++ b/src/backend/storage/buffer/Makefile
@@ -14,8 +14,10 @@ include $(top_builddir)/src/Makefile.global
OBJS = \
buf_init.o \
+ buf_resize.o \
buf_table.o \
bufmgr.o \
+ dynamic_shared_buffers.o \
freelist.o \
localbuf.o
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index 1407c930c56..757c00d03d6 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -14,12 +14,18 @@
*/
#include "postgres.h"
+#include <unistd.h>
+#ifdef __linux__
+#include <sys/mman.h>
+#endif
+
+#include "miscadmin.h"
#include "storage/aio.h"
#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
-#include "storage/proclist.h"
+#include "storage/pg_shmem.h"
#include "storage/shmem.h"
-#include "storage/subsystems.h"
+#include "utils/memdebug.h"
BufferDescPadded *BufferDescriptors;
char *BufferBlocks;
@@ -69,6 +75,208 @@ const ShmemCallbacks BufferManagerShmemCallbacks = {
* multiple times. Check the PrivateRefCount infrastructure in bufmgr.c.
*/
+/*
+ * Initialize a single buffer descriptor.
+ *
+ * Buffers are exclusively found via clock sweep (the freelist was removed
+ * in commit 2c789405275). This function is called both from
+ * BufferManagerShmemInit at boot and from BufferManagerShmemInitBuffers
+ * during an online expand.
+ */
+static void
+InitializeBuffer(int buf_id)
+{
+ BufferDesc *buf = GetBufferDescriptor(buf_id);
+
+ ClearBufferTag(&buf->tag);
+ pg_atomic_init_u32(&buf->state, 0);
+ buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
+ buf->buf_id = buf_id;
+ pgaio_wref_clear(&buf->io_wref);
+
+ LWLockInitialize(BufferDescriptorGetContentLock(buf),
+ LWTRANCHE_BUFFER_CONTENT);
+
+ ConditionVariableInit(BufferDescriptorGetIOCV(buf));
+}
+
+/*
+ * Page size used both to lay out the buffer-pool arrays in shared memory and
+ * to align the per-slice madvise() ranges issued during expand/shrink.
+ */
+static Size
+buffer_pool_madvise_alignment(void)
+{
+#ifdef __linux__
+ if (huge_pages == HUGE_PAGES_ON)
+ {
+ Size hugepagesize = 0;
+
+ GetHugePageSize(&hugepagesize, NULL);
+ if (hugepagesize > 0)
+ return hugepagesize;
+ /* Conservative fallback if /proc/meminfo lookup failed. */
+ return (Size) 2 * 1024 * 1024;
+ }
+#endif
+ return (Size) sysconf(_SC_PAGESIZE);
+}
+
+/*
+ * Return the exact byte length of the expanded range [lowNBuffers,
+ * highNBuffers) that this call touches (memset / madvise), or 0 when a no-op
+ * or on platforms where the Linux path is not used. On
+ * madvise(MADV_POPULATE_WRITE) failure, *success is set to false and 0 is
+ * returned; the new range is not guaranteed to be backed by physical memory,
+ * so callers should stop expanding rather than continue and risk a SIGBUS on
+ * first touch.
+ */
+static Size
+BufferPoolArrayPhysicalExpand(void *baseptr, Size elem_size,
+ int lowNBuffers, int highNBuffers,
+ bool *success)
+{
+#ifdef __linux__
+ char *base;
+ Size off;
+ Size len;
+ uintptr_t region_start;
+ uintptr_t region_end;
+ uintptr_t ms;
+ uintptr_t me;
+ Size os_page_size = buffer_pool_madvise_alignment();
+#endif
+
+ if (baseptr == NULL || elem_size == 0 || highNBuffers <= lowNBuffers)
+ return 0;
+
+#ifdef __linux__
+ base = (char *) baseptr;
+ Assert(os_page_size != 0);
+
+ off = mul_size((Size) lowNBuffers, elem_size);
+ len = mul_size((Size) (highNBuffers - lowNBuffers), elem_size);
+
+ region_start = (uintptr_t) (base + off);
+ region_end = region_start + len;
+
+ ms = TYPEALIGN_DOWN((Size) os_page_size, region_start);
+ me = TYPEALIGN((Size) os_page_size, region_end);
+
+#ifdef USE_VALGRIND
+ VALGRIND_MAKE_MEM_DEFINED((void *) region_start, len);
+#endif
+
+#if defined(MADV_HUGEPAGE) && defined(MADV_POPULATE_WRITE)
+#ifdef USE_ASSERT_CHECKING
+ if (mprotect((void *) ms, me - ms, PROT_READ | PROT_WRITE) < 0 && errno != ENOMEM)
+ elog(WARNING, "mprotect(PROT_READ|PROT_WRITE) before buffer pool expand: %m");
+#endif
+ /*
+ * If huge pages are in use, the MADV_HUGEPAGE advice fails, so skip it.
+ */
+ if (huge_pages_status != HUGE_PAGES_ON &&
+ madvise((void *) ms, me - ms, MADV_HUGEPAGE) < 0)
+ elog(WARNING, "madvise(MADV_HUGEPAGE) on expanded buffer pool array: %m");
+
+ if (madvise((void *) ms, me - ms, MADV_POPULATE_WRITE) < 0)
+ {
+ elog(WARNING, "madvise(MADV_POPULATE_WRITE) on expanded buffer pool array: %m");
+ *success = false;
+ return 0;
+ }
+#else
+ /*
+ * No MADV_POPULATE_WRITE on this platform: memset is the only way to
+ * force population. memset can't return a failure, so this path always
+ * "succeeds"; if the underlying mapping is actually unbacked the SIGBUS
+ * will hit during the memset itself.
+ */
+ memset((void *) region_start, 0, len);
+#endif
+
+ return len;
+#else
+ return 0; /* no local physical work off Linux */
+#endif
+}
+
+/*
+ * Return the exact byte length passed to a successful MADV_REMOVE, or 0 if no
+ * page-aligned run was freed (no-op case). On madvise() failure, *success is
+ * set to false and 0 is returned.
+ *
+ * The released slice is [lowNBuffers, highNBuffers); we trim physical storage
+ * for the entire inactive tail [lowNBuffers, MaxNBuffers) so that incremental
+ * shrinks don't strand page-aligned spans above highNBuffers (see comment
+ * inside).
+ */
+static Size
+BufferPoolArrayPhysicalShrink(void *baseptr, Size elem_size,
+ int lowNBuffers, int highNBuffers,
+ bool *success)
+{
+#ifdef __linux__
+ char *base;
+ Size off;
+ Size tail_len;
+ Size logical_len;
+ uintptr_t region_start;
+ uintptr_t region_end;
+ uintptr_t ms;
+ uintptr_t me;
+ Size os_page_size = buffer_pool_madvise_alignment();
+#endif
+
+ if (baseptr == NULL || elem_size == 0 || lowNBuffers >= highNBuffers)
+ return 0;
+
+#ifdef __linux__
+ base = (char *) baseptr;
+ Assert(os_page_size != 0);
+
+ off = mul_size((Size) lowNBuffers, elem_size);
+ /* See function header: tail spans up to MaxNBuffers, not highNBuffers. */
+ tail_len = mul_size((Size) (GetMaxNBuffers() - lowNBuffers), elem_size);
+ logical_len = mul_size((Size) (highNBuffers - lowNBuffers), elem_size);
+
+ region_start = (uintptr_t) (base + off);
+ region_end = region_start + tail_len;
+
+ /*
+ * MADV_REMOVE requires a page-aligned address and a multiple of the page
+ * size for length. Only full pages wholly inside the inactive tail
+ * [lowNBuffers, MaxNBuffers) can be trimmed.
+ */
+ ms = TYPEALIGN((Size) os_page_size, region_start);
+ me = TYPEALIGN_DOWN((Size) os_page_size, region_end);
+
+ if (ms >= me)
+ return 0;
+
+ if (madvise((void *) ms, me - ms, MADV_REMOVE) < 0)
+ {
+ elog(WARNING, "madvise(MADV_REMOVE) on buffer pool array tail failed: %m");
+ *success = false;
+ return 0;
+ }
+
+#ifdef USE_VALGRIND
+ VALGRIND_MAKE_MEM_NOACCESS((void *) ms, me - ms);
+#endif
+#ifdef USE_ASSERT_CHECKING
+ /*
+ * Catch stray reads/writes after shrink.
+ */
+ if (mprotect((void *) ms, me - ms, PROT_NONE) < 0 && errno != ENOMEM)
+ elog(WARNING, "mprotect(PROT_NONE) on buffer pool array tail: %m");
+#endif
+
+ return logical_len;
+#else
+ return 0; /* no local physical work off Linux */
+#endif
+}
/*
* Register shared memory area for the buffer pool.
@@ -76,26 +284,52 @@ const ShmemCallbacks BufferManagerShmemCallbacks = {
static void
BufferManagerShmemRequest(void *arg)
{
- ShmemRequestStruct(.name = "Buffer Descriptors",
- .size = NBuffers * sizeof(BufferDescPadded),
- /* Align descriptors to a cacheline boundary. */
- .alignment = PG_CACHE_LINE_SIZE,
- .ptr = (void **) &BufferDescriptors,
- );
+ bool foundBufs,
+ foundDescs,
+ foundIOCV,
+ foundBufCkpt;
+ int max_nbuffers;
+ Size os_page_size = buffer_pool_madvise_alignment();
+ Assert(os_page_size != 0);
+
+ if (enable_dynamic_shared_buffers)
+ {
+ if (MaxNBuffers == 0)
+ ereport(FATAL,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("max_shared_buffers must be set when enable_dynamic_shared_buffers is on"),
+ errhint("Set max_shared_buffers to a value at least as large as shared_buffers.")));
+ if (MaxNBuffers < NBuffersGUC)
+ ereport(FATAL,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("max_shared_buffers (%d) must be at least shared_buffers (%d) when enable_dynamic_shared_buffers is on",
+ MaxNBuffers, NBuffersGUC)));
+ }
+
+ max_nbuffers = GetMaxNBuffers();
+
+ /* Align descriptors for madvise (same granularity as buffer blocks). */
+ BufferDescriptors = (BufferDescPadded *)
+ TYPEALIGN(os_page_size,
+ ShmemInitStruct("Buffer Descriptors",
+ max_nbuffers * sizeof(BufferDescPadded) + 2 * os_page_size,
+ &foundDescs));
ShmemRequestStruct(.name = "Buffer Blocks",
.size = NBuffers * (Size) BLCKSZ,
/* Align buffer pool on IO page size boundary. */
- .alignment = PG_IO_ALIGN_SIZE,
- .ptr = (void **) &BufferBlocks,
- );
+ BufferBlocks = (char *)
+ TYPEALIGN(os_page_size,
+ ShmemInitStruct("Buffer Blocks",
+ max_nbuffers * (Size) BLCKSZ + 2 * os_page_size,
+ &foundBufs));
- ShmemRequestStruct(.name = "Buffer IO Condition Variables",
- .size = NBuffers * sizeof(ConditionVariableMinimallyPadded),
- /* Align descriptors to a cacheline boundary. */
- .alignment = PG_CACHE_LINE_SIZE,
- .ptr = (void **) &BufferIOCVArray,
- );
+ /* Align I/O condition variables for madvise. */
+ BufferIOCVArray = (ConditionVariableMinimallyPadded *)
+ TYPEALIGN(os_page_size,
+ ShmemInitStruct("Buffer IO Condition Variables",
+ max_nbuffers * sizeof(ConditionVariableMinimallyPadded) + 2 * os_page_size,
+ &foundIOCV));
/*
* The array used to sort to-be-checkpointed buffer ids is located in
@@ -104,11 +338,11 @@ BufferManagerShmemRequest(void *arg)
* the checkpointer is restarted, memory allocation failures would be
* painful.
*/
- ShmemRequestStruct(.name = "Checkpoint BufferIds",
- .size = NBuffers * sizeof(CkptSortItem),
- .ptr = (void **) &CkptBufferIds,
- );
-}
+ CkptBufferIds = (CkptSortItem *)
+ TYPEALIGN(os_page_size,
+ ShmemInitStruct("Checkpoint BufferIds",
+ max_nbuffers * sizeof(CkptSortItem) + 2 * os_page_size,
+ &foundBufCkpt));
/*
* Initialize shared buffer pool
@@ -124,30 +358,179 @@ BufferManagerShmemInit(void *arg)
*/
for (int i = 0; i < NBuffers; i++)
{
- BufferDesc *buf = GetBufferDescriptor(i);
-
- ClearBufferTag(&buf->tag);
+ int i;
- pg_atomic_init_u64(&buf->state, 0);
- buf->wait_backend_pgprocno = INVALID_PROC_NUMBER;
+ if (enable_dynamic_shared_buffers)
+ {
+ bool success = true;
- buf->buf_id = i;
+ /*
+ * Request physical memory for NBuffersGUC. A madvise failure
+ * here means we cannot eagerly populate the initial buffer
+ * pool; rather than start with possibly-unbacked memory and
+ * SIGBUS on first access, we PANIC so the postmaster fails
+ * to start cleanly.
+ */
+ BufferManagerShmemExpand(0, NBuffersGUC, &success);
+ if (!success)
+ elog(PANIC, "could not populate initial shared buffer pool: madvise(MADV_POPULATE_WRITE) failed");
+ }
- pgaio_wref_clear(&buf->io_wref);
-
- proclist_init(&buf->lock_waiters);
- ConditionVariableInit(BufferDescriptorGetIOCV(buf));
+ /*
+ * Initialize all the buffer headers for the active pool size.
+ * The clock sweep is the sole replacement mechanism, so there is
+ * no freelist to link them into.
+ */
+ for (i = 0; i < NBuffersGUC; i++)
+ InitializeBuffer(i);
}
/* Initialize per-backend file flush context */
WritebackContextInit(&BackendWritebackContext,
&backend_flush_after);
+
+ /*
+ * Initialize the DSB water marks. DSBCtrl is NULL in special contexts
+ * such as the WAL redo process, where DSB is not used.
+ */
+ if (DSBCtrl != NULL && !foundDescs)
+ {
+ pg_atomic_write_u32(&DSBCtrl->lowNBuffers, NBuffersGUC);
+ pg_atomic_write_u32(&DSBCtrl->highNBuffers, NBuffersGUC);
+ }
}
-static void
-BufferManagerShmemAttach(void *arg)
+/*
+ * BufferManagerShmemSize
+ *
+ * All buffer arrays are allocated in the single shared-memory heap. We size
+ * them to GetMaxNBuffers() so the pool fits its upper bound.
+ */
+Size
+BufferManagerShmemSize(void)
{
- /* Initialize per-backend file flush context */
- WritebackContextInit(&BackendWritebackContext,
- &backend_flush_after);
+ Size size = 0;
+ Size os_page_size = buffer_pool_madvise_alignment();
+ int max_nbuffers = GetMaxNBuffers();
+ Assert(os_page_size != 0);
+
+ /* size of buffer descriptors, plus alignment padding for madvise */
+ size = add_size(size, mul_size(max_nbuffers, sizeof(BufferDescPadded)));
+ size = add_size(size, mul_size(2, os_page_size));
+
+ /* size of data pages, plus alignment padding */
+ size = add_size(size, mul_size(2, os_page_size));
+ size = add_size(size, mul_size(max_nbuffers, (Size) BLCKSZ));
+
+ /* size of stuff controlled by freelist.c */
+ size = add_size(size, StrategyShmemSize());
+
+ /* size of I/O condition variables, plus alignment padding for madvise */
+ size = add_size(size, mul_size(max_nbuffers,
+ sizeof(ConditionVariableMinimallyPadded)));
+ size = add_size(size, mul_size(2, os_page_size));
+
+ /* size of checkpoint sort array in bufmgr.c, plus alignment padding */
+ size = add_size(size, mul_size(max_nbuffers, sizeof(CkptSortItem)));
+ size = add_size(size, mul_size(2, os_page_size));
+
+ return size;
+}
+
+/*
+ * Allocate backing memory pages from the OS for the buffer-pool slice
+ * [lowNBuffers, highNBuffers).
+ *
+ * Returns the sum, over the four buffer-pool arrays, of the exact byte length
+ * each BufferPoolArrayPhysicalExpand call touched. *success is set to true on
+ * success, or false if any madvise(MADV_POPULATE_WRITE) call failed.
+ */
+Size
+BufferManagerShmemExpand(int lowNBuffers, int highNBuffers, bool *success)
+{
+ int max_nbuffers = GetMaxNBuffers();
+ Size total = 0;
+
+ if (highNBuffers > max_nbuffers)
+ elog(PANIC, "buffer pool expand exceeds allocation (low=%d high=%d allocated=%d)",
+ lowNBuffers, highNBuffers, max_nbuffers);
+
+ Assert(lowNBuffers < highNBuffers);
+
+ *success = true;
+
+ total = add_size(total, BufferPoolArrayPhysicalExpand(BufferDescriptors, sizeof(BufferDescPadded),
+ lowNBuffers, highNBuffers, success));
+ if (!*success)
+ return total;
+
+ total = add_size(total, BufferPoolArrayPhysicalExpand(BufferBlocks, (Size) BLCKSZ,
+ lowNBuffers, highNBuffers, success));
+ if (!*success)
+ return total;
+
+ total = add_size(total, BufferPoolArrayPhysicalExpand(BufferIOCVArray,
+ sizeof(ConditionVariableMinimallyPadded),
+ lowNBuffers, highNBuffers, success));
+ if (!*success)
+ return total;
+
+ total = add_size(total, BufferPoolArrayPhysicalExpand(CkptBufferIds, sizeof(CkptSortItem),
+ lowNBuffers, highNBuffers, success));
+ return total;
+}
+
+void
+BufferManagerShmemInitBuffers(int lowNBuffers, int highNBuffers)
+{
+ int i;
+
+ /* Clock sweep will pick up the new buffers; nothing else to do. */
+ for (i = lowNBuffers; i < highNBuffers; i++)
+ InitializeBuffer(i);
+}
+
+/*
+ * Release the buffer-pool slice [lowNBuffers, highNBuffers) back to the OS
+ * when shrinking the pool.
+ *
+ * Returns the sum, over the four buffer-pool arrays, of the exact byte length
+ * each BufferPoolArrayPhysicalShrink call touched. *success is set to true on
+ * success, or false if any madvise() call failed; on failure we stop after the
+ * failing array (so total reflects only the arrays that were fully released).
+ */
+Size
+BufferManagerShmemShrink(int lowNBuffers, int highNBuffers, bool *success)
+{
+ int max_nbuffers = GetMaxNBuffers();
+ Size total = 0;
+
+ if (highNBuffers > max_nbuffers)
+ elog(PANIC, "buffer pool shrink exceeds allocation (low=%d high=%d allocated=%d)",
+ lowNBuffers, highNBuffers, max_nbuffers);
+
+ Assert(lowNBuffers < highNBuffers);
+
+ *success = true;
+
+ total = add_size(total, BufferPoolArrayPhysicalShrink(BufferDescriptors, sizeof(BufferDescPadded),
+ lowNBuffers, highNBuffers, success));
+ if (!*success)
+ return total;
+
+ total = add_size(total, BufferPoolArrayPhysicalShrink(BufferBlocks, (Size) BLCKSZ,
+ lowNBuffers, highNBuffers, success));
+ if (!*success)
+ return total;
+
+ total = add_size(total, BufferPoolArrayPhysicalShrink(BufferIOCVArray,
+ sizeof(ConditionVariableMinimallyPadded),
+ lowNBuffers, highNBuffers, success));
+ if (!*success)
+ return total;
+
+ total = add_size(total, BufferPoolArrayPhysicalShrink(CkptBufferIds, sizeof(CkptSortItem),
+ lowNBuffers, highNBuffers, success));
+
+ return total;
}
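A reading aid for the two physical helpers in buf_init.c: BufferPoolArrayPhysicalExpand rounds its byte range outward to page boundaries so that every byte of the new slice gets populated, while BufferPoolArrayPhysicalShrink rounds inward so that only whole pages inside the range are removed. The standalone sketch below shows just that arithmetic. The names align_down, align_up, PageRange, covering_pages and contained_pages are hypothetical and exist only for illustration (the patch itself uses TYPEALIGN_DOWN and TYPEALIGN), and a power-of-two page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for TYPEALIGN_DOWN / TYPEALIGN; page must be 2^n. */
static uintptr_t
align_down(uintptr_t v, uintptr_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return v & ~(page - 1);
}

static uintptr_t
align_up(uintptr_t v, uintptr_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return (v + page - 1) & ~(page - 1);
}

typedef struct PageRange
{
    uintptr_t   start;          /* inclusive, page aligned */
    uintptr_t   end;            /* exclusive, page aligned */
} PageRange;

/* Expand case: smallest page-aligned range that covers [start, end). */
static PageRange
covering_pages(uintptr_t start, uintptr_t end, uintptr_t page)
{
    PageRange   r;

    r.start = align_down(start, page);
    r.end = align_up(end, page);
    return r;
}

/*
 * Shrink case: largest page-aligned range wholly inside [start, end).
 * Returns an empty range (start == end) when no whole page fits.
 */
static PageRange
contained_pages(uintptr_t start, uintptr_t end, uintptr_t page)
{
    PageRange   r;

    r.start = align_up(start, page);
    r.end = align_down(end, page);
    if (r.start > r.end)
        r.end = r.start;
    return r;
}
```

With 4096-byte pages, a slice occupying bytes [4100, 8200) is covered by pages [4096, 12288) when expanding, but contains no whole page when shrinking. That is the case in which the shrink helper returns 0 without calling madvise().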
diff --git a/src/backend/storage/buffer/buf_resize.c b/src/backend/storage/buffer/buf_resize.c
new file mode 100644
index 00000000000..afb5268f636
--- /dev/null
+++ b/src/backend/storage/buffer/buf_resize.c
@@ -0,0 +1,544 @@
+/*-------------------------------------------------------------------------
+ *
+ * buf_resize.c
+ * Online resize coordinator for the shared buffer pool.
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/backend/storage/buffer/buf_resize.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include <math.h>
+#include <signal.h>
+
+#include "fmgr.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "portability/instr_time.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/interrupt.h"
+#include "storage/buf_internals.h"
+#include "storage/bufmgr.h"
+#include "storage/dynamic_shared_buffers.h"
+#include "storage/ipc.h"
+#include "storage/latch.h"
+#include "storage/lwlock.h"
+#include "storage/pg_shmem.h"
+#include "storage/pmsignal.h"
+#include "storage/procsignal.h"
+#include "storage/shmem.h"
+#include "utils/acl.h"
+#include "utils/builtins.h"
+#include "utils/guc.h"
+#include "utils/injection_point.h"
+
+PG_FUNCTION_INFO_V1(pg_resize_shared_buffers);
+
+/*
+ * `coordinator_active` tells the cleanup callback whether *this* backend
+ * currently holds the resize_in_progress flag.
+ *
+ * `cleanup_registered` ensures we only call before_shmem_exit() once per
+ * backend lifetime.
+ *
+ * `inflight_expand_target` is set to the requested size while DoExpand is
+ * running and reset to zero once the expand completes. The cleanup callback
+ * uses it to surface a WARNING if an expand was interrupted.
+ */
+static volatile bool coordinator_active = false;
+static volatile bool cleanup_registered = false;
+static volatile int inflight_expand_target = 0;
+
+/*
+ * Emit a (key, value, unit) tuple to the function's result set. If value_null
+ * is true, value and unit are emitted as NULL.
+ *
+ * Used to return tuples from the pg_resize_shared_buffers() function. The
+ * tupledesc of the returned rows must match the function's OUT arguments.
+ */
+static void
+EmitResizeMetricRow(ReturnSetInfo *rsinfo, const char *key, double value,
+ const char *unit, bool value_null)
+{
+ Datum values[3];
+ bool nulls[3];
+
+ values[0] = CStringGetTextDatum(key);
+ nulls[0] = false;
+ if (value_null)
+ {
+ nulls[1] = true;
+ values[1] = (Datum) 0;
+ nulls[2] = true;
+ values[2] = (Datum) 0;
+ }
+ else
+ {
+ nulls[1] = false;
+ nulls[2] = false;
+ values[1] = Float8GetDatum(value);
+ values[2] = CStringGetTextDatum(unit);
+ }
+ tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
+}
+
+/* Rounded timing row; value is in seconds, unit is "seconds". */
+static void
+EmitResizeTimeRow(ReturnSetInfo *rsinfo, const char *key, double elapsed_sec)
+{
+ double t = round(elapsed_sec * 100.0) / 100.0;
+
+ EmitResizeMetricRow(rsinfo, key, t, "seconds", false);
+}
+
+static void
+EmitResizeBytesRow(ReturnSetInfo *rsinfo, const char *key, double bytes)
+{
+ EmitResizeMetricRow(rsinfo, key, bytes, "bytes", false);
+}
+
+static void
+SharedBufferResizeBarrier(ProcSignalBarrierType barrier, const char *barrier_name)
+{
+ WaitForProcSignalBarrier(EmitProcSignalBarrier(barrier));
+ elog(LOG, "all backends acknowledged %s barrier", barrier_name);
+}
+
+/*
+ * Parse a user-supplied size string (e.g. "256MB", "32768") into a number of
+ * shared buffers. Raises ERROR on invalid input or out-of-range size.
+ */
+static int
+ParseNewSize(const char *new_size_str)
+{
+ const char *hintmsg = NULL;
+ int new_size;
+
+ if (!parse_int(new_size_str, &new_size, GUC_UNIT_BLOCKS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for shared_buffers: \"%s\"", new_size_str),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ if (new_size < MIN_SHARED_BUFFERS)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("shared_buffers must be at least %d, got %d",
+ MIN_SHARED_BUFFERS, new_size)));
+
+ if (new_size > GetMaxNBuffers())
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("shared_buffers (%d) cannot exceed max_shared_buffers (%d)",
+ new_size, GetMaxNBuffers())));
+
+ return new_size;
+}
+
+/*
+ * Sleep up to `timeout_ms` milliseconds.
+ */
+static void
+ResizeWaitMs(int timeout_ms)
+{
+ int rc;
+
+ rc = WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ timeout_ms,
+ WAIT_EVENT_PG_SLEEP);
+ if (rc & WL_LATCH_SET)
+ ResetLatch(MyLatch);
+ CHECK_FOR_INTERRUPTS();
+}
+
+/*
+ * Shrink protocol: lower lowNBuffers first to restrict allocations, evict
+ * the [low, high) range, then drop highNBuffers to lowNBuffers, and only then
+ * release the OS-level memory backing that range.
+ *
+ * - pre: lowNBuffers == highNBuffers == old_size > new_size
+ * - post (success):
+ * lowNBuffers == highNBuffers == new_size
+ * - post (interrupted before highNBuffers is lowered):
+ * lowNBuffers == new_size, highNBuffers == old_size
+ * (recoverable: ResetResizeInProgress() rolls lowNBuffers back to high)
+ * - post (madvise(MADV_REMOVE) failure during BufferManagerShmemShrink):
+ * lowNBuffers == highNBuffers == new_size. We cannot roll back the shrink.
+ *
+ * Raises ERROR on unrecoverable failure.
+ */
+static void
+DoShrink(ReturnSetInfo *rsinfo, int old_size, int new_size)
+{
+ instr_time phase_start;
+ instr_time phase_end;
+ Size mem_bytes;
+ bool shrink_success;
+
+ Assert(new_size < old_size);
+ Assert(pg_atomic_read_u32(&DSBCtrl->lowNBuffers) ==
+ pg_atomic_read_u32(&DSBCtrl->highNBuffers));
+
+ CHECK_FOR_INTERRUPTS();
+
+ /*
+ * Reset the clock-sweep cursor before lowering the low water mark. The
+ * existing cursor may point above new_size, and once the new lowNBuffers
+ * is published ClockSweepTick() would otherwise wrap it into the
+ * surviving range via modulo arithmetic at an arbitrary position.
+ * Resetting to 0 means the next sweep starts from the bottom of the
+ * surviving range.
+ */
+ StrategyReset(old_size, new_size);
+
+ elog(LOG, "[Shrink Barrier]: restricting allocations to %d buffers", new_size);
+ INSTR_TIME_SET_CURRENT(phase_start);
+ /*
+ * Wait for all backends to acknowledge the new lowNBuffers. After the
+ * barrier returns, all new buffer allocations land in the
+ * [0, lowNBuffers) range. For buffers in [lowNBuffers, highNBuffers),
+ * backends may still hold existing pins and take additional pins on
+ * buffers that are already pinned. The EvictExtraBuffers() loop below
+ * waits for all buffers in [lowNBuffers, highNBuffers) to be unpinned.
+ */
+ SharedBufferResizeBarrier(PROCSIGNAL_BARRIER_SHBUF_RESIZE, CppAsString(PROCSIGNAL_BARRIER_SHBUF_RESIZE));
+ INSTR_TIME_SET_CURRENT(phase_end);
+ INSTR_TIME_SUBTRACT(phase_end, phase_start);
+ EmitResizeTimeRow(rsinfo, "Barrier", INSTR_TIME_GET_DOUBLE(phase_end));
+ elog(LOG, "[Shrink Barrier]: Restricted allocations to %d buffers in %f seconds", new_size, INSTR_TIME_GET_DOUBLE(phase_end));
+
+ INJECTION_POINT("buf-resize-shrink-after-barrier", NULL);
+
+ /*
+ * Evict all pages in [lowNBuffers, highNBuffers).
+ */
+ elog(LOG, "[Shrink]: evicting buffers %d..%d", new_size, old_size);
+ INSTR_TIME_SET_CURRENT(phase_start);
+ {
+ instr_time last_log;
+
+ INSTR_TIME_SET_CURRENT(last_log);
+ while (!EvictExtraBuffers(new_size, old_size))
+ {
+ instr_time now;
+
+ ResizeWaitMs(100);
+
+ INSTR_TIME_SET_CURRENT(now);
+ INSTR_TIME_SUBTRACT(now, last_log);
+ if (INSTR_TIME_GET_DOUBLE(now) >= 5.0)
+ {
+ elog(LOG, "still waiting for buffers to be unpinned");
+ INSTR_TIME_SET_CURRENT(last_log);
+ }
+ }
+ }
+ INSTR_TIME_SET_CURRENT(phase_end);
+ INSTR_TIME_SUBTRACT(phase_end, phase_start);
+ EmitResizeTimeRow(rsinfo, "Buffer relocation", INSTR_TIME_GET_DOUBLE(phase_end));
+ elog(LOG, "[Shrink]: evicted %d buffers in %f seconds", old_size - new_size, INSTR_TIME_GET_DOUBLE(phase_end));
+
+ CHECK_FOR_INTERRUPTS();
+ /*
+ * All the victim buffers are now empty and won't be allocated by backends.
+ * Take AccessNBuffersLock in exclusive mode so we wait for any backend
+ * still iterating with the old highNBuffers and prevent new
+ * ones from starting until we have published the new highNBuffers.
+ */
+ INSTR_TIME_SET_CURRENT(phase_start);
+ LWLockAcquire(&DSBCtrl->AccessNBuffersLock, LW_EXCLUSIVE);
+ pg_atomic_write_u32(&DSBCtrl->highNBuffers, new_size);
+ LWLockRelease(&DSBCtrl->AccessNBuffersLock);
+
+ INJECTION_POINT("buf-resize-shrink-before-madvise", NULL);
+
+ /*
+ * Release the memory.
+ */
+ mem_bytes = BufferManagerShmemShrink(new_size, old_size, &shrink_success);
+ INSTR_TIME_SET_CURRENT(phase_end);
+ INSTR_TIME_SUBTRACT(phase_end, phase_start);
+ EmitResizeTimeRow(rsinfo, "Shrink shmem", INSTR_TIME_GET_DOUBLE(phase_end));
+ EmitResizeBytesRow(rsinfo, "Shrink shmem", (double) mem_bytes);
+ elog(LOG, "[Shrink]: released %zu bytes of memory in %f seconds", mem_bytes, INSTR_TIME_GET_DOUBLE(phase_end));
+
+ if (!shrink_success)
+ ereport(ERROR,
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg("shared_buffers shrink from %d to %d failed",
+ old_size, new_size),
+ errdetail("madvise(MADV_REMOVE) failed while releasing buffer-pool memory; the failure is not recoverable."),
+ errhint("Check the server log for the underlying madvise() error.")));
+}
+
+/*
+ * Expand protocol: allocate memory for the [old_size, new_size) range,
+ * initialize the new buffer descriptors, then publish both new lowNBuffers
+ * and highNBuffers atomically under the exclusive lock.
+ *
+ * - pre: lowNBuffers == highNBuffers == old_size < new_size
+ * - post (success):
+ * lowNBuffers == highNBuffers == new_size
+ * - post (madvise(MADV_POPULATE_WRITE) failure during BufferManagerShmemExpand):
+ * lowNBuffers == highNBuffers == old_size (water marks NOT advanced)
+ * Some bytes in [old_size, new_size) of the four buffer-pool arrays may
+ * have been allocated from the OS but never published to backends.
+ *
+ * Raises ERROR on madvise failure.
+ */
+static void
+DoExpand(ReturnSetInfo *rsinfo, int old_size, int new_size)
+{
+ instr_time phase_start;
+ instr_time phase_end;
+ Size mem_bytes;
+ bool expand_success;
+
+ Assert(new_size > old_size);
+ Assert(pg_atomic_read_u32(&DSBCtrl->lowNBuffers) ==
+ pg_atomic_read_u32(&DSBCtrl->highNBuffers));
+
+ INSTR_TIME_SET_CURRENT(phase_start);
+
+ inflight_expand_target = new_size;
+
+ /*
+ * Allocate physical memory and initialize the new buffer descriptors
+ * BEFORE acquiring AccessNBuffersLock. Backends iterating the buffer
+ * pool only look at [0, highNBuffers); since highNBuffers is still at
+ * old_size, the new range is invisible to them, so it is safe to touch
+ * without the lock.
+ */
+ mem_bytes = BufferManagerShmemExpand(old_size, new_size, &expand_success);
+ if (!expand_success)
+ {
+ INSTR_TIME_SET_CURRENT(phase_end);
+ INSTR_TIME_SUBTRACT(phase_end, phase_start);
+ EmitResizeTimeRow(rsinfo, "Expand shmem", INSTR_TIME_GET_DOUBLE(phase_end));
+ EmitResizeBytesRow(rsinfo, "Expand shmem", (double) mem_bytes);
+ ereport(ERROR,
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg("shared_buffers expand from %d to %d failed",
+ old_size, new_size),
+ errdetail("madvise(MADV_POPULATE_WRITE) failed while populating buffer-pool memory; the new range was not made visible to backends."),
+ errhint("Check the server log for the underlying madvise() error and retry.")));
+ }
+
+ BufferManagerShmemInitBuffers(old_size, new_size);
+
+ INJECTION_POINT("buf-resize-expand-before-publish", NULL);
+
+ /*
+ * Hold AccessNBuffersLock in exclusive mode while we publish the new
+ * water marks. Backends taking the lock in shared mode (e.g. via
+ * BEGIN_NBUFFERS_ACCESS) either run entirely before this critical
+ * section and see lowNBuffers == highNBuffers == old_size, or entirely
+ * after and see lowNBuffers == highNBuffers == new_size with valid
+ * memory; they never observe the partially initialized intermediate
+ * state. Concurrent atomics readers (clock sweep / freelist) may
+ * briefly see lowNBuffers < highNBuffers between the two writes below;
+ * that is fine because both bounds are now backed by initialized
+ * memory, so a clock sweep wrapping into the [old_size, new_size) range
+ * is safe.
+ */
+ LWLockAcquire(&DSBCtrl->AccessNBuffersLock, LW_EXCLUSIVE);
+ /*
+ * Reset the clock-sweep cursor to the start of the new buffers so the
+ * next clock pass tries the freshly added empty buffers before
+ * re-scanning existing ones with usage_count == 0.
+ */
+ StrategyReset(old_size, new_size);
+ LWLockRelease(&DSBCtrl->AccessNBuffersLock);
+
+ /*
+ * The expand is complete.
+ */
+ inflight_expand_target = 0;
+
+ INSTR_TIME_SET_CURRENT(phase_end);
+ INSTR_TIME_SUBTRACT(phase_end, phase_start);
+ EmitResizeTimeRow(rsinfo, "Expand shmem", INSTR_TIME_GET_DOUBLE(phase_end));
+ EmitResizeBytesRow(rsinfo, "Expand shmem", (double) mem_bytes);
+ elog(LOG, "[Expand]: expanded buffer pool memory with %zu bytes in %f seconds", mem_bytes, INSTR_TIME_GET_DOUBLE(phase_end));
+}
+
+/*
+ * Cleanup callback. Runs from the transaction-abort PG_CATCH path *and* from
+ * before_shmem_exit() if the backend dies while holding the resize slot.
+ *
+ * Rollback policy:
+ * - Partial shrink (lowNBuffers < highNBuffers): restore lowNBuffers to
+ * highNBuffers so the buffer pool is consistent at the larger size.
+ * Memory for [lowNBuffers, highNBuffers) is still mapped, so rolling
+ * back is safe.
+ * - Partial expand: BufferManagerShmemExpand() may have populated some of
+ * [old_size, inflight_expand_target) without publishing the new water
+ * marks. This is wasteful but harmless. We surface a WARNING so operators
+ * know to retry the resize.
+ */
+static void
+ResetResizeInProgress(int code, Datum arg)
+{
+ uint32 high;
+ uint32 low;
+ int expand_target;
+ bool shrink_failed = false;
+ bool expand_failed = false;
+
+ if (!coordinator_active || DSBCtrl == NULL)
+ return;
+
+ Assert(DSBCtrl->resize_in_progress);
+ Assert(DSBCtrl->coordinator_pid == MyProcPid);
+
+ coordinator_active = false;
+
+ high = pg_atomic_read_u32(&DSBCtrl->highNBuffers);
+ low = pg_atomic_read_u32(&DSBCtrl->lowNBuffers);
+ if (low < high)
+ {
+ shrink_failed = true;
+ pg_atomic_write_u32(&DSBCtrl->lowNBuffers, high);
+ }
+
+ expand_target = inflight_expand_target;
+ if (expand_target != 0)
+ {
+ expand_failed = true;
+ inflight_expand_target = 0;
+ }
+
+ ReleaseResizeCoordinator();
+
+ /*
+ * Emit user-visible warnings AFTER all critical cleanup.
+ */
+ if (shrink_failed)
+ ereport(WARNING,
+ (errmsg("shared_buffers shrink was interrupted; rolling back lowNBuffers from %u to %u",
+ (unsigned) low, (unsigned) high)));
+
+ if (expand_failed)
+ ereport(WARNING,
+ (errmsg("shared_buffers expand to %d was interrupted",
+ expand_target),
+ errdetail("Some memory in [%u, %d) may have been allocated from the OS but not made visible to backends. It will sit unused in shmem until a future successful resize re-initializes that range.",
+ (unsigned) high, expand_target)));
+}
+
+Datum
+pg_resize_shared_buffers(PG_FUNCTION_ARGS)
+{
+ instr_time total_start;
+ instr_time total_end;
+
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ int old_size;
+ int new_size;
+ char *new_size_str;
+
+ INSTR_TIME_SET_CURRENT(total_start);
+
+ if (!enable_dynamic_shared_buffers)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("shared buffer pool resizing requires enable_dynamic_shared_buffers")));
+
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to resize shared_buffers")));
+
+ if (PG_NARGS() != 1)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("pg_resize_shared_buffers requires exactly one argument (the new shared_buffers value)")));
+ if (PG_ARGISNULL(0))
+ ereport(ERROR,
+ (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+ errmsg("new_size argument to pg_resize_shared_buffers must not be NULL")));
+
+ /*
+ * Only regular client backends are expected to call this function; the
+ * Assert checks this in assert-enabled builds only.
+ */
+ Assert(MyBackendType == B_BACKEND);
+
+ /*
+ * Parse the requested size first so we fail fast on bad input before
+ * claiming the resize_in_progress flag.
+ */
+ new_size_str = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ new_size = ParseNewSize(new_size_str);
+
+ InitMaterializedSRF(fcinfo, 0);
+
+ /*
+ * Register the FATAL-exit cleanup once per backend lifetime.
+ */
+ if (!cleanup_registered)
+ {
+ before_shmem_exit(ResetResizeInProgress, (Datum) 0);
+ cleanup_registered = true;
+ }
+
+ if (!ClaimResizeCoordinator())
+ {
+ elog(LOG, "shared buffer resizing is already in progress");
+ EmitResizeMetricRow(rsinfo, "resize already in progress", 0, NULL, true);
+ return (Datum) 0;
+ }
+
+ /*
+ * Mark this backend as the local coordinator so the cleanup callback
+ * knows to release the shared slot on error / exit.
+ */
+ coordinator_active = true;
+
+ PG_TRY();
+ {
+ INJECTION_POINT("buf-resize-after-claim", NULL);
+
+ old_size = pg_atomic_read_u32(&DSBCtrl->lowNBuffers);
+ /*
+ * The highNBuffers should be equal to lowNBuffers.
+ */
+ Assert(pg_atomic_read_u32(&DSBCtrl->highNBuffers) == old_size);
+
+ if (old_size == new_size)
+ {
+ elog(LOG, "shared buffers are already at %d, no need to resize", old_size);
+ EmitResizeTimeRow(rsinfo, "No resize", 0.0);
+ }
+ else
+ {
+ elog(LOG, "resizing shared buffers from %d to %d", old_size, new_size);
+
+ if (new_size < old_size)
+ DoShrink(rsinfo, old_size, new_size);
+ else
+ DoExpand(rsinfo, old_size, new_size);
+
+ Assert(pg_atomic_read_u32(&DSBCtrl->lowNBuffers) == (uint32) new_size);
+ Assert(pg_atomic_read_u32(&DSBCtrl->highNBuffers) == (uint32) new_size);
+
+ INSTR_TIME_SET_CURRENT(total_end);
+ INSTR_TIME_SUBTRACT(total_end, total_start);
+ EmitResizeTimeRow(rsinfo, "Total Resize Time", INSTR_TIME_GET_DOUBLE(total_end));
+ elog(LOG, "successfully resized shared buffers to %d", new_size);
+ }
+ }
+ PG_CATCH();
+ {
+ ResetResizeInProgress(0, (Datum) 0);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
+
+ ResetResizeInProgress(0, (Datum) 0);
+ return (Datum) 0;
+}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index cc398db124d..6a7302917a1 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -58,6 +58,7 @@
#include "storage/fd.h"
#include "storage/ipc.h"
#include "storage/lmgr.h"
+#include "storage/pg_shmem.h"
#include "storage/proc.h"
#include "storage/proclist.h"
#include "storage/procsignal.h"
@@ -92,7 +93,7 @@
* being dropped. For the relations with size below this threshold, we find
* the buffers by doing lookups in BufMapping table.
*/
-#define BUF_DROP_FULL_SCAN_THRESHOLD (uint64) (NBuffers / 32)
+#define BUF_DROP_FULL_SCAN_THRESHOLD (uint64) (GetHighNBuffers() / 32)
/*
* This is separated out from PrivateRefCountEntry to allow for copying all
@@ -633,7 +634,9 @@ static bool PinBuffer(BufferDesc *buf, BufferAccessStrategy strategy,
static void PinBuffer_Locked(BufferDesc *buf);
static void UnpinBuffer(BufferDesc *buf);
static void UnpinBufferNoOwner(BufferDesc *buf);
-static void BufferSync(int flags);
+static bool EvictUnpinnedBufferInternal(BufferDesc *desc, bool *buffer_flushed);
+static void BufferSync(int flags, int localNBuffers);
+static uint32 WaitBufHdrUnlocked(BufferDesc *buf);
static int SyncOneBuffer(int buf_id, bool skip_recently_used,
WritebackContext *wb_context);
static void WaitIO(BufferDesc *buf);
@@ -847,6 +850,12 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
}
else
{
+ /*
+ * Reject any recent_buffer that points above the live low watermark.
+ */
+ if (recent_buffer > GetLowNBuffers())
+ return false;
+
bufHdr = GetBufferDescriptor(recent_buffer - 1);
/*
@@ -2707,6 +2716,7 @@ uint32
GetAdditionalPinLimit(void)
{
uint32 estimated_pins_held;
+ uint32 limit;
/*
* We get the number of "overflowed" pins for free, but don't know the
@@ -2715,11 +2725,17 @@ GetAdditionalPinLimit(void)
*/
estimated_pins_held = PrivateRefCountOverflowed + REFCOUNT_ARRAY_ENTRIES;
+ /*
+ * Consult get_pin_limit_hook so the per-backend limit tracks the live
+ * buffer pool size.
+ */
+ limit = enable_dynamic_shared_buffers && get_pin_limit_hook ? get_pin_limit_hook() : MaxProportionalPins;
+
/* Is this backend already holding more than its fair share? */
- if (estimated_pins_held > MaxProportionalPins)
+ if (estimated_pins_held > limit)
return 0;
- return MaxProportionalPins - estimated_pins_held;
+ return limit - estimated_pins_held;
}
/*
@@ -3558,7 +3574,7 @@ TrackNewBufferPin(Buffer buf)
* currently have no effect here.
*/
static void
-BufferSync(int flags)
+BufferSync(int flags, int localNBuffers)
{
uint64 buf_state;
int buf_id;
@@ -3599,7 +3615,7 @@ BufferSync(int flags)
* certainly need to be written for the next checkpoint attempt, too.
*/
num_to_scan = 0;
- for (buf_id = 0; buf_id < NBuffers; buf_id++)
+ for (buf_id = 0; buf_id < localNBuffers; buf_id++)
{
BufferDesc *bufHdr = GetBufferDescriptor(buf_id);
uint64 set_bits = 0;
@@ -3628,7 +3644,7 @@ BufferSync(int flags)
set_bits, 0,
0);
- /* Check for barrier events in case NBuffers is large. */
+ /* Check for barrier events in case the buffer pool is large. */
if (ProcSignalBarrierPending)
ProcessProcSignalBarrier();
}
@@ -3638,7 +3654,7 @@ BufferSync(int flags)
WritebackContextInit(&wb_context, &checkpoint_flush_after);
- TRACE_POSTGRESQL_BUFFER_SYNC_START(NBuffers, num_to_scan);
+ TRACE_POSTGRESQL_BUFFER_SYNC_START(localNBuffers, num_to_scan);
/*
* Sort buffers that need to be written to reduce the likelihood of random
@@ -3822,7 +3838,7 @@ BufferSync(int flags)
*/
CheckpointStats.ckpt_bufs_written += num_written;
- TRACE_POSTGRESQL_BUFFER_SYNC_DONE(NBuffers, num_written, num_to_scan);
+ TRACE_POSTGRESQL_BUFFER_SYNC_DONE(localNBuffers, num_written, num_to_scan);
}
/*
@@ -3881,10 +3897,28 @@ BgBufferSync(WritebackContext *wb_context)
uint32 new_recent_alloc;
/*
- * Find out where the clock-sweep currently is, and how many buffer
- * allocations have happened since our last call.
+ * Snapshot of lowNBuffers from the previous invocation. Whenever the
+ * value changes a buffer-pool resize has happened: the smoothed
+ * allocation rate / clean-buffer density we accumulated for the old size
+ * are no longer meaningful, so we invalidate saved_info_valid and start
+ * fresh.
*/
- strategy_buf_id = StrategySyncStart(&strategy_passes, &recent_alloc);
+ static int saved_low_nbuffers = 0;
+ int current_low_nbuffers;
+
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
+
+ strategy_buf_id = StrategySyncStart(&strategy_passes, &recent_alloc,
+ &current_low_nbuffers);
+ if (current_low_nbuffers != saved_low_nbuffers)
+ {
+#ifdef BGW_DEBUG
+ elog(DEBUG2, "invalidated background writer state after pool resize: %d -> %d buffers",
+ saved_low_nbuffers, current_low_nbuffers);
+#endif
+ saved_info_valid = false;
+ saved_low_nbuffers = current_low_nbuffers;
+ }
/* Report buffer alloc counts to pgstat */
PendingBgWriterStats.buf_alloc += recent_alloc;
@@ -3897,6 +3931,7 @@ BgBufferSync(WritebackContext *wb_context)
if (bgwriter_lru_maxpages <= 0)
{
saved_info_valid = false;
+ END_NBUFFERS_ACCESS(localNBuffers);
return true;
}
@@ -3913,7 +3948,7 @@ BgBufferSync(WritebackContext *wb_context)
int32 passes_delta = strategy_passes - prev_strategy_passes;
strategy_delta = strategy_buf_id - prev_strategy_buf_id;
- strategy_delta += (long) passes_delta * NBuffers;
+ strategy_delta += (long) passes_delta * localNBuffers;
Assert(strategy_delta >= 0);
@@ -3932,7 +3967,7 @@ BgBufferSync(WritebackContext *wb_context)
next_to_clean >= strategy_buf_id)
{
/* on same pass, but ahead or at least not behind */
- bufs_to_lap = NBuffers - (next_to_clean - strategy_buf_id);
+ bufs_to_lap = localNBuffers - (next_to_clean - strategy_buf_id);
#ifdef BGW_DEBUG
elog(DEBUG2, "bgwriter ahead: bgw %u-%u strategy %u-%u delta=%ld lap=%d",
next_passes, next_to_clean,
@@ -3954,7 +3989,7 @@ BgBufferSync(WritebackContext *wb_context)
#endif
next_to_clean = strategy_buf_id;
next_passes = strategy_passes;
- bufs_to_lap = NBuffers;
+ bufs_to_lap = localNBuffers;
}
}
else
@@ -3970,7 +4005,7 @@ BgBufferSync(WritebackContext *wb_context)
strategy_delta = 0;
next_to_clean = strategy_buf_id;
next_passes = strategy_passes;
- bufs_to_lap = NBuffers;
+ bufs_to_lap = localNBuffers;
}
/* Update saved info for next time */
@@ -3996,7 +4031,7 @@ BgBufferSync(WritebackContext *wb_context)
* strategy point and where we've scanned ahead to, based on the smoothed
* density estimate.
*/
- bufs_ahead = NBuffers - bufs_to_lap;
+ bufs_ahead = localNBuffers - bufs_to_lap;
reusable_buffers_est = (float) bufs_ahead / smoothed_density;
/*
@@ -4034,7 +4069,7 @@ BgBufferSync(WritebackContext *wb_context)
* the BGW will be called during the scan_whole_pool time; slice the
* buffer pool into that many sections.
*/
- min_scan_buffers = (int) (NBuffers / (scan_whole_pool_milliseconds / BgWriterDelay));
+ min_scan_buffers = (int) (localNBuffers / (scan_whole_pool_milliseconds / BgWriterDelay));
if (upcoming_alloc_est < (min_scan_buffers + reusable_buffers_est))
{
@@ -4062,7 +4097,7 @@ BgBufferSync(WritebackContext *wb_context)
int sync_state = SyncOneBuffer(next_to_clean, true,
wb_context);
- if (++next_to_clean >= NBuffers)
+ if (++next_to_clean >= localNBuffers)
{
next_to_clean = 0;
next_passes++;
@@ -4116,6 +4151,8 @@ BgBufferSync(WritebackContext *wb_context)
#endif
}
+ END_NBUFFERS_ACCESS(localNBuffers);
+
/* Return true if OK to hibernate */
return (bufs_to_lap == 0 && recent_alloc == 0);
}
@@ -4231,7 +4268,7 @@ InitBufferManagerAccess(void)
* allow plenty of pins. LimitAdditionalPins() and
* GetAdditionalPinLimit() can be used to check the remaining balance.
*/
- MaxProportionalPins = NBuffers / (MaxBackends + NUM_AUXILIARY_PROCS);
+ MaxProportionalPins = GetMaxNBuffers() / (MaxBackends + NUM_AUXILIARY_PROCS);
memset(&PrivateRefCountArray, 0, sizeof(PrivateRefCountArray));
memset(&PrivateRefCountArrayKeys, 0, sizeof(PrivateRefCountArrayKeys));
@@ -4371,6 +4408,12 @@ AssertNotCatalogBufferLock(Buffer buffer, BufferLockMode mode)
if (mode != BUFFER_LOCK_EXCLUSIVE)
return;
+ if (!((BufferDescPadded *) lock > BufferDescriptors &&
+ (BufferDescPadded *) lock < BufferDescriptors + GetMaxNBuffers()))
+ return; /* not a buffer lock */
+
+ bufHdr = (BufferDesc *)
+ ((char *) lock - offsetof(BufferDesc, content_lock));
tag = bufHdr->tag;
/*
@@ -4440,7 +4483,9 @@ DebugPrintBufferRefcount(Buffer buffer)
void
CheckPointBuffers(int flags)
{
- BufferSync(flags);
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
+ BufferSync(flags, localNBuffers);
+ END_NBUFFERS_ACCESS(localNBuffers);
}
/*
@@ -4779,6 +4824,7 @@ DropRelationBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
RelFileLocatorBackend rlocator;
BlockNumber nForkBlock[MAX_FORKNUM];
uint64 nBlocksToInvalidate = 0;
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
rlocator = smgr_reln->smgr_rlocator;
@@ -4842,7 +4888,7 @@ DropRelationBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
return;
}
- for (i = 0; i < NBuffers; i++)
+ for (i = 0; i < localNBuffers; i++)
{
BufferDesc *bufHdr = GetBufferDescriptor(i);
@@ -4880,6 +4926,7 @@ DropRelationBuffers(SMgrRelation smgr_reln, ForkNumber *forkNum,
if (j >= nforks)
UnlockBufHdr(bufHdr);
}
+ END_NBUFFERS_ACCESS(localNBuffers);
}
/* ---------------------------------------------------------------------
@@ -4901,6 +4948,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
RelFileLocator *locators;
bool cached = true;
bool use_bsearch;
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
if (nlocators == 0)
return;
@@ -5003,7 +5051,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
if (use_bsearch)
qsort(locators, n, sizeof(RelFileLocator), rlocator_comparator);
- for (i = 0; i < NBuffers; i++)
+ for (i = 0; i < localNBuffers; i++)
{
RelFileLocator *rlocator = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
@@ -5046,6 +5094,7 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
else
UnlockBufHdr(bufHdr);
}
+ END_NBUFFERS_ACCESS(localNBuffers);
pfree(locators);
pfree(rels);
@@ -5130,7 +5179,8 @@ DropDatabaseBuffers(Oid dbid)
* database isn't our own.
*/
- for (i = 0; i < NBuffers; i++)
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
+ for (i = 0; i < localNBuffers; i++)
{
BufferDesc *bufHdr = GetBufferDescriptor(i);
@@ -5147,6 +5197,7 @@ DropDatabaseBuffers(Oid dbid)
else
UnlockBufHdr(bufHdr);
}
+ END_NBUFFERS_ACCESS(localNBuffers);
}
/* ---------------------------------------------------------------------
@@ -5174,7 +5225,9 @@ FlushRelationBuffers(Relation rel)
BufferDesc *bufHdr;
SMgrRelation srel = RelationGetSmgr(rel);
- if (RelationUsesLocalBuffers(rel))
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
+
+ if (RelationUsesLocalBuffers(rel) || am_wal_redo_postgres)
{
for (i = 0; i < NLocBuffer; i++)
{
@@ -5213,10 +5266,12 @@ FlushRelationBuffers(Relation rel)
}
}
+ END_NBUFFERS_ACCESS(localNBuffers);
+
return;
}
- for (i = 0; i < NBuffers; i++)
+ for (i = 0; i < localNBuffers; i++)
{
uint64 buf_state;
@@ -5244,6 +5299,7 @@ FlushRelationBuffers(Relation rel)
else
UnlockBufHdr(bufHdr);
}
+ END_NBUFFERS_ACCESS(localNBuffers);
}
/* ---------------------------------------------------------------------
@@ -5261,6 +5317,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
int i;
SMgrSortArray *srels;
bool use_bsearch;
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
if (nrels == 0)
return;
@@ -5286,7 +5343,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
if (use_bsearch)
qsort(srels, nrels, sizeof(SMgrSortArray), rlocator_comparator);
- for (i = 0; i < NBuffers; i++)
+ for (i = 0; i < localNBuffers; i++)
{
SMgrSortArray *srelent = NULL;
BufferDesc *bufHdr = GetBufferDescriptor(i);
@@ -5339,6 +5396,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
else
UnlockBufHdr(bufHdr);
}
+ END_NBUFFERS_ACCESS(localNBuffers);
pfree(srels);
}
@@ -5537,7 +5595,8 @@ FlushDatabaseBuffers(Oid dbid)
int i;
BufferDesc *bufHdr;
- for (i = 0; i < NBuffers; i++)
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
+ for (i = 0; i < localNBuffers; i++)
{
uint64 buf_state;
@@ -5565,6 +5624,7 @@ FlushDatabaseBuffers(Oid dbid)
else
UnlockBufHdr(bufHdr);
}
+ END_NBUFFERS_ACCESS(localNBuffers);
}
/*
@@ -7991,11 +8051,13 @@ void
EvictAllUnpinnedBuffers(int32 *buffers_evicted, int32 *buffers_flushed,
int32 *buffers_skipped)
{
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
+
*buffers_evicted = 0;
*buffers_skipped = 0;
*buffers_flushed = 0;
- for (int buf = 1; buf <= NBuffers; buf++)
+ for (int buf = 1; buf <= localNBuffers; buf++)
{
BufferDesc *desc = GetBufferDescriptor(buf - 1);
uint64 buf_state;
@@ -8020,6 +8082,7 @@ EvictAllUnpinnedBuffers(int32 *buffers_evicted, int32 *buffers_flushed,
if (buffer_flushed)
(*buffers_flushed)++;
}
+ END_NBUFFERS_ACCESS(localNBuffers);
}
/*
@@ -8041,13 +8104,15 @@ void
EvictRelUnpinnedBuffers(Relation rel, int32 *buffers_evicted,
int32 *buffers_flushed, int32 *buffers_skipped)
{
+ BEGIN_NBUFFERS_ACCESS(localNBuffers);
+
Assert(!RelationUsesLocalBuffers(rel));
*buffers_skipped = 0;
*buffers_evicted = 0;
*buffers_flushed = 0;
- for (int buf = 1; buf <= NBuffers; buf++)
+ for (int buf = 1; buf <= localNBuffers; buf++)
{
BufferDesc *desc = GetBufferDescriptor(buf - 1);
uint64 buf_state = pg_atomic_read_u64(&(desc->state));
@@ -8082,6 +8147,7 @@ EvictRelUnpinnedBuffers(Relation rel, int32 *buffers_evicted,
if (buffer_flushed)
(*buffers_flushed)++;
}
+ END_NBUFFERS_ACCESS(localNBuffers);
}
/*
@@ -8965,3 +9031,87 @@ const PgAioHandleCallbacks aio_local_buffer_readv_cb = {
.complete_local = local_buffer_readv_complete,
.report = buffer_readv_report,
};
+
+/*
+ * When shrinking the shared buffer pool, evict every buffer in the
+ * range [lowNBuffers, highNBuffers).
+ *
+ * Returns true once every buffer in the range is empty. Returns false if
+ * any buffer was still pinned.
+ */
+bool
+EvictExtraBuffers(int lowNBuffers, int highNBuffers)
+{
+ bool result = true;
+
+ Assert(lowNBuffers < highNBuffers);
+
+ /*
+ * If the buffer being evicted is locked, this function has to wait.
+ * It must not be called from the postmaster, which cannot wait on a
+ * lock.
+ */
+ Assert(IsUnderPostmaster);
+
+ for (int buf_id = lowNBuffers; buf_id < highNBuffers; buf_id++)
+ {
+ BufferDesc *desc = GetBufferDescriptor(buf_id);
+ uint32 buf_state;
+ bool buffer_flushed;
+
+ /* Make sure we can pin the buffer (PinBuffer_Locked contract). */
+ ResourceOwnerEnlarge(CurrentResourceOwner);
+ ReservePrivateRefCountEntry();
+
+ buf_state = LockBufHdr(desc);
+
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(desc, buf_state);
+ result = false;
+ continue;
+ }
+
+ if (!(buf_state & BM_VALID))
+ {
+ /*
+ * Buffer is not valid, but it might still have a BufTable
+ * entry: a previous read IO may have failed, or the backend
+ * that started allocating this slot was cancelled before the
+ * read completed (e.g. autovacuum cancellation). In that case
+ * the descriptor is left with BM_TAG_VALID set, refcount=0,
+ * and the hash table still mapping tag -> buf_id.
+ *
+ * If we leave that BufTable entry behind, a later expand that
+ * re-initializes this slot (clearing tag to InvalidBlockNumber
+ * and BM_TAG_VALID) will desynchronize BufTable from the
+ * descriptor, and the next reader of the original block will
+ * fail the BufferGetBlockNumber assertion in
+ * CheckReadBuffersOperation. Drop the stale entry now.
+ */
+ if (buf_state & BM_TAG_VALID)
+ {
+ PinBuffer_Locked(desc);
+ if (!InvalidateVictimBuffer(desc))
+ {
+ /* Lost a race with another pinner; retry later. */
+ result = false;
+ }
+ UnpinBuffer(desc);
+ }
+ else
+ {
+ UnlockBufHdr(desc, buf_state);
+ }
+ continue;
+ }
+
+ if (!EvictUnpinnedBufferInternal(desc, &buffer_flushed))
+ {
+ elog(WARNING, "could not evict buffer %d, it is pinned", buf_id);
+ result = false;
+ }
+ }
+
+ return result;
+}
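Reviewer note (not part of the patch): the control flow of EvictExtraBuffers() can be modelled in isolation. The sketch below is a toy under stated assumptions. SlotModel and evict_range are hypothetical names, a plain array stands in for the buffer descriptors, and there is no header locking, BufTable cleanup, or flushing. It only shows the contract described in the function comment: walk [low, high), skip anything still pinned, and return true only once every slot in the range is empty.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer descriptor; not the patch's BufferDesc. */
typedef struct
{
    int  refcount;   /* number of pins held on this slot */
    bool valid;      /* true while the slot still caches a page */
} SlotModel;

/*
 * Empty every unpinned slot in [low, high).  Returns true if the whole
 * range ended up empty, false if at least one slot was pinned and skipped.
 */
bool
evict_range(SlotModel *slots, int low, int high)
{
    bool all_empty = true;

    for (int i = low; i < high; i++)
    {
        if (slots[i].refcount > 0)
        {
            /* Pinned: leave it alone and let the caller retry later. */
            all_empty = false;
            continue;
        }
        slots[i].valid = false;
    }
    return all_empty;
}
```

A caller that gets false back would be expected to wait for the remaining pins to drop and call again; the retry policy itself is outside this sketch.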
diff --git a/src/backend/storage/buffer/dynamic_shared_buffers.c b/src/backend/storage/buffer/dynamic_shared_buffers.c
new file mode 100644
index 00000000000..efc535d365c
--- /dev/null
+++ b/src/backend/storage/buffer/dynamic_shared_buffers.c
@@ -0,0 +1,125 @@
+/*-------------------------------------------------------------------------
+ *
+ * dynamic_shared_buffers.c
+ * Coordination state and helpers for resizing shared_buffers at runtime.
+ *
+ * The dynamic shared buffer (DSB) machinery lets shared_buffers grow and
+ * shrink while the cluster is running.
+ *
+ * See pgxn/neon/README.md ("Dynamic shared buffer") for the full design and
+ * the resize protocol.
+ *
+ * IDENTIFICATION
+ * src/backend/storage/buffer/dynamic_shared_buffers.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "miscadmin.h"
+#include "storage/dynamic_shared_buffers.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "storage/procsignal.h"
+#include "storage/shmem.h"
+#include "storage/spin.h"
+
+DynamicSharedBuffersControl *DSBCtrl = NULL;
+
+/*
+ * AcquireNBuffersLock
+ *
+ * Take AccessNBuffersLock in shared mode and return the current high water
+ * mark. Callers iterate up to that bound. The lock keeps the resize
+ * coordinator (which acquires the lock exclusively) waiting until we drop it.
+ */
+int
+AcquireNBuffersLock(void)
+{
+ if (!enable_dynamic_shared_buffers)
+ return GetHighNBuffers();
+
+ if (DSBCtrl != NULL)
+ LWLockAcquire(&DSBCtrl->AccessNBuffersLock, LW_SHARED);
+ return GetHighNBuffers();
+}
+
+void
+ReleaseNBuffersLock(void)
+{
+ if (!enable_dynamic_shared_buffers)
+ return;
+ if (DSBCtrl != NULL)
+ LWLockRelease(&DSBCtrl->AccessNBuffersLock);
+}
+
+/*
+ * Try to claim coordinator status for a buffer-pool resize.
+ *
+ * Returns true if we are now the coordinator, or false if another backend
+ * is performing a resize.
+ */
+bool
+ClaimResizeCoordinator(void)
+{
+ bool claimed = false;
+
+ Assert(DSBCtrl != NULL);
+
+ SpinLockAcquire(&DSBCtrl->coordinator_lock);
+ if (!DSBCtrl->resize_in_progress)
+ {
+ DSBCtrl->resize_in_progress = true;
+ DSBCtrl->coordinator_pid = MyProcPid;
+ claimed = true;
+ }
+ SpinLockRelease(&DSBCtrl->coordinator_lock);
+
+ return claimed;
+}
+
+/*
+ * Release the coordinator slot acquired by ClaimResizeCoordinator().
+ */
+void
+ReleaseResizeCoordinator(void)
+{
+ Assert(DSBCtrl != NULL);
+
+ SpinLockAcquire(&DSBCtrl->coordinator_lock);
+ Assert(DSBCtrl->resize_in_progress);
+ Assert(DSBCtrl->coordinator_pid == MyProcPid);
+ DSBCtrl->resize_in_progress = false;
+ DSBCtrl->coordinator_pid = InvalidPid;
+ SpinLockRelease(&DSBCtrl->coordinator_lock);
+}
+
+/*
+ * DSBControlInit
+ *
+ * Allocate and initialize the DynamicSharedBuffersControl structure in shared
+ * memory. Must be called before BufferManagerShmemInit so that DSBCtrl is
+ * available when the buffer pool is set up.
+ */
+void
+DSBControlInit(void)
+{
+ bool foundDSBCtrl;
+
+ DSBCtrl = (DynamicSharedBuffersControl *)
+ ShmemInitStruct("DSB Control", sizeof(DynamicSharedBuffersControl),
+ &foundDSBCtrl);
+
+ if (!foundDSBCtrl)
+ {
+ pg_atomic_init_u32(&DSBCtrl->lowNBuffers, NBuffersGUC);
+ pg_atomic_init_u32(&DSBCtrl->highNBuffers, NBuffersGUC);
+
+ SpinLockInit(&DSBCtrl->coordinator_lock);
+ DSBCtrl->resize_in_progress = false;
+ DSBCtrl->coordinator_pid = InvalidPid;
+
+ LWLockInitialize(&DSBCtrl->AccessNBuffersLock,
+ LWTRANCHE_ACCESS_NBUFFERS);
+ }
+}
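Reviewer note (not part of the patch): ClaimResizeCoordinator() and ReleaseResizeCoordinator() implement a single-owner slot guarded by a spinlock. The sketch below models the same claim/release contract outside PostgreSQL. It deliberately swaps the spinlock for a C11 compare-exchange, and ResizeSlot, claim_resize and release_resize are hypothetical names, so treat it as an illustration of the protocol rather than the patch's mechanism.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared control struct. */
typedef struct
{
    atomic_int coordinator_pid;   /* 0 means "no resize in progress" */
} ResizeSlot;

/* Try to become the coordinator; only one caller can succeed at a time. */
bool
claim_resize(ResizeSlot *slot, int my_pid)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&slot->coordinator_pid,
                                          &expected, my_pid);
}

/* Give the slot back; fails if the caller is not the current owner. */
bool
release_resize(ResizeSlot *slot, int my_pid)
{
    int expected = my_pid;

    return atomic_compare_exchange_strong(&slot->coordinator_pid,
                                          &expected, 0);
}
```

In this model a release by a non-owner simply fails, whereas the patch asserts ownership under the spinlock. Either way the slot can only be cleared by the process that claimed it.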
diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index fdb5bad7910..ee2eb520595 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -37,7 +37,7 @@ typedef struct
/*
* clock-sweep hand: index of next buffer to consider grabbing. Note that
* this isn't a concrete buffer - we only ever increase the value. So, to
- * get an actual buffer, it needs to be used modulo NBuffers.
+ * get an actual buffer, it needs to be used modulo lowNBuffers.
*/
pg_atomic_uint32 nextVictimBuffer;
@@ -110,6 +110,7 @@ static inline uint32
ClockSweepTick(void)
{
uint32 victim;
+ int lowNBuffers;
/*
* Atomically move hand ahead one buffer - if there's several processes
@@ -118,13 +119,14 @@ ClockSweepTick(void)
*/
victim =
pg_atomic_fetch_add_u32(&StrategyControl->nextVictimBuffer, 1);
+ lowNBuffers = GetLowNBuffers();
- if (victim >= NBuffers)
+ if (victim >= lowNBuffers)
{
uint32 originalVictim = victim;
/* always wrap what we look up in BufferDescriptors */
- victim = victim % NBuffers;
+ victim = victim % lowNBuffers;
/*
* If we're the one that just caused a wraparound, force
@@ -152,7 +154,7 @@ ClockSweepTick(void)
*/
SpinLockAcquire(&StrategyControl->buffer_strategy_lock);
- wrapped = expected % NBuffers;
+ wrapped = expected % GetLowNBuffers();
success = pg_atomic_compare_exchange_u32(&StrategyControl->nextVictimBuffer,
&expected, wrapped);
@@ -237,7 +239,7 @@ StrategyGetBuffer(BufferAccessStrategy strategy, uint64 *buf_state, bool *from_r
pg_atomic_fetch_add_u32(&StrategyControl->numBufferAllocs, 1);
/* Use the "clock sweep" algorithm to find a free buffer */
- trycounter = NBuffers;
+ trycounter = GetLowNBuffers();
for (;;)
{
uint64 old_buf_state;
@@ -290,7 +292,7 @@ StrategyGetBuffer(BufferAccessStrategy strategy, uint64 *buf_state, bool *from_r
if (pg_atomic_compare_exchange_u64(&buf->state, &old_buf_state,
local_buf_state))
{
- trycounter = NBuffers;
+ trycounter = GetLowNBuffers();
break;
}
}
@@ -328,14 +330,17 @@ StrategyGetBuffer(BufferAccessStrategy strategy, uint64 *buf_state, bool *from_r
* being read.
*/
int
-StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc)
+StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc,
+ int *low_nbuffers)
{
uint32 nextVictimBuffer;
int result;
+ int lowNBuffers;
SpinLockAcquire(&StrategyControl->buffer_strategy_lock);
nextVictimBuffer = pg_atomic_read_u32(&StrategyControl->nextVictimBuffer);
- result = nextVictimBuffer % NBuffers;
+ lowNBuffers = GetLowNBuffers();
+ result = nextVictimBuffer % lowNBuffers;
if (complete_passes)
{
@@ -345,13 +350,15 @@ StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc)
* Additionally add the number of wraparounds that happened before
* completePasses could be incremented. C.f. ClockSweepTick().
*/
- *complete_passes += nextVictimBuffer / NBuffers;
+ *complete_passes += nextVictimBuffer / lowNBuffers;
}
if (num_buf_alloc)
{
*num_buf_alloc = pg_atomic_exchange_u32(&StrategyControl->numBufferAllocs, 0);
}
+ if (low_nbuffers)
+ *low_nbuffers = lowNBuffers;
SpinLockRelease(&StrategyControl->buffer_strategy_lock);
return result;
}
@@ -522,10 +529,15 @@ GetAccessStrategyWithSize(BufferAccessStrategyType btype, int ring_size_kb)
if (ring_buffers == 0)
return NULL;
- /* Cap to 1/8th of shared_buffers */
- ring_buffers = Min(NBuffers / 8, ring_buffers);
+ /*
+ * Cap to 1/8th of shared_buffers. Using GetLowNBuffers() here is fine
+ * because this is a non-critical sizing decision: the ring size is fixed
+ * once the strategy is created, so the strategy survives a resize.
+ */
+ ring_buffers = Min(GetLowNBuffers() / 8, ring_buffers);
- /* NBuffers should never be less than 16, so this shouldn't happen */
+ /* shared_buffers should never be less than MIN_SHARED_BUFFERS,
+ * so this shouldn't happen */
Assert(ring_buffers > 0);
/* Allocate the object and initialize all elements to zeroes */
@@ -574,7 +586,7 @@ int
GetAccessStrategyPinLimit(BufferAccessStrategy strategy)
{
if (strategy == NULL)
- return NBuffers;
+ return GetLowNBuffers();
switch (strategy->btype)
{
@@ -768,3 +780,36 @@ StrategyRejectBuffer(BufferAccessStrategy strategy, BufferDesc *buf, bool from_r
return true;
}
+
+/*
+ * StrategyReset -- reset the clock-sweep cursor for a buffer pool resize.
+ *
+ * Called by pg_resize_shared_buffers() at two distinct points:
+ *
+ * - Just before publishing a lower lowNBuffers (shrink). The existing
+ * cursor may already point above new_size; resetting to 0 makes the
+ * next clock sweep start from the bottom of the surviving range and
+ * avoids ClockSweepTick() wrapping past the new buffers via modulo
+ * arithmetic right as the bound moves.
+ *
+ * - At the end of an expand, after the new descriptors are initialized,
+ * to point the cursor at the start of the freshly added range so the
+ * next sweep tries the empty buffers before re-scanning existing ones
+ * with usage_count == 0.
+ */
+void
+StrategyReset(int old_size, int new_size)
+{
+ SpinLockAcquire(&StrategyControl->buffer_strategy_lock);
+ if (new_size > old_size)
+ {
+ /* expand: point cursor at start of new range */
+ pg_atomic_write_u32(&StrategyControl->nextVictimBuffer, old_size);
+ }
+ else
+ {
+ /* shrink: rewind cursor to the bottom of the surviving range */
+ pg_atomic_write_u32(&StrategyControl->nextVictimBuffer, 0);
+ }
+ SpinLockRelease(&StrategyControl->buffer_strategy_lock);
+}
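Reviewer note (not part of the patch): the effect of StrategyReset() on the clock hand is easier to see in a toy model. The sketch below assumes a single-threaded counter in place of the atomic nextVictimBuffer, and SweepModel, sweep_tick and sweep_reset are hypothetical names. It shows the two cases from the comment above: a shrink rewinds the hand to 0, and an expand points it at the first newly added buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the clock hand: a counter that only grows. */
typedef struct
{
    uint32_t next_victim;
} SweepModel;

/* Return the buffer index the next sweep step inspects, then advance. */
uint32_t
sweep_tick(SweepModel *m, uint32_t low_nbuffers)
{
    return m->next_victim++ % low_nbuffers;
}

/* Reposition the hand for a resize from old_size to new_size buffers. */
void
sweep_reset(SweepModel *m, uint32_t old_size, uint32_t new_size)
{
    if (new_size > old_size)
        m->next_victim = old_size;   /* expand: start at the new range */
    else
        m->next_victim = 0;          /* shrink: rewind to the bottom */
}
```

For example, with the hand at 6 in an 8-buffer pool, a shrink to 4 followed by sweep_tick() visits buffer 0 and then 1; a later expand back to 8 makes the next tick visit buffer 4, the first freshly initialized slot.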
diff --git a/src/backend/storage/buffer/meson.build b/src/backend/storage/buffer/meson.build
index ed84bf08971..845a26b8db3 100644
--- a/src/backend/storage/buffer/meson.build
+++ b/src/backend/storage/buffer/meson.build
@@ -2,8 +2,10 @@
backend_sources += files(
'buf_init.c',
+ 'buf_resize.c',
'buf_table.c',
'bufmgr.c',
+ 'dynamic_shared_buffers.c',
'freelist.c',
'localbuf.c',
)
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 264e4c22ca6..b8045c02997 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -223,6 +223,18 @@ ProcSignalInit(const uint8 *cancel_key, int cancel_key_len)
on_shmem_exit(CleanupProcSignalState, (Datum) 0);
}
+/*
+ * IsProcSignalInitialized
+ * Return true if this process has registered itself with the
+ * ProcSignal subsystem (via ProcSignalInit) and not yet released its
+ * slot in CleanupProcSignalState.
+ */
+bool
+IsProcSignalInitialized(void)
+{
+ return MyProcSignalSlot != NULL;
+}
+
/*
* CleanupProcSignalState
* Remove current process from ProcSignal mechanism
@@ -590,6 +602,11 @@ ProcessProcSignalBarrier(void)
case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
processed = AbsorbDataChecksumsBarrier(type);
break;
+ case PROCSIGNAL_BARRIER_SHBUF_RESIZE:
+ /* Just acknowledge; the resize coordinator only needs
+ * confirmation that all backends have observed the
+ * updated lowNBuffers. */
+ break;
}
/*
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 560659f9568..ada14fc2a67 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -417,6 +417,7 @@ XactSLRU "Waiting to access the transaction status SLRU cache."
ParallelVacuumDSA "Waiting for parallel vacuum dynamic shared memory allocation."
AioUringCompletion "Waiting for another process to complete IO via io_uring."
ShmemIndex "Waiting to find or allocate space in shared memory."
+AccessNBuffers "Waiting to access the current shared buffer count during dynamic shared buffer resize."
# No "ABI_compatibility" region here as WaitEventLWLock has its own C code.
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index bbd28d14d99..97cfd41d864 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -141,7 +141,9 @@ int max_parallel_maintenance_workers = 2;
* MaxBackends is computed by PostmasterMain after modules have had a chance to
* register background workers.
*/
-int NBuffers = 16384;
+int NBuffersGUC = 16384;
+bool enable_dynamic_shared_buffers = false;
+int MaxNBuffers = 0;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 774bbc9be5f..e5ed335ce2d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -41,6 +41,7 @@
#include "miscadmin.h"
#include "parser/scansup.h"
#include "port/pg_bitutils.h"
+#include "storage/dynamic_shared_buffers.h"
#include "storage/fd.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
@@ -5391,6 +5392,13 @@ ShowGUCOption(const struct config_generic *record, bool use_units)
{
const struct config_int *conf = &record->_int;
+ /*
+ * Set NBuffersGUC here so that both SHOW shared_buffers (use_units==true)
+ * and pg_settings (use_units==false) reflect the current shared buffer pool size.
+ */
+ if (conf->variable == &NBuffersGUC)
+ NBuffersGUC = GetLowNBuffers();
+
if (conf->show_hook)
val = conf->show_hook();
else
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index afaa058b046..15d0ed35c5b 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -2704,12 +2704,28 @@
{ name => 'shared_buffers', type => 'int', context => 'PGC_POSTMASTER', group => 'RESOURCES_MEM',
short_desc => 'Sets the number of shared memory buffers used by the server.',
flags => 'GUC_UNIT_BLOCKS',
- variable => 'NBuffers',
+ variable => 'NBuffersGUC',
boot_val => '16384',
- min => '16',
+ min => 'MIN_SHARED_BUFFERS',
+ max => 'INT_MAX / 2',
+},
+
+{ name => 'max_shared_buffers', type => 'int', context => 'PGC_POSTMASTER', group => 'RESOURCES_MEM',
+ short_desc => 'Sets the upper limit for the shared_buffers value.',
+ long_desc => 'If set above zero, it must be at least shared_buffers.',
+ flags => 'GUC_UNIT_BLOCKS',
+ variable => 'MaxNBuffers',
+ boot_val => '0',
+ min => '0',
max => 'INT_MAX / 2',
},
+{ name => 'enable_dynamic_shared_buffers', type => 'bool', context => 'PGC_POSTMASTER', group => 'RESOURCES_MEM',
+ short_desc => 'Enables dynamic resizing of the shared buffer pool.',
+ variable => 'enable_dynamic_shared_buffers',
+ boot_val => 'false',
+},
+
{ name => 'shared_memory_size', type => 'int', context => 'PGC_INTERNAL', group => 'PRESET_OPTIONS',
short_desc => 'Shows the size of the server\'s main shared memory area (rounded up to the nearest MB).',
flags => 'GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_UNIT_MB | GUC_RUNTIME_COMPUTED',
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index be157a5fbe9..904470a3f34 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12693,4 +12693,13 @@
proname => 'hashoid8extended', prorettype => 'int8',
proargtypes => 'oid8 int8', prosrc => 'hashoid8extended' },
+# Online shared buffer pool resizing (see src/backend/storage/buffer/buf_resize.c)
+{ oid => '6500', descr => 'resize the shared buffer pool to a new size',
+ proname => 'pg_resize_shared_buffers', provolatile => 'v', proretset => 't',
+ prorettype => 'record', proargtypes => 'text',
+ proallargtypes => '{text,text,float8,text}',
+ proargmodes => '{i,o,o,o}',
+ proargnames => '{new_size,key,value,unit}',
+ prosrc => 'pg_resize_shared_buffers' },
+
]
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 8ccdf61246b..e93eaaf6b1c 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -175,12 +175,28 @@ extern PGDLLIMPORT bool ExitOnAnyError;
extern PGDLLIMPORT char *DataDir;
extern PGDLLIMPORT int data_directory_mode;
-extern PGDLLIMPORT int NBuffers;
+extern PGDLLIMPORT int NBuffersGUC;
+extern PGDLLIMPORT int MaxNBuffers;
extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
extern PGDLLIMPORT int autovacuum_max_parallel_workers;
+extern PGDLLIMPORT bool enable_dynamic_shared_buffers;
+
+/*
+ * GetMaxNBuffers
+ *
+ * Before DSB is introduced, PG does not recognize max_shared_buffers GUC.
+ * When max_shared_buffers is not set, it is resolved to NBuffersGUC.
+ */
+static inline int
+GetMaxNBuffers(void)
+{
+ if (enable_dynamic_shared_buffers)
+ return MaxNBuffers;
+ return NBuffersGUC;
+}
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 89615a254a3..1c5989c2472 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -584,9 +584,12 @@ extern BufferDesc *StrategyGetBuffer(BufferAccessStrategy strategy,
extern bool StrategyRejectBuffer(BufferAccessStrategy strategy,
BufferDesc *buf, bool from_ring);
-extern int StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc);
+extern int StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc,
+ int *low_nbuffers);
extern void StrategyNotifyBgWriter(int bgwprocno);
+extern void StrategyReset(int old_size, int new_size);
+
/* buf_table.c */
extern uint32 BufTableHashCode(BufferTag *tagPtr);
extern int BufTableLookup(BufferTag *tagPtr, uint32 hashcode);
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 6837b35fc6d..3dbaf364133 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -14,11 +14,13 @@
#ifndef BUFMGR_H
#define BUFMGR_H
+#include "miscadmin.h"
#include "port/pg_iovec.h"
#include "storage/aio_types.h"
#include "storage/block.h"
#include "storage/buf.h"
#include "storage/bufpage.h"
+#include "storage/dynamic_shared_buffers.h"
#include "storage/relfilelocator.h"
#include "utils/relcache.h"
#include "utils/snapmgr.h"
@@ -159,7 +161,7 @@ typedef struct ReadBuffersOperation ReadBuffersOperation;
typedef struct WritebackContext WritebackContext;
/* in globals.c ... this duplicates miscadmin.h */
-extern PGDLLIMPORT int NBuffers;
+extern PGDLLIMPORT int NBuffersGUC;
/* in bufmgr.c */
extern PGDLLIMPORT bool zero_damaged_pages;
@@ -371,6 +373,13 @@ extern void MarkDirtyAllUnpinnedBuffers(int32 *buffers_dirtied,
int32 *buffers_already_dirty,
int32 *buffers_skipped);
+extern Size BufferManagerShmemExpand(int lowNBuffers, int highNBuffers, bool *success);
+extern void BufferManagerShmemInitBuffers(int lowNBuffers, int highNBuffers);
+extern Size BufferManagerShmemShrink(int lowNBuffers, int highNBuffers, bool *success);
+
+/* in bufmgr.c */
+extern bool EvictExtraBuffers(int lowNBuffers, int highNBuffers);
+
/* in localbuf.c */
extern void AtProcExit_LocalBuffers(void);
@@ -418,7 +427,7 @@ extern void FreeAccessStrategy(BufferAccessStrategy strategy);
static inline bool
BufferIsValid(Buffer bufnum)
{
- Assert(bufnum <= NBuffers);
+ Assert(bufnum <= (Buffer) GetMaxNBuffers());
Assert(bufnum >= -NLocBuffer);
return bufnum != InvalidBuffer;
diff --git a/src/include/storage/dynamic_shared_buffers.h b/src/include/storage/dynamic_shared_buffers.h
new file mode 100644
index 00000000000..9d57b25e15d
--- /dev/null
+++ b/src/include/storage/dynamic_shared_buffers.h
@@ -0,0 +1,103 @@
+/*-------------------------------------------------------------------------
+ *
+ * dynamic_shared_buffers.h
+ * Dynamic shared buffer (DSB) coordination state and helpers.
+ *
+ * This header collects the neon-specific machinery that lets shared_buffers
+ * grow and shrink at runtime.
+ *
+ * See pgxn/neon/README.md ("Dynamic shared buffer") for the full design and
+ * the resize protocol.
+ *
+ * src/include/storage/dynamic_shared_buffers.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DYNAMIC_SHARED_BUFFERS_H
+#define DYNAMIC_SHARED_BUFFERS_H
+
+#include "port/atomics.h"
+#include "storage/lwlock.h"
+#include "storage/spin.h"
+
+/*
+ * Minimum allowed value of the shared_buffers GUC, and the smallest size that
+ * pg_resize_shared_buffers() can shrink to.
+ */
+#define MIN_SHARED_BUFFERS 16
+
+/*
+ * DynamicSharedBuffersControl is shared between backends and helps to
+ * coordinate shared buffer pool resize. See pgxn/neon/README.md
+ * ("Dynamic shared buffer") for the full design and protocol.
+ */
+typedef struct
+{
+ /*
+ * When a resize is in progress, `resize_in_progress` is set and
+ * `coordinator_pid` is the PID of the backend performing the resize.
+ */
+ bool resize_in_progress;
+ pid_t coordinator_pid;
+ slock_t coordinator_lock; /* protects the two fields above */
+
+ pg_atomic_uint32 lowNBuffers; /* low water mark: backends allocate
+ * buffers from [0, lowNBuffers). */
+ pg_atomic_uint32 highNBuffers; /* high water mark: buffer descriptor
+ * memory in [0, highNBuffers) is
+ * allocated and initialized. */
+ LWLock AccessNBuffersLock; /* Backends hold this in shared mode while
+ * iterating the buffer pool up to
+ * highNBuffers; the resize coordinator
+ * acquires it exclusively to mutate the
+ * buffer pool memory and publish a new
+ * highNBuffers. */
+} DynamicSharedBuffersControl;
+
+extern PGDLLIMPORT DynamicSharedBuffersControl *DSBCtrl;
+
+extern PGDLLIMPORT int NBuffersGUC;
+
+extern bool IsProcSignalInitialized(void);
+
+/*
+ * GetHighNBuffers returns the high water mark.
+ */
+static inline int
+GetHighNBuffers(void)
+{
+ if (DSBCtrl == NULL)
+ return NBuffersGUC;
+ return pg_atomic_read_u32(&DSBCtrl->highNBuffers);
+}
+
+/*
+ * GetLowNBuffers returns the low water mark.
+ */
+static inline int
+GetLowNBuffers(void)
+{
+ if (DSBCtrl == NULL)
+ return NBuffersGUC;
+ /*
+ * This must only be called from a process that is subscribed to the
+ * SHBUF_RESIZE ProcSignal barrier (i.e. one that has finished
+ * ProcSignalInit) -- otherwise a concurrent
+ * shrink could free the buffer memory in [new_low, old_low) without
+ * waiting for us.
+ */
+ Assert(IsProcSignalInitialized());
+ return pg_atomic_read_u32(&DSBCtrl->lowNBuffers);
+}
+
+extern void DSBControlInit(void);
+
+/*
+ * Try to claim coordinator status for a buffer-pool resize. Returns true if
+ * we became the coordinator (caller must eventually call
+ * ReleaseResizeCoordinator()), false if a resize was already in progress.
+ */
+extern bool ClaimResizeCoordinator(void);
+extern void ReleaseResizeCoordinator(void);
+
+#endif /* DYNAMIC_SHARED_BUFFERS_H */
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index b205b00e7a1..728bedd1cd8 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -64,6 +64,47 @@ typedef void (*shmem_startup_hook_type) (void);
/* ipc.c */
extern PGDLLIMPORT bool proc_exit_inprogress;
extern PGDLLIMPORT bool shmem_exit_inprogress;
+extern int AcquireNBuffersLock(void);
+extern void ReleaseNBuffersLock(void);
+
+/*----------
+ * BEGIN_NBUFFERS_ACCESS / END_NBUFFERS_ACCESS
+ *
+ * The lock is released at scope exit via __attribute__((cleanup)), so:
+ * - early `return`, `break`, or `goto` between BEGIN and END does NOT leak.
+ * - END_NBUFFERS_ACCESS(name) is idempotent: it releases the lock and sets
+ * a sentinel so the cleanup at scope exit skips a second release. Use it
+ * when you want to drop the lock before the enclosing block ends.
+ * - ereport(ERROR) bypasses the cleanup attribute, but LWLockReleaseAll()
+ * during transaction abort still releases AccessNBuffersLock, so no leak.
+ *
+ * Usage:
+ * BEGIN_NBUFFERS_ACCESS(localNBuffers);
+ * for (int i = 0; i < localNBuffers; i++)
+ * ... use buffer i ...
+ * END_NBUFFERS_ACCESS(localNBuffers);
+ *
+ *----------
+ */
+static inline void
+nbuffers_lock_auto_release(const bool *released)
+{
+ if (!*released)
+ ReleaseNBuffersLock();
+}
+
+#define BEGIN_NBUFFERS_ACCESS(name) \
+ bool name##_released __attribute__((cleanup(nbuffers_lock_auto_release))) = false; \
+ int name = AcquireNBuffersLock()
+
+#define END_NBUFFERS_ACCESS(name) \
+ do { \
+ if (!name##_released) \
+ { \
+ ReleaseNBuffersLock(); \
+ name##_released = true; \
+ } \
+ } while (0)
pg_noreturn extern void proc_exit(int code);
extern void shmem_exit(int code);
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index d7eb648bd27..6c9b47bc368 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -140,3 +140,4 @@ PG_LWLOCKTRANCHE(XACT_SLRU, XactSLRU)
PG_LWLOCKTRANCHE(PARALLEL_VACUUM_DSA, ParallelVacuumDSA)
PG_LWLOCKTRANCHE(AIO_URING_COMPLETION, AioUringCompletion)
PG_LWLOCKTRANCHE(SHMEM_INDEX, ShmemIndex)
+PG_LWLOCKTRANCHE(ACCESS_NBUFFERS, AccessNBuffers)
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index aaa158bfd66..e496c0bca7c 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,7 @@ typedef enum
PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
PROCSIGNAL_BARRIER_CHECKSUM_ON,
+ PROCSIGNAL_BARRIER_SHBUF_RESIZE, /* shared buffer resize barrier */
} ProcSignalBarrierType;
/*
@@ -70,6 +71,7 @@ typedef enum
* prototypes for functions in procsignal.c
*/
extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
+extern bool IsProcSignalInitialized(void);
extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
ProcNumber procNumber);
extern void SendCancelRequest(int backendPID, const uint8 *cancel_key, int cancel_key_len);
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 132b56a5864..a748e7bb626 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -152,7 +152,7 @@ select count(*) = 0 as ok from pg_stat_recovery;
-- This is to record the prevailing planner enable_foo settings during
-- a regression test run.
-select name, setting from pg_settings where name like 'enable%';
+select name, setting from pg_settings where name like 'enable%' and name <> 'enable_dynamic_shared_buffers';
name | setting
--------------------------------+---------
enable_async_append | on
diff --git a/src/test/regress/sql/sysviews.sql b/src/test/regress/sql/sysviews.sql
index 507e400ad4a..5366f8649a3 100644
--- a/src/test/regress/sql/sysviews.sql
+++ b/src/test/regress/sql/sysviews.sql
@@ -81,7 +81,7 @@ select count(*) = 0 as ok from pg_stat_recovery;
-- This is to record the prevailing planner enable_foo settings during
-- a regression test run.
-select name, setting from pg_settings where name like 'enable%';
+select name, setting from pg_settings where name like 'enable%' and name <> 'enable_dynamic_shared_buffers';
-- There are always wait event descriptions for various types. InjectionPoint
-- may be present or absent, depending on history since last postmaster start.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 8cf40c87043..140bed5bcfd 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -715,6 +715,7 @@ DumpableObject
DumpableObjectType
DumpableObjectWithAcl
DynamicFileList
+DynamicSharedBuffersControl
DynamicZoneAbbrev
ECDerivesEntry
ECDerivesKey
base-commit: 0392fb900eb89f52988cccd33046443c39c70d1c
--
2.54.0
[text/markdown] README-dsb.md (7.9K, 4-README-dsb.md)
### Buffer count variables
The pool size is tracked with two atomic water marks in `DynamicSharedBuffersControl`:
- **`lowNBuffers`** — the lower bound. Backends only allocate buffers from `[0, lowNBuffers)`.
- **`highNBuffers`** — the upper bound. Buffer descriptor memory in
`[0, highNBuffers)` is allocated and initialized.
We also maintain three variables used at postmaster startup only.
- **`NBuffers`** — Deprecated and removed.
- **`NBuffersGUC`** — Backs the `shared_buffers` GUC. It captures the
initial pool size at postmaster startup. `SHOW shared_buffers` reports the current `lowNBuffers` value.
- **`MaxNBuffers`** — Backs the `max_shared_buffers` GUC and it's the upper limit of **`highNBuffers`**.
Invariants:
- `MIN_SHARED_BUFFERS (= 16) <= lowNBuffers <= highNBuffers <= MaxNBuffers`.
- In steady state (no resize in progress), `lowNBuffers == highNBuffers`.
- `lowNBuffers < highNBuffers` only while a shrink is in flight, or momentarily between the two publishes at the end of an expand.
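As a rough illustration of the invariants above, the following standalone C sketch models the two water marks with C11 atomics and checks them. `DsbModel`, `dsb_invariants_hold` and `dsb_is_steady` are hypothetical names used only here; in the patch the marks are `pg_atomic_uint32` fields of `DynamicSharedBuffersControl`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MIN_SHARED_BUFFERS 16   /* same value as in the patch */

/* Hypothetical stand-in for the water marks; not the patch's struct. */
typedef struct
{
    _Atomic uint32_t lowNBuffers;
    _Atomic uint32_t highNBuffers;
    uint32_t    maxNBuffers;    /* models MaxNBuffers (max_shared_buffers) */
} DsbModel;

/* True if MIN_SHARED_BUFFERS <= low <= high <= max holds. */
static bool
dsb_invariants_hold(DsbModel *m)
{
    uint32_t    low = atomic_load(&m->lowNBuffers);
    uint32_t    high = atomic_load(&m->highNBuffers);

    return MIN_SHARED_BUFFERS <= low && low <= high && high <= m->maxNBuffers;
}

/* True if both marks coincide, i.e. no resize is in flight. */
static bool
dsb_is_steady(DsbModel *m)
{
    return atomic_load(&m->lowNBuffers) == atomic_load(&m->highNBuffers);
}
```

A check of this kind could plausibly run as an assertion after each resize step, but whether the patch does so is not stated here.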
##### Steady state at 4 CU
| Variable | Value |
| ---------------- | ---------------------------------------- |
| `MaxNBuffers` | ~5M (~41 GB, sized for 8 CU) |
| `lowNBuffers` | ~3M |
| `highNBuffers` | ~3M |
### When to use which
- Buffer allocation code (clock-sweep, freelist, ring-buffer sizing, etc.) should call `GetLowNBuffers()`, which returns `lowNBuffers`.
- Functions that visit the buffer array (e.g. `DropRelationBuffers`) must call `BEGIN_NBUFFERS_ACCESS(localNBuffers)` before the scan and `END_NBUFFERS_ACCESS(localNBuffers)` once the scan is done. `BEGIN_NBUFFERS_ACCESS` records `localNBuffers = highNBuffers` while holding `AccessNBuffersLock` in shared mode, which is what makes the snapshot safe to dereference for the duration of the access.
- `GetHighNBuffers()` is appropriate only for **non-critical sizing decisions** that can tolerate a stale value (e.g. picking the ring-buffer size for a new strategy). It must **never** be used to bound a loop that reads buffer descriptors or buffer blocks: by the time you index into the array, the resize coordinator may have unmapped the underlying pages. Use `BEGIN_NBUFFERS_ACCESS`/`END_NBUFFERS_ACCESS` for that.
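To make the scan rule concrete, here is a small standalone sketch of a buffer-array scan with an early return. The two macros and `nbuffers_lock_auto_release` are copied from the patch's `storage/ipc.h`; `AcquireNBuffersLock`/`ReleaseNBuffersLock` are replaced by counting stubs, and `find_buffer`, `lock_holders` and `model_high_nbuffers` are hypothetical names invented for this example. It relies on the GCC/Clang `cleanup` attribute, as the patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Stubs standing in for the real lock functions; they only count holders. */
static int  lock_holders = 0;
static int  model_high_nbuffers = 8;

static int
AcquireNBuffersLock(void)
{
    lock_holders++;
    return model_high_nbuffers;     /* snapshot taken "under the lock" */
}

static void
ReleaseNBuffersLock(void)
{
    assert(lock_holders > 0);
    lock_holders--;
}

/* Copied from the patch: release at scope exit unless already released. */
static inline void
nbuffers_lock_auto_release(const bool *released)
{
    if (!*released)
        ReleaseNBuffersLock();
}

#define BEGIN_NBUFFERS_ACCESS(name) \
    bool name##_released __attribute__((cleanup(nbuffers_lock_auto_release))) = false; \
    int name = AcquireNBuffersLock()

#define END_NBUFFERS_ACCESS(name) \
    do { \
        if (!name##_released) \
        { \
            ReleaseNBuffersLock(); \
            name##_released = true; \
        } \
    } while (0)

/* Hypothetical scan: index of the first entry equal to target, or -1. */
static int
find_buffer(const int *buffers, int target)
{
    BEGIN_NBUFFERS_ACCESS(localNBuffers);
    for (int i = 0; i < localNBuffers; i++)
    {
        if (buffers[i] == target)
            return i;           /* early return: cleanup drops the lock */
    }
    END_NBUFFERS_ACCESS(localNBuffers);
    return -1;
}
```

In both exit paths the holder count returns to zero: the early `return` is covered by the cleanup attribute, and the normal path by `END_NBUFFERS_ACCESS`, which sets the sentinel so the cleanup does not release twice.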
### Triggering a resize
A resize operation is a single function call:
```sql
SELECT pg_resize_shared_buffers('<new_size>');
```
### Shrink
The initial state of the memory region is:
```
lowNBuffers
0 new_size highNBuffers MaxNBuffers
|--------------------|---------------------|--------------------------|
ALLOCATED ALLOCATED RESERVED
```
Shrink performs these steps:
1. Reset the clock-sweep cursor, then publish `lowNBuffers := new_size`.
2. Barrier: wait for all backends to acknowledge the new `lowNBuffers`.
3. Purge any freelist entries above `lowNBuffers`.
4. Evict buffers in the `[lowNBuffers, highNBuffers)` range.
5. Acquire `AccessNBuffersLock` exclusively, set `highNBuffers = lowNBuffers`, release the lock.
6. Free physical memory in `[lowNBuffers, old_highNBuffers)`[^1].
After step 1, the memory region becomes:
```
0 lowNBuffers highNBuffers MaxNBuffers
|--------------------|---------------------|--------------------------|
ALLOCATED TO EVICT RESERVED
```
- `[0, lowNBuffers)`: Backends allocate buffers in this range.
- `[lowNBuffers, highNBuffers)`: The coordinator will evict buffers in this range.
- `[highNBuffers, MaxNBuffers)`: The reserved range; it must not be used.
When shrink completes, the memory region becomes:
```
lowNBuffers
0 highNBuffers MaxNBuffers
|--------------------|------------------------------------------------|
ALLOCATED RESERVED
```
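The shrink steps listed above can be sketched as a single-process toy model. This is only an illustration under stated assumptions: `PoolModel`, `model_shrink` and `MODEL_MAX_NBUFFERS` are hypothetical, the barrier, freelist purge and lock are represented by comments, and "freeing memory" is just a flag flip rather than a real `madvise()` call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_NBUFFERS 64       /* models MaxNBuffers */

/* Hypothetical model of the pool; not the patch's data structures. */
typedef struct
{
    _Atomic uint32_t lowNBuffers;
    _Atomic uint32_t highNBuffers;
    bool        in_use[MODEL_MAX_NBUFFERS];  /* buffer holds a valid page */
    bool        mapped[MODEL_MAX_NBUFFERS];  /* backing memory is present */
    uint32_t    clock_hand;
} PoolModel;

static bool
model_shrink(PoolModel *p, uint32_t new_size)
{
    uint32_t    old_high = atomic_load(&p->highNBuffers);

    if (new_size > atomic_load(&p->lowNBuffers))
        return false;               /* not a shrink */

    /* Step 1: reset the clock-sweep cursor, then publish the new low mark. */
    p->clock_hand = 0;
    atomic_store(&p->lowNBuffers, new_size);

    /* Step 2: the real code waits on a ProcSignal barrier here. */
    /* Step 3: the real code purges freelist entries above lowNBuffers. */

    /* Step 4: evict buffers in [lowNBuffers, highNBuffers). */
    for (uint32_t i = new_size; i < old_high; i++)
        p->in_use[i] = false;

    /* Step 5: done under AccessNBuffersLock (exclusive) in the real code. */
    atomic_store(&p->highNBuffers, new_size);

    /* Step 6: free physical memory in [lowNBuffers, old_highNBuffers). */
    for (uint32_t i = new_size; i < old_high; i++)
        p->mapped[i] = false;

    return true;
}
```

The ordering is the point of the sketch: memory is unmapped only after `highNBuffers` has been lowered, so no reader holding a snapshot taken under the lock can index into the freed range.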
[^1]: Huge pages significantly speed up freeing memory. It takes less than a second to free 32 GB of memory.
### Expand
The initial state of the memory region is:
```
lowNBuffers
0 highNBuffers new_size MaxNBuffers
|--------------------|---------------------|--------------------------|
ALLOCATED TO ALLOCATE RESERVED
```
Expand performs these steps:
1. Allocate physical memory in `[lowNBuffers, new_size)` and initialize the new buffer descriptors.
2. Acquire `AccessNBuffersLock` exclusively.
3. Reset the clock-sweep cursor to point at the start of the new range, so the next clock sweep tries the freshly added empty buffers.
4. Publish `highNBuffers := new_size` then `lowNBuffers := new_size`.
5. Release the lock. Backends taking the lock in shared mode now see the fully-grown pool; concurrent atomics readers may briefly see `lowNBuffers < highNBuffers` between the two writes above, which is harmless since both bounds already cover initialized memory.
When expand completes, the memory region becomes:
```
lowNBuffers
0 highNBuffers MaxNBuffers
|------------------------------------------|--------------------------|
ALLOCATED RESERVED
```
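The expand steps can be sketched in the same toy style. Again this is an assumption-laden illustration, not the patch's code: `PoolModel` and `model_expand` are hypothetical, allocation is a flag flip instead of `madvise(MADV_POPULATE_WRITE)`, and the exclusive lock is only indicated by comments.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_NBUFFERS 64       /* models MaxNBuffers */

/* Hypothetical model of the pool; not the patch's data structures. */
typedef struct
{
    _Atomic uint32_t lowNBuffers;
    _Atomic uint32_t highNBuffers;
    bool        in_use[MODEL_MAX_NBUFFERS];  /* buffer holds a valid page */
    bool        mapped[MODEL_MAX_NBUFFERS];  /* backing memory is present */
    uint32_t    clock_hand;
} PoolModel;

static bool
model_expand(PoolModel *p, uint32_t new_size)
{
    uint32_t    old_size = atomic_load(&p->highNBuffers);

    if (new_size <= old_size || new_size > MODEL_MAX_NBUFFERS)
        return false;               /* not an expand, or beyond the reserve */

    /* Step 1: allocate memory and initialize the new descriptors. */
    for (uint32_t i = old_size; i < new_size; i++)
    {
        p->mapped[i] = true;
        p->in_use[i] = false;
    }

    /* Step 2: the real code takes AccessNBuffersLock exclusively here. */

    /* Step 3: point the clock sweep at the start of the new range. */
    p->clock_hand = old_size;

    /* Step 4: publish the high mark first, then the low mark. */
    atomic_store(&p->highNBuffers, new_size);
    atomic_store(&p->lowNBuffers, new_size);

    /* Step 5: the real code releases the lock here. */
    return true;
}
```

Publishing `highNBuffers` before `lowNBuffers` keeps `lowNBuffers <= highNBuffers` true at every instant, and both values only ever cover memory that step 1 has already initialized.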
### Background-writer interaction
`BgBufferSync` keeps a local `static int saved_low_nbuffers` snapshot
and compares it against `GetLowNBuffers()` on every invocation. Whenever the
value differs, a resize has happened: the smoothed allocation rate /
clean-buffer density it was tracking are no longer meaningful, so it
invalidates `saved_info_valid` and starts fresh.
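The detection logic described above might look roughly like the following. Only `saved_low_nbuffers` and `saved_info_valid` are names from the text; `model_low_nbuffers`, `GetLowNBuffersModel` and `bg_sync_check_resize` are hypothetical stand-ins, and the real `BgBufferSync` does far more than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the value GetLowNBuffers() would return. */
static int  model_low_nbuffers = 1024;

static int
GetLowNBuffersModel(void)
{
    return model_low_nbuffers;
}

/*
 * Returns true when previously valid smoothed statistics were discarded
 * because the pool size changed since the last call.
 */
static bool
bg_sync_check_resize(void)
{
    static int  saved_low_nbuffers = -1;
    static bool saved_info_valid = false;
    bool        discarded = false;
    int         current = GetLowNBuffersModel();

    if (saved_low_nbuffers != current)
    {
        /* A resize happened: the tracked rates no longer mean anything. */
        discarded = saved_info_valid;
        saved_info_valid = false;
        saved_low_nbuffers = current;
    }

    /* ... the real function would recompute its statistics here ... */
    saved_info_valid = true;
    return discarded;
}
```

The first call only records the size and discards nothing; a later call reports a discard exactly once per size change.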
### Coordinating with backends that visit the buffer array
Special coordination must be done with backends that scan buffers based on
the upper bound (`highNBuffers`), e.g., `DropRelationBuffers`.
Otherwise, a backend visiting a buffer in `[lowNBuffers, highNBuffers)` can hit a SEGV when a shrink operation frees the memory in that range.
Coordination is done via `BEGIN_NBUFFERS_ACCESS(localNBuffers);` and
`END_NBUFFERS_ACCESS(localNBuffers)`. A backend calls `BEGIN_NBUFFERS_ACCESS()`
before visiting the buffer array and `END_NBUFFERS_ACCESS()` afterwards.
`BEGIN_NBUFFERS_ACCESS` acquires `AccessNBuffersLock` in shared mode for
the duration of visiting the buffer array. The resize coordinator acquires
`AccessNBuffersLock` in exclusive mode around the operations that mutate the
buffer pool memory and publish a new `highNBuffers` (free during shrink /
allocate during expand). This both waits for any in-flight backend to
complete and blocks new ones from starting with a stale view of
`highNBuffers`.
### Error handling
A failure during resize will NOT bring down postgres.
`pg_resize_shared_buffers()` is interruptible: SIGINT/CTRL-C and
`pg_terminate_backend()` are honoured, and any `ereport(ERROR)` raised from
inside a resize step (e.g. an OOM, a `madvise()` failure) propagates back to
the caller after cleanup runs.
We can roll back a shrink when it is interrupted between lowering `lowNBuffers`
and lowering `highNBuffers`. The memory in `[lowNBuffers, highNBuffers)` is
still mapped so the rollback is safe. Buffers that were already evicted in the
partial run come back as empty and are picked up by the clock sweep on the
next allocation attempt; buffers that the partial purge moved off the
freelist come back via the normal `StrategyFreeBuffer()` lifecycle.
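A minimal sketch of such a rollback is shown below. It assumes, without confirmation from the patch, that rolling back simply means restoring `lowNBuffers` to the still-unchanged `highNBuffers`; `WaterMarks` and `model_rollback_shrink` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair of water marks; not the patch's struct. */
typedef struct
{
    _Atomic uint32_t lowNBuffers;
    _Atomic uint32_t highNBuffers;
} WaterMarks;

/*
 * Undo an interrupted shrink. Returns true if a rollback was performed,
 * false if highNBuffers had already been lowered (nothing left to undo).
 */
static bool
model_rollback_shrink(WaterMarks *w)
{
    uint32_t    low = atomic_load(&w->lowNBuffers);
    uint32_t    high = atomic_load(&w->highNBuffers);

    if (low == high)
        return false;

    /* Assumption: [low, high) is still mapped, so it can be handed back. */
    atomic_store(&w->lowNBuffers, high);
    return true;
}
```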
An error at the `madvise(MADV_REMOVE)` step cannot be rolled back.
By the time we call madvise both water marks are already at `new_size`,
so the buffer pool is self-consistent at the smaller size.
The caller may try expand later.
`madvise(MADV_POPULATE_WRITE)` may fail during expand.
When this happens, `lowNBuffers` and `highNBuffers` remain at
`old_size` and backends keep using the old (smaller) pool. The
partially-touched bytes in the `[old_size, new_size)` range sit unused in
shmem until a future successful expand re-initializes them. The
coordinator surfaces this to the caller as a hard `ERROR`.
* Re: Better shared data structure management and resizable shared data structures
@ 2026-06-15 04:28 Ashutosh Bapat <[email protected]>
parent: Heikki Linnakangas <[email protected]>
1 sibling, 0 replies; 75+ messages in thread
From: Ashutosh Bapat @ 2026-06-15 04:28 UTC (permalink / raw)
To: Heikki Linnakangas <[email protected]>; +Cc: Dagfinn Ilmari Mannsåker <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; pgsql-hackers; [email protected]
On Fri, Jun 12, 2026 at 9:07 PM Heikki Linnakangas <[email protected]> wrote:
>
> On 21/04/2026 19:05, Ashutosh Bapat wrote:
> > On Tue, Apr 21, 2026 at 1:10 PM Heikki Linnakangas <[email protected]> wrote:
> >>
> >> On 07/04/2026 17:19, Ashutosh Bapat wrote:
> >>> Hi Heikki,
> >>> CallShmemCallbacksAfterStartup() holds ShmemIndexLock while invoking
> >>> init_fn/attach_fn callbacks. That looks wrong. Before this commit,
> >>> init or attach code was not run with the lock held. Any reason the
> >>> lock is held while calling init and attach callbacks. Since these
> >>> function can come from extensions, we don't have control on what goes
> >>> in those functions, and thus looks problematic. Further, it will
> >>> serialize all the attach_fn executions across backends, since each
> >>> will be run under the lock.
> >>
> >> This was intentional, I added a note in the docs about it:
> >>
> >> When <function>RegisterShmemCallbacks()</function> is called after
> >> startup, it will immediately call the appropriate callbacks,
> >> depending
> >> on whether the requested memory areas were already initialized by
> >> another backend. The callbacks will be called while holding an
> >> internal
> >> lock, which prevents concurrent two backends from initializing the
> >> memory area concurrently.
> >>
> >> That "internal lock" is ShmemIndexLock. I piggybacked on that since the
> >> code needs to acquire it anyway for the hash table lookups.
> >
> > I had read this part, but didn't realize it's ShmemIndexLock. The
> > document and the code are placed far apart and the comments in the
> > code do not help connecting these two. The comment before
> > LWLockAcquire() call doesn't say anything about init functions.
> > /* Hold ShmemIndexLock while we allocate all the shmem entries */
> >
> >> With the old ShmemInitStruct() interface, extensions needed to do the
> >> locking themselves, usually by holding AddinShmemInitLock.
> >>
> >> (Now that I read that again, the grammar on the last sentence sounds
> >> awkward...)
> >
> > Given that the init_fn is called in only one backend which requests
> > the structures first, do we need a lock?
>
> If two backends request the same structure concurrently, which one is
> "first"? That's what the lock determines.
>
> It's not safe to release the lock before the init callback has finished.
> Otherwise, another backend might attach to the struct before it's fully
> initialized and read uninitialized values.
>
> >>> In my case, the init_fn was performing ShmemIndex lookup which
> >>> deadlocked. It's questionable whether init function should lookup
> >>> ShmemIndex but, it's not something that needs to be prohibited
> >>> either.
> >> Yeah I'm curious what the use case is. We could easily introduce another
> >> lock or reuse AddinShmemInitLock for this.
> >
> > In case of resizable shared memory structures, I was adding mprotect
> > to make sure that the part of the shared address space which is
> > reserved but not used is protected from inadvertent access. The
> > mprotect is wrapped in a shmem API which fetches the ShmemIndex entry
> > of the shared structure, figures out the part of the address space to
> > protect using maximum_size and current size and calls mprotect
> > appropriately. To fetch the ShmemIndex entry it acquires a ShmemIndex
> > lock. The shmem API was supposed to be called from init_fn() and
> > attach_fn() to protect the address spaces as soon as the structure is
> > attached to. See patches attached to [1] for code.
> >
> > [1] https://www.postgresql.org/message-id/[email protected]...
>
> Ok. So if I understand correctly, holding ShmemIndexLock is not a actual
> problem per se, you just didn't expect it. Right?
>
> I propose the attached to improve the wording a little on the docs,
> comments, and error message.
The patch helps to set the expectations right.
ShmemIndexLock is for protecting entries in ShmemIndex; I didn't
expect it to protect the Shared structures as well. I thought a shared
structure specific lock, which usually every shared structure is
expected to have, would protect its initialization and content. But I
see that I was wrong. Even those locks need to be initialized; so they
can't be used here. ShmemIndexLock works here with the proposed
comment changes.
--
Best Wishes,
Ashutosh Bapat
Thread overview: 75+ messages
2026-02-23 14:14 Re: Better shared data structure management and resizable shared data structures Ashutosh Bapat <[email protected]>
2026-03-12 17:03 ` Robert Haas <[email protected]>
2026-03-12 18:56 ` Robert Haas <[email protected]>
2026-03-12 19:21 ` Heikki Linnakangas <[email protected]>
2026-03-12 20:05 ` Robert Haas <[email protected]>
2026-03-13 11:41 ` Heikki Linnakangas <[email protected]>
2026-03-16 09:37 ` Ashutosh Bapat <[email protected]>
2026-03-18 19:30 ` Robert Haas <[email protected]>
2026-03-19 10:31 ` Heikki Linnakangas <[email protected]>
2026-03-19 13:36 ` Robert Haas <[email protected]>
2026-03-19 14:34 ` Heikki Linnakangas <[email protected]>
2026-03-19 15:17 ` Robert Haas <[email protected]>
2026-03-24 15:32 ` Ashutosh Bapat <[email protected]>
2026-03-25 16:05 ` Ashutosh Bapat <[email protected]>
2026-03-27 07:01 ` Ashutosh Bapat <[email protected]>
2026-03-27 08:58 ` Ashutosh Bapat <[email protected]>
2026-03-27 23:17 ` Heikki Linnakangas <[email protected]>
2026-03-30 04:50 ` Ashutosh Bapat <[email protected]>
2026-03-30 20:15 ` Heikki Linnakangas <[email protected]>
2026-04-01 11:59 ` Ashutosh Bapat <[email protected]>
2026-04-01 18:17 ` Heikki Linnakangas <[email protected]>
2026-04-02 06:58 ` Ashutosh Bapat <[email protected]>
2026-04-04 00:49 ` Heikki Linnakangas <[email protected]>
2026-04-04 13:51 ` Matthias van de Meent <[email protected]>
2026-04-05 19:35 ` Heikki Linnakangas <[email protected]>
2026-04-04 16:32 ` Ashutosh Bapat <[email protected]>
2026-04-05 15:39 ` Heikki Linnakangas <[email protected]>
2026-04-07 12:24 ` Dagfinn Ilmari Mannsåker <[email protected]>
2026-04-07 13:26 ` Heikki Linnakangas <[email protected]>
2026-04-07 14:19 ` Ashutosh Bapat <[email protected]>
2026-04-21 07:40 ` Heikki Linnakangas <[email protected]>
2026-04-21 16:05 ` Ashutosh Bapat <[email protected]>
2026-06-12 15:37 ` Heikki Linnakangas <[email protected]>
2026-06-12 15:51 ` Haoyu Huang <[email protected]>
2026-06-15 04:28 ` Ashutosh Bapat <[email protected]>
2026-04-02 22:10 ` Matthias van de Meent <[email protected]>
2026-04-03 13:12 ` Ashutosh Bapat <[email protected]>
2026-04-04 00:45 ` Heikki Linnakangas <[email protected]>
2026-04-04 12:00 ` Matthias van de Meent <[email protected]>
2026-04-04 17:32 ` Heikki Linnakangas <[email protected]>
2026-04-04 23:17 ` Matthias van de Meent <[email protected]>
2026-04-05 15:07 ` Heikki Linnakangas <[email protected]>
2026-04-05 19:58 ` Matthias van de Meent <[email protected]>
2026-04-05 15:50 ` Heikki Linnakangas <[email protected]>
2026-04-05 20:06 ` Heikki Linnakangas <[email protected]>
2026-04-05 23:28 ` Heikki Linnakangas <[email protected]>
2026-04-06 13:53 ` Ashutosh Bapat <[email protected]>
2026-04-07 10:06 ` Ashutosh Bapat <[email protected]>
2026-04-07 14:46 ` Ashutosh Bapat <[email protected]>
2026-04-07 19:38 ` Matthias van de Meent <[email protected]>
2026-04-08 03:49 ` Ashutosh Bapat <[email protected]>
2026-04-07 19:48 ` Heikki Linnakangas <[email protected]>
2026-04-07 20:09 ` Andres Freund <[email protected]>
2026-04-08 05:20 ` Ashutosh Bapat <[email protected]>
2026-04-05 05:48 ` Ashutosh Bapat <[email protected]>
2026-04-05 05:58 ` Ashutosh Bapat <[email protected]>
2026-04-05 09:06 ` Matthias van de Meent <[email protected]>
2026-04-05 11:20 ` Ashutosh Bapat <[email protected]>
2026-04-05 14:16 ` Ashutosh Bapat <[email protected]>
2026-04-05 19:05 ` Matthias van de Meent <[email protected]>
2026-04-05 14:08 ` Ashutosh Bapat <[email protected]>
2026-04-05 16:23 ` Ashutosh Bapat <[email protected]>
2026-04-05 18:35 ` Heikki Linnakangas <[email protected]>
2026-04-05 18:58 ` Heikki Linnakangas <[email protected]>
2026-03-30 12:20 ` Ashutosh Bapat <[email protected]>
2026-03-26 10:10 ` Heikki Linnakangas <[email protected]>
2026-03-25 18:37 ` Robert Haas <[email protected]>
2026-03-27 00:51 ` Heikki Linnakangas <[email protected]>
2026-04-05 10:03 ` Heikki Linnakangas <[email protected]>
2026-03-26 18:31 ` Daniel Gustafsson <[email protected]>
2026-04-05 16:13 ` Heikki Linnakangas <[email protected]>
2026-03-13 21:09 ` Heikki Linnakangas <[email protected]>
2026-03-13 23:02 ` Zsolt Parragi <[email protected]>
2026-03-16 10:28 ` Ashutosh Bapat <[email protected]>
2026-03-17 11:58 ` Ashutosh Bapat <[email protected]>