public inbox for [email protected]  
help / color / mirror / Atom feed
From: Andrei Lepikhov <[email protected]>
To: Michael Paquier <[email protected]>
To: Danil Anisimow <[email protected]>
Cc: Jeff Davis <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Dean Rasheed <[email protected]>
Cc: Yurii Rashkovskii <[email protected]>
Subject: Re: Comments on Custom RMGRs
Date: Sat, 15 Nov 2025 11:44:15 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<CABm2Ma5Hi9PU_wWP0f=YFu9aDDdeSdX2+JFDL_oDH-QHgexsgg@mail.gmail.com>
	<[email protected]>
	<CABm2Ma6avU5hkk7dd+G2oRNJZJtLOT=Yzeq6fb0v2rf7_YYp7g@mail.gmail.com>
	<[email protected]>
	<CABm2Ma5Ww9M8DgFTCcASnMiPPVYU8cd_wGhHZ5YnoK5im6wT2g@mail.gmail.com>
	<[email protected]>
	<CA+TgmoZxvVXx809WhOcLNbUgAYYiLY+Y-o_gDpay6uTLwaJVyA@mail.gmail.com>
	<[email protected]>
	<CA+TgmoZS++BBdyWxFv6k-UhzX=MGsqd8mu_80Z646-T=jYgW2g@mail.gmail.com>
	<[email protected]>
	<[email protected]>

On 14/10/2025 11:11, Andrei Lepikhov wrote:
> For me, the ideal place for such a hook is CheckPointGuts, right between 
> the CheckPointBuffers call and fsyncs. I think that to demonstrate how 
> this hook can work, the pg_stat_statements storage may need to be 
> redesigned slightly.
There are two patches: 0001, which is the checkpoint hook itself, and 
0002, which includes an example and a trivial test.

During development, I attempted to apply it in my different modules and 
realised that the hook is preferred over an RMGR callback - I don't 
actually want to be forced to register RMGR in each project and have it 
loadable on an instance startup. In lightweight modules, I want to keep 
my knowledge base relatively close to the current state of the instance.
Nevertheless, the plan freezing extension (for example) needs to ensure 
that the user's query plan is 'frozen' after the function call. 
Therefore, we need to store the decision made in the WAL, which requires 
dumping the state into a file before performing the WAL cut. 
Additionally, I'd like to experiment with synchronising an extension 
state between master and replica through WAL records, as most 
optimisation recommendations are relevant to both instances.

Patch 0001 contains a hook that is called once after all checkpoint 
preparations have finished. I recall that people mentioned it might be 
helpful for AMs as well - feel free to propose changes to this patch.

Patch 2 adds an example to the test_dsm_registry module, as it is 
precisely the way I write the code: named DSM segment -> shared HTAB -> 
file dump. So, it looks natural and opens a room to extend this example 
by employing RMGR and xact callbacks to keep the extension state as 
close to the committed changes as possible.

The test looks pretty trivial so far - feel free to propose ideas on how 
to extend it.

-- 
regards, Andrei Lepikhov,
pgEdge
From a0e8d75223fa95dbec1e422eacaef336e45c2008 Mon Sep 17 00:00:00 2001
From: "Andrei V. Lepikhov" <[email protected]>
Date: Thu, 13 Nov 2025 15:00:43 +0100
Subject: [PATCH 1/2] Add a hook for Checkpoint processing.

There are many situations in which a Postgres plugin may need to maintain its
internal state across restarts or crashes. Sometimes it wants to synchronise
its state on logical replicas or be saved in a backup employing custom RMGR and
WAL records.

For statistical extensions, such as pg_stat_statements, it is okay to save
their state on postmaster shutdown. However, business extensions may want
to maintain more actual state, periodically dumping it to a disk file or using
WAL and xact callbacks to be as close as possible to the current
database state.

Checkpoint is a key moment where the DBMS performs disk synchronisation and
cuts the WAL. It is a good time to do the same thing for a plugin, too.
Moreover, the plugin is sure that nothing important will be lost with
the WAL cut.

Discussion: https://www.postgresql.org/message-id/CANbhV-E4pTWeF-DsdaGsOrjJNFWPaR%2BDstjrnkqvf9JFFgOKKQ%40mail.g...
---
 src/backend/access/transam/xlog.c | 15 +++++++++++++++
 src/include/access/xlog.h         |  4 ++++
 src/tools/pgindent/typedefs.list  |  1 +
 3 files changed, 20 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 22d0a2e8c3a..c7c0b226724 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -157,6 +157,13 @@ int			wal_segment_size = DEFAULT_XLOG_SEG_SIZE;
  */
 int			CheckPointSegments;
 
+/*
+ * Hook for plugins to take control during checkpoint processing. All
+ * preparation procedures have already been done, and only the sync needs
+ * to be done.
+ */
+Checkpoint_hook_type Checkpoint_hook = NULL;
+
 /* Estimated distance between checkpoints, in bytes */
 static double CheckPointDistanceEstimate = 0;
 static double PrevCheckPointDistance = 0;
@@ -7594,6 +7601,14 @@ CheckPointGuts(XLogRecPtr checkPointRedo, int flags)
 	CheckPointPredicate();
 	CheckPointBuffers(flags);
 
+	/*
+	 * Allow a plugin that depends on a custom RMGR to retain its state through
+	 * reboots or crashes by following specific steps, ensuring that essential
+	 * WAL records are not truncated.
+	 */
+	if (Checkpoint_hook)
+		Checkpoint_hook(checkPointRedo, flags);
+
 	/* Perform all queued up fsyncs */
 	TRACE_POSTGRESQL_BUFFER_CHECKPOINT_SYNC_START();
 	CheckpointStats.ckpt_sync_t = GetCurrentTimestamp();
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 605280ed8fb..5c071974557 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -198,6 +198,10 @@ typedef enum WALAvailability
 struct XLogRecData;
 struct XLogReaderState;
 
+/* Hook for plugins to get control at the end of a CheckPoint */
+typedef void (*Checkpoint_hook_type)(XLogRecPtr checkPointRedo, int flags);
+extern PGDLLIMPORT Checkpoint_hook_type Checkpoint_hook;
+
 extern XLogRecPtr XLogInsertRecord(struct XLogRecData *rdata,
 								   XLogRecPtr fpw_lsn,
 								   uint8 flags,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 23bce72ae64..6ca05499081 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -413,6 +413,7 @@ CatalogIndexState
 ChangeVarNodes_callback
 ChangeVarNodes_context
 CheckPoint
+Checkpoint_hook_type
 CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
-- 
2.51.2


From f92abbcc3667103628608d248870867200087e16 Mon Sep 17 00:00:00 2001
From: "Andrei V. Lepikhov" <[email protected]>
Date: Fri, 14 Nov 2025 16:35:21 +0100
Subject: [PATCH 2/2] Testing module

---
 src/test/modules/test_dsm_registry/Makefile   |   1 +
 .../test_dsm_registry/t/001_file_storage.pl   |  31 ++++
 .../test_dsm_registry/test_dsm_registry.c     | 163 ++++++++++++++++++
 3 files changed, 195 insertions(+)
 create mode 100644 src/test/modules/test_dsm_registry/t/001_file_storage.pl

diff --git a/src/test/modules/test_dsm_registry/Makefile b/src/test/modules/test_dsm_registry/Makefile
index b13e99a354f..9aae8b98aba 100644
--- a/src/test/modules/test_dsm_registry/Makefile
+++ b/src/test/modules/test_dsm_registry/Makefile
@@ -10,6 +10,7 @@ EXTENSION = test_dsm_registry
 DATA = test_dsm_registry--1.0.sql
 
 REGRESS = test_dsm_registry
+TAP_TESTS = 1
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/src/test/modules/test_dsm_registry/t/001_file_storage.pl b/src/test/modules/test_dsm_registry/t/001_file_storage.pl
new file mode 100644
index 00000000000..0e82d0adcf7
--- /dev/null
+++ b/src/test/modules/test_dsm_registry/t/001_file_storage.pl
@@ -0,0 +1,31 @@
+# Copyright (c) 2023-2025, PostgreSQL Global Development Group
+use strict;
+use warnings FATAL => 'all';
+use Config;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+my $node = PostgreSQL::Test::Cluster->new('node');
+
+$node->init();
+$node->append_conf('postgresql.conf',
+							"shared_preload_libraries = 'test_dsm_registry'");
+$node->start();
+
+$node->safe_psql('postgres', "CREATE EXTENSION test_dsm_registry");
+
+my $result;
+
+$node->safe_psql('postgres', "SELECT set_val_in_hash('test-1', '1414')");
+$node->safe_psql('postgres', 'CHECKPOINT');
+$node->safe_psql('postgres', "SELECT set_val_in_hash('test-2', '1415')");
+$node->stop('immediate');
+$node->start();
+
+$result = $node->safe_psql('postgres', "SELECT get_val_in_hash('test-1')");
+is($result, '1414', "Value inserted before the checkpoint was restored");
+$result = $node->safe_psql('postgres', "SELECT get_val_in_hash('test-2')");
+is($result, '', "Value inserted after the checkpoint was lost");
+
+done_testing();
diff --git a/src/test/modules/test_dsm_registry/test_dsm_registry.c b/src/test/modules/test_dsm_registry/test_dsm_registry.c
index 4cc2ccdac3f..2d7fd35a74d 100644
--- a/src/test/modules/test_dsm_registry/test_dsm_registry.c
+++ b/src/test/modules/test_dsm_registry/test_dsm_registry.c
@@ -12,13 +12,22 @@
  */
 #include "postgres.h"
 
+#include "access/xlog.h"
 #include "fmgr.h"
+#include "pgstat.h"
 #include "storage/dsm_registry.h"
+#include "storage/fd.h"
 #include "storage/lwlock.h"
 #include "utils/builtins.h"
+#include "utils/hsearch.h"
 
 PG_MODULE_MAGIC;
 
+/* Location of permanent storage file (valid on checkpoint) */
+#define TDR_DUMP_FILE	PGSTAT_STAT_PERMANENT_DIRECTORY "/pg_stat_statements.stat"
+/* Magic number identifying the stats file format */
+static const uint32 TDR_FILE_HEADER = 0x20251114;
+
 typedef struct TestDSMRegistryStruct
 {
 	int			val;
@@ -43,6 +52,11 @@ static const dshash_parameters dsh_params = {
 	dshash_strcpy
 };
 
+static Checkpoint_hook_type	prev_Checkpoint_hook = NULL;
+
+static void load_htab(void);
+static void pgss_Checkpoint(XLogRecPtr checkPointRedo, int flags);
+
 static void
 init_tdr_dsm(void *ptr)
 {
@@ -66,7 +80,14 @@ tdr_attach_shmem(void)
 		tdr_dsa = GetNamedDSA("test_dsm_registry_dsa", &found);
 
 	if (tdr_hash == NULL)
+	{
+		LWLockAcquire(&tdr_dsm->lck, LW_EXCLUSIVE);
 		tdr_hash = GetNamedDSHash("test_dsm_registry_hash", &dsh_params, &found);
+		if (!found)
+			load_htab();
+
+		LWLockRelease(&tdr_dsm->lck);
+	}
 }
 
 PG_FUNCTION_INFO_V1(set_val_in_shmem);
@@ -144,3 +165,145 @@ get_val_in_hash(PG_FUNCTION_ARGS)
 
 	PG_RETURN_TEXT_P(val);
 }
+
+/*
+ * Load any pre-existing entries from file.
+ */
+static void
+load_htab(void)
+{
+	bool	found;
+	FILE   *file = NULL;
+	uint32	header;
+	char   *val = palloc(1);
+
+	Assert(tdr_dsa != NULL && tdr_hash != NULL);
+
+	/*
+	 * Attempt to load old entries from the dump file.
+	 */
+	file = AllocateFile(TDR_DUMP_FILE, PG_BINARY_R);
+	if (file == NULL)
+	{
+		if (errno != ENOENT)
+			goto read_error;
+		/* No existing persisted file, so we're done */
+		return;
+	}
+
+	if (fread(&header, sizeof(uint32), 1, file) != 1 ||
+		header != TDR_FILE_HEADER)
+		goto read_error;
+
+	while (!feof(file))
+	{
+		TestDSMRegistryHashEntry *entry;
+		char	key[64];
+		int		keylen = offsetof(TestDSMRegistryHashEntry, val);
+		int32	vlen;
+
+		if (fread(key, keylen, 1, file) != 1 ||
+			fread(&vlen, sizeof(int32), 1, file) != 1)
+			goto read_error;
+
+		val = repalloc(val, vlen);
+		if (fread(val, vlen, 1, file) != 1)
+			goto read_error;
+
+		Assert(val[vlen - 1] == '\0');
+
+		entry = (TestDSMRegistryHashEntry *)
+								dshash_find_or_insert(tdr_hash, key, &found);
+		Assert(!found);
+
+		entry->val = dsa_allocate(tdr_dsa, strlen(val) + 1);
+		strcpy(dsa_get_address(tdr_dsa, entry->val), val);
+
+		dshash_release_lock(tdr_hash, entry);
+	}
+
+	FreeFile(file);
+	return;
+
+read_error:
+	ereport(LOG,
+			(errcode_for_file_access(),
+			 errmsg("could not read from file \"%s\": %m", TDR_DUMP_FILE)));
+	if (file)
+		FreeFile(file);
+	/* If possible, throw away the bogus file; ignore any error */
+	unlink(TDR_DUMP_FILE);
+}
+
+/*
+ * Dump hash table into file.
+ *
+ */
+static void
+pgss_Checkpoint(XLogRecPtr checkPointRedo, int flags)
+{
+	FILE					   *file;
+	dshash_seq_status			hstat;
+	TestDSMRegistryHashEntry   *entry;
+
+	if (flags & CHECKPOINT_END_OF_RECOVERY)
+		return;
+
+	tdr_attach_shmem();
+
+	file = AllocateFile(TDR_DUMP_FILE ".tmp", PG_BINARY_W);
+	if (file == NULL)
+		goto error;
+	if (fwrite(&TDR_FILE_HEADER, sizeof(uint32), 1, file) != 1)
+		goto error;
+
+	dshash_seq_init(&hstat, tdr_hash, false);
+	while ((entry = dshash_seq_next(&hstat)) != NULL)
+	{
+		int		keylen = offsetof(TestDSMRegistryHashEntry, val);
+		char   *val;
+		int32	vlen;
+
+		val = (char *) dsa_get_address(tdr_dsa, entry->val);
+		vlen = strlen(val) + 1;
+		if (fwrite(entry->key, keylen, 1, file) != 1 ||
+			fwrite(&vlen, sizeof(int32), 1, file) != 1 ||
+			fwrite(val, vlen, 1, file) != 1)
+		{
+			dshash_seq_term(&hstat);
+			goto error;
+		}
+	}
+	dshash_seq_term(&hstat);
+
+	if (FreeFile(file))
+	{
+		file = NULL;
+		goto error;
+	}
+
+	/*
+	 * Rename file into place, so we atomically replace any old one.
+	 */
+	(void) durable_rename(TDR_DUMP_FILE ".tmp", TDR_DUMP_FILE, LOG);
+	return;
+
+error:
+	ereport(LOG,
+			(errcode_for_file_access(),
+			 errmsg("could not write file \"%s\": %m",
+					TDR_DUMP_FILE ".tmp")));
+	if (file)
+		FreeFile(file);
+	unlink(TDR_DUMP_FILE ".tmp");
+}
+
+/*
+ * Entry point for this module.
+ */
+void
+_PG_init(void)
+{
+	prev_Checkpoint_hook = Checkpoint_hook;
+	Checkpoint_hook = pgss_Checkpoint;
+}
-- 
2.51.2



Attachments:

  [text/plain] 0001-Add-a-hook-for-Checkpoint-processing.patch (3.4K, 2-0001-Add-a-hook-for-Checkpoint-processing.patch)
  download | inline diff:
From a0e8d75223fa95dbec1e422eacaef336e45c2008 Mon Sep 17 00:00:00 2001
From: "Andrei V. Lepikhov" <[email protected]>
Date: Thu, 13 Nov 2025 15:00:43 +0100
Subject: [PATCH 1/2] Add a hook for Checkpoint processing.

There are many situations in which a Postgres plugin may need to maintain its
internal state across restarts or crashes. Sometimes it wants to synchronise
its state on logical replicas or be saved in a backup employing custom RMGR and
WAL records.

For statistical extensions, such as pg_stat_statements, it is okay to save
their state on postmaster shutdown. However, business extensions may want
to maintain more actual state, periodically dumping it to a disk file or using
WAL and xact callbacks to be as close as possible to the current
database state.

Checkpoint is a key moment where the DBMS performs disk synchronisation and
cuts the WAL. It is a good time to do the same thing for a plugin, too.
Moreover, the plugin is sure that nothing important will be lost with
the WAL cut.

Discussion: https://www.postgresql.org/message-id/CANbhV-E4pTWeF-DsdaGsOrjJNFWPaR%2BDstjrnkqvf9JFFgOKKQ%40mail.gmail.com
---
 src/backend/access/transam/xlog.c | 15 +++++++++++++++
 src/include/access/xlog.h         |  4 ++++
 src/tools/pgindent/typedefs.list  |  1 +
 3 files changed, 20 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 22d0a2e8c3a..c7c0b226724 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -157,6 +157,13 @@ int			wal_segment_size = DEFAULT_XLOG_SEG_SIZE;
  */
 int			CheckPointSegments;
 
+/*
+ * Hook for plugins to take control during checkpoint processing. All
+ * preparation procedures have already been done, and only the sync needs
+ * to be done.
+ */
+Checkpoint_hook_type Checkpoint_hook = NULL;
+
 /* Estimated distance between checkpoints, in bytes */
 static double CheckPointDistanceEstimate = 0;
 static double PrevCheckPointDistance = 0;
@@ -7594,6 +7601,14 @@ CheckPointGuts(XLogRecPtr checkPointRedo, int flags)
 	CheckPointPredicate();
 	CheckPointBuffers(flags);
 
+	/*
+	 * Allow a plugin that depends on a custom RMGR to retain its state through
+	 * reboots or crashes by following specific steps, ensuring that essential
+	 * WAL records are not truncated.
+	 */
+	if (Checkpoint_hook)
+		Checkpoint_hook(checkPointRedo, flags);
+
 	/* Perform all queued up fsyncs */
 	TRACE_POSTGRESQL_BUFFER_CHECKPOINT_SYNC_START();
 	CheckpointStats.ckpt_sync_t = GetCurrentTimestamp();
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 605280ed8fb..5c071974557 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -198,6 +198,10 @@ typedef enum WALAvailability
 struct XLogRecData;
 struct XLogReaderState;
 
+/* Hook for plugins to get control at the end of a CheckPoint */
+typedef void (*Checkpoint_hook_type)(XLogRecPtr checkPointRedo, int flags);
+extern PGDLLIMPORT Checkpoint_hook_type Checkpoint_hook;
+
 extern XLogRecPtr XLogInsertRecord(struct XLogRecData *rdata,
 								   XLogRecPtr fpw_lsn,
 								   uint8 flags,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 23bce72ae64..6ca05499081 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -413,6 +413,7 @@ CatalogIndexState
 ChangeVarNodes_callback
 ChangeVarNodes_context
 CheckPoint
+Checkpoint_hook_type
 CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
-- 
2.51.2



  [text/plain] 0002-Testing-module.patch (6.7K, 3-0002-Testing-module.patch)
  download | inline diff:
From f92abbcc3667103628608d248870867200087e16 Mon Sep 17 00:00:00 2001
From: "Andrei V. Lepikhov" <[email protected]>
Date: Fri, 14 Nov 2025 16:35:21 +0100
Subject: [PATCH 2/2] Testing module

---
 src/test/modules/test_dsm_registry/Makefile   |   1 +
 .../test_dsm_registry/t/001_file_storage.pl   |  31 ++++
 .../test_dsm_registry/test_dsm_registry.c     | 163 ++++++++++++++++++
 3 files changed, 195 insertions(+)
 create mode 100644 src/test/modules/test_dsm_registry/t/001_file_storage.pl

diff --git a/src/test/modules/test_dsm_registry/Makefile b/src/test/modules/test_dsm_registry/Makefile
index b13e99a354f..9aae8b98aba 100644
--- a/src/test/modules/test_dsm_registry/Makefile
+++ b/src/test/modules/test_dsm_registry/Makefile
@@ -10,6 +10,7 @@ EXTENSION = test_dsm_registry
 DATA = test_dsm_registry--1.0.sql
 
 REGRESS = test_dsm_registry
+TAP_TESTS = 1
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/src/test/modules/test_dsm_registry/t/001_file_storage.pl b/src/test/modules/test_dsm_registry/t/001_file_storage.pl
new file mode 100644
index 00000000000..0e82d0adcf7
--- /dev/null
+++ b/src/test/modules/test_dsm_registry/t/001_file_storage.pl
@@ -0,0 +1,31 @@
+# Copyright (c) 2023-2025, PostgreSQL Global Development Group
+use strict;
+use warnings FATAL => 'all';
+use Config;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+my $node = PostgreSQL::Test::Cluster->new('node');
+
+$node->init();
+$node->append_conf('postgresql.conf',
+							"shared_preload_libraries = 'test_dsm_registry'");
+$node->start();
+
+$node->safe_psql('postgres', "CREATE EXTENSION test_dsm_registry");
+
+my $result;
+
+$node->safe_psql('postgres', "SELECT set_val_in_hash('test-1', '1414')");
+$node->safe_psql('postgres', 'CHECKPOINT');
+$node->safe_psql('postgres', "SELECT set_val_in_hash('test-2', '1415')");
+$node->stop('immediate');
+$node->start();
+
+$result = $node->safe_psql('postgres', "SELECT get_val_in_hash('test-1')");
+is($result, '1414', "Value inserted before the checkpoint was restored");
+$result = $node->safe_psql('postgres', "SELECT get_val_in_hash('test-2')");
+is($result, '', "Value inserted after the checkpoint was lost");
+
+done_testing();
diff --git a/src/test/modules/test_dsm_registry/test_dsm_registry.c b/src/test/modules/test_dsm_registry/test_dsm_registry.c
index 4cc2ccdac3f..2d7fd35a74d 100644
--- a/src/test/modules/test_dsm_registry/test_dsm_registry.c
+++ b/src/test/modules/test_dsm_registry/test_dsm_registry.c
@@ -12,13 +12,22 @@
  */
 #include "postgres.h"
 
+#include "access/xlog.h"
 #include "fmgr.h"
+#include "pgstat.h"
 #include "storage/dsm_registry.h"
+#include "storage/fd.h"
 #include "storage/lwlock.h"
 #include "utils/builtins.h"
+#include "utils/hsearch.h"
 
 PG_MODULE_MAGIC;
 
+/* Location of permanent storage file (valid on checkpoint) */
+#define TDR_DUMP_FILE	PGSTAT_STAT_PERMANENT_DIRECTORY "/pg_stat_statements.stat"
+/* Magic number identifying the stats file format */
+static const uint32 TDR_FILE_HEADER = 0x20251114;
+
 typedef struct TestDSMRegistryStruct
 {
 	int			val;
@@ -43,6 +52,11 @@ static const dshash_parameters dsh_params = {
 	dshash_strcpy
 };
 
+static Checkpoint_hook_type	prev_Checkpoint_hook = NULL;
+
+static void load_htab(void);
+static void pgss_Checkpoint(XLogRecPtr checkPointRedo, int flags);
+
 static void
 init_tdr_dsm(void *ptr)
 {
@@ -66,7 +80,14 @@ tdr_attach_shmem(void)
 		tdr_dsa = GetNamedDSA("test_dsm_registry_dsa", &found);
 
 	if (tdr_hash == NULL)
+	{
+		LWLockAcquire(&tdr_dsm->lck, LW_EXCLUSIVE);
 		tdr_hash = GetNamedDSHash("test_dsm_registry_hash", &dsh_params, &found);
+		if (!found)
+			load_htab();
+
+		LWLockRelease(&tdr_dsm->lck);
+	}
 }
 
 PG_FUNCTION_INFO_V1(set_val_in_shmem);
@@ -144,3 +165,145 @@ get_val_in_hash(PG_FUNCTION_ARGS)
 
 	PG_RETURN_TEXT_P(val);
 }
+
+/*
+ * Load any pre-existing entries from file.
+ */
+static void
+load_htab(void)
+{
+	bool	found;
+	FILE   *file = NULL;
+	uint32	header;
+	char   *val = palloc(1);
+
+	Assert(tdr_dsa != NULL && tdr_hash != NULL);
+
+	/*
+	 * Attempt to load old entries from the dump file.
+	 */
+	file = AllocateFile(TDR_DUMP_FILE, PG_BINARY_R);
+	if (file == NULL)
+	{
+		if (errno != ENOENT)
+			goto read_error;
+		/* No existing persisted file, so we're done */
+		return;
+	}
+
+	if (fread(&header, sizeof(uint32), 1, file) != 1 ||
+		header != TDR_FILE_HEADER)
+		goto read_error;
+
+	while (!feof(file))
+	{
+		TestDSMRegistryHashEntry *entry;
+		char	key[64];
+		int		keylen = offsetof(TestDSMRegistryHashEntry, val);
+		int32	vlen;
+
+		if (fread(key, keylen, 1, file) != 1 ||
+			fread(&vlen, sizeof(int32), 1, file) != 1)
+			goto read_error;
+
+		val = repalloc(val, vlen);
+		if (fread(val, vlen, 1, file) != 1)
+			goto read_error;
+
+		Assert(val[vlen - 1] == '\0');
+
+		entry = (TestDSMRegistryHashEntry *)
+								dshash_find_or_insert(tdr_hash, key, &found);
+		Assert(!found);
+
+		entry->val = dsa_allocate(tdr_dsa, strlen(val) + 1);
+		strcpy(dsa_get_address(tdr_dsa, entry->val), val);
+
+		dshash_release_lock(tdr_hash, entry);
+	}
+
+	FreeFile(file);
+	return;
+
+read_error:
+	ereport(LOG,
+			(errcode_for_file_access(),
+			 errmsg("could not read from file \"%s\": %m", TDR_DUMP_FILE)));
+	if (file)
+		FreeFile(file);
+	/* If possible, throw away the bogus file; ignore any error */
+	unlink(TDR_DUMP_FILE);
+}
+
+/*
+ * Dump hash table into file.
+ *
+ */
+static void
+pgss_Checkpoint(XLogRecPtr checkPointRedo, int flags)
+{
+	FILE					   *file;
+	dshash_seq_status			hstat;
+	TestDSMRegistryHashEntry   *entry;
+
+	if (flags & CHECKPOINT_END_OF_RECOVERY)
+		return;
+
+	tdr_attach_shmem();
+
+	file = AllocateFile(TDR_DUMP_FILE ".tmp", PG_BINARY_W);
+	if (file == NULL)
+		goto error;
+	if (fwrite(&TDR_FILE_HEADER, sizeof(uint32), 1, file) != 1)
+		goto error;
+
+	dshash_seq_init(&hstat, tdr_hash, false);
+	while ((entry = dshash_seq_next(&hstat)) != NULL)
+	{
+		int		keylen = offsetof(TestDSMRegistryHashEntry, val);
+		char   *val;
+		int32	vlen;
+
+		val = (char *) dsa_get_address(tdr_dsa, entry->val);
+		vlen = strlen(val) + 1;
+		if (fwrite(entry->key, keylen, 1, file) != 1 ||
+			fwrite(&vlen, sizeof(int32), 1, file) != 1 ||
+			fwrite(val, vlen, 1, file) != 1)
+		{
+			dshash_seq_term(&hstat);
+			goto error;
+		}
+	}
+	dshash_seq_term(&hstat);
+
+	if (FreeFile(file))
+	{
+		file = NULL;
+		goto error;
+	}
+
+	/*
+	 * Rename file into place, so we atomically replace any old one.
+	 */
+	(void) durable_rename(TDR_DUMP_FILE ".tmp", TDR_DUMP_FILE, LOG);
+	return;
+
+error:
+	ereport(LOG,
+			(errcode_for_file_access(),
+			 errmsg("could not write file \"%s\": %m",
+					TDR_DUMP_FILE ".tmp")));
+	if (file)
+		FreeFile(file);
+	unlink(TDR_DUMP_FILE ".tmp");
+}
+
+/*
+ * Entry point for this module.
+ */
+void
+_PG_init(void)
+{
+	prev_Checkpoint_hook = Checkpoint_hook;
+	Checkpoint_hook = pgss_Checkpoint;
+}
-- 
2.51.2



view thread (23+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Comments on Custom RMGRs
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox