public inbox for [email protected]  
help / color / mirror / Atom feed
From: Lukas Fittl <[email protected]>
To: Robert Haas <[email protected]>
Cc: Jakub Wartak <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: pg_plan_advice
Date: Tue, 31 Mar 2026 19:25:04 -0700
Message-ID: <CAP53PkxvjckT=1CokN7-PMcthr0OvWNMrAXTH7dStaXrKi6q2A@mail.gmail.com> (raw)
In-Reply-To: <CA+TgmobnwuyTgYXweQrvOz_kFuBuCbKhsLz-mUPasFw_DQ8f6w@mail.gmail.com>
References: <CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com>
	<CAKZiRmxtJAFG7e1+Vs9B8ngON=AOzJbuws+1ZeH4LsbJh5AzoQ@mail.gmail.com>
	<CA+TgmoY9Ne_Sh10u6LSPc3wvOQPLp3kF9nBp3nqJEG2JGF2QiA@mail.gmail.com>
	<CA+Tgmoa57S6mP=aTOXH2-gDAL4TMO1WbGgrHSg0s6J4zUH=04g@mail.gmail.com>
	<[email protected]>
	<CA+Tgmoaf__2B0BUL+vrg28P+3buX=Ti-kybqkHiLTtFrrCfzuA@mail.gmail.com>
	<CA+TgmoYpcLNOuypOTdgCSLW7FuA=t6BtB3meTARHX2-Dj_81xQ@mail.gmail.com>
	<[email protected]>
	<CA+TgmoZjv9OyFu1Gkt78w0vWEti8S33w8trYHmErf-GMmGSi=w@mail.gmail.com>
	<[email protected]>
	<CA+TgmoaOSBQD9Ux4eG40w723ZN=c0J7p-+oX4+J8urUeyLMo5w@mail.gmail.com>
	<CAOYmi+=g+MMoOpWkk2weXWKJcKH0eKey8gKHHdH0dF4Tiawrhw@mail.gmail.com>
	<CA+TgmobwaT=PXPDDrgDup+jA8KHBbkxigtziD-zNzAKKkQYVgQ@mail.gmail.com>
	<CAOYmi+mOmEW=amDRQMfw6-Fb3ZmDEQFaJzwk8Bc8W8DzaP85XQ@mail.gmail.com>
	<CA+TgmoaX2AMW4cdFM3OngBJxmxpkdmzF33R7-CWhvRLfucbFMg@mail.gmail.com>
	<CA+TgmoZ0x3ym_oueXRWzbM_=6ucKoPZVGj3rRMLBDC_FnetXDw@mail.gmail.com>
	<CAP53Pkycc=7N2bLzVT3x+qE1JamvRZWev5tFjdLJ1+-AV3Di+Q@mail.gmail.com>
	<CA+TgmoaKhuD91RnazbRyGkmP7--JdNq8oNDC3UcgTZSWbMxC7w@mail.gmail.com>
	<CAP53Pkw5-wMEeDJXFmqo_RTyL_spzCXb7HHKrbSnQqokVoZcNQ@mail.gmail.com>
	<CA+Tgmob-69bzbdi3U_QtebqAf6u1y8js=5=oNK639csVe1VbhA@mail.gmail.com>
	<CA+TgmoaZMOikxK=LqS+Jn+835h9S139JLGk-3LyETVXw5W5j=w@mail.gmail.com>
	<[email protected]>
	<CAP53PkwZ1ZTMARKg6iEfAw9qzBhkjBitj-9gr_Jvy7k2AwGgWA@mail.gmail.com>
	<CAMbWs4--NuEUFE_xTo991TRXaZryE29jarJPDnVxoaQOYdt7tA@mail.gmail.com>
	<CA+TgmobzR+XMGbRosVPbjHbSo4+cgJn=qZK6w05aF1sbj=C+9Q@mail.gmail.com>
	<CA+TgmoawzvCoZAwFS85tE5+c8vBkqgcS8ZstQ_ohjXQ9wGT9sw@mail.gmail.com>
	<CA+TgmoYS4ZCVAF2jTce=bMP0Oq_db_srocR4cZyO0OBp9oUoGg@mail.gmail.com>
	<CAK98qZ2RzbgCHrSg4zLkvpzyBam_X6te-KF8w1+_vON9BAVMEw@mail.gmail.com>
	<CA+TgmoaCdsuvNn6T6SfQ_0YD2Hh2+hgTXh9fTGHQhPg1zvy2rQ@mail.gmail.com>
	<CA+Tgmob7ozJAs5SU6bD2RfAt4w_AmsMGz-aaVA6WeLXHkBypOg@mail.gmail.com>
	<CAK98qZ1J42RoAsHnYWGPPmHziZmzmqE=Lp_O6WJ-9aKK2qjikA@mail.gmail.com>
	<CA+TgmoYjcBA6dw3nwiyfDzPXTCrxTZPXDMrc2TrDJcL1cPK6iA@mail.gmail.com>
	<CA+TgmoYru-vxoTKfwjQby30r2OkTXfb18Km_=VLs6qk8Akr0-g@mail.gmail.com>
	<CA+Tgmoau7yJtvbeH-0kPt1Q=Gt_ezRdgM35Q1=LT665U_86Etg@mail.gmail.com>
	<[email protected]>
	<CA+TgmobOLrMn5jEinWNPL5SrDH1DPpo3a4j+S6-4yhsZwWgzLg@mail.gmail.com>
	<CA+TgmoZUN8FT1Ah=m6Uis5bHa4FUa+_hMDWtcABG17toEfpiUg@mail.gmail.com>
	<CA+TgmoYh2-kM+tscOz=jVYq9Tf4SRPVqzPojs3KLZcW6E9m1BQ@mail.gmail.com>
	<CA+TgmoaK=4w7-qknUo3QhUJ53pXZq=c=KgZmRyD+k7ytqfmgSg@mail.gmail.com>
	<CAP53Pkz3DSFaaowYvbO5LULf3NhydD_UhHkighfWf6_pwxiqUw@mail.gmail.com>
	<CA+TgmoZ45n5jaNKKgbbj4-kYV8WsPvUn=Z8HnoZ7tUb_p9WKXg@mail.gmail.com>
	<CA+TgmoYuWmN-00Ec5pY7zAcpSFQUQLbgAdVWGR9kOR-HM-fHrA@mail.gmail.com>
	<CAKZiRmwFKhVz-HWvQmgPY7nZES9mCqdXHD++m9b7F4digcJpRQ@mail.gmail.com>
	<CAP53Pkxak127z_Xp7xW5G0+X4FELCBzNyoWCqTQ-xGfLog3KyQ@mail.gmail.com>
	<CA+Tgmoandcm1p=BCwLZPGRG_cno3pKSj4Ejt8AQZ4Hu-yhP1Mw@mail.gmail.com>
	<CAP53PkxJp=9PMzD8mKB3=4fK3Qp1iFZnA4pQSavP0nkpiNyoaA@mail.gmail.com>
	<CA+TgmobnwuyTgYXweQrvOz_kFuBuCbKhsLz-mUPasFw_DQ8f6w@mail.gmail.com>

On Mon, Mar 30, 2026 at 7:53 AM Robert Haas <[email protected]> wrote:
>
> Hi,
>
> Thanks for taking the time to respond. My reading of your comments is
> that we are in overall agreement on the design, with the possible
> exception of persisting data cross restarts. I will write more about
> that topic below; but I think if that's the only design disagreement
> we have, it makes sense to go forward with committing the patch that I
> have, and we can continue to discuss whether we want to add something
> related to persistence. The only reason not to do that would be if
> there were a consensus that the lack of a persistence framework was
> such a critical defect that we shouldn't ship this at all without
> that, but I don't agree with that idea and I think it would be a
> pretty strong position for someone to take.

Yeah, I think my position is that having a solution to persistence
would be very good, but if that's not doable for this release, I think
we have a potential way forward in future releases, at least when it
comes to being restart-safe.

That said, I still think it'll make a big difference in practice to be
restart safe right away, if we can make it happen.

>
> On Sun, Mar 29, 2026 at 2:59 PM Lukas Fittl <[email protected]> wrote:
> > I think a simple disk file is the way to go, similar to how
> > autoprewarm works with its "autoprewarm.blocks" file. Its a bit
> > awkward that that just sits in the main data directory, but since
> > pg_prewarm already does it today, I think its okay to have another
> > contrib module do the same. As noted I'm mainly worried about restarts
> > that the user didn't control, causing advice that was set to be lost.
> >
> > I've attached a patch of how that could look like on top of your v23,
> > that copies the modified stash information to a
> > "pg_stash_advice.entries" file, and loads it after restarts.
>
> I'll be honest: I don't like this design much at all, but I do see the
> practical advantages of it, and we have done similar things elsewhere,
> in particular in autoprewarm. Before I get to the specifics of your
> patch, let me complain about some things that I don't like at the
> design level. We lose a lot by directing data through a bespoke
> mechanism rather than handling it as table data. There are no
> checksums, so we have less protection against corruption. There is no
> write-ahead logging, so data does not make it to standbys, which is
> more of a potential issue for pg_stash_advice than it is for
> autoprewarm. All the code to read and write the file is specific to
> this contrib module, so it can have its own bugs separate from every
> similar module's bugs. The data can't easily be examined and
> manipulated from SQL as table data can. It's just a messy one-off that
> solves a practical problem but is otherwise not very nice. Of course,
> sometimes such messy one-offs are the right answer.

I think if we wanted a table, we should make it a table - but I think
the fact that we want this to be low-overhead for running queries to
examine whether there is anything for them to apply, would require
some kind of cache in front of it, and that gets complicated pretty
quickly.

For the file-based direction, just for reference, I'm attaching an
updated version of that (on top of Robert's earlier v23), that
utilizes a background worker to write out the dump file as needed, at
most every 60 seconds. It also reworks some of the output logic to do
better memory management, and uses a TSV file format that can be
easily read again.

To be clear, I think its okay to go ahead with merging pg_stash_advice
without that and make it a best effort to get the file saving in too -
but I think with the current design in this patch represents a
reasonable solution to what we can do in terms of persistence across
restarts in either 19 or 20.

>
> > Because the number of entries here is controlled by the user (i.e. its
> > not a function of the workload, but a function of how much advice you
> > as a user have set), I'm much less worried about memory usage, as long
> > as we document it clearly how to measure the amount of memory used.
>
> The module doesn't have a built-in way to do that right now. Are you
> thinking we would document that pg_get_dsm_registry_allocations() can
> be used?

Yeah, for example. Alternatively we could provide a function/view that
lists all advice across all stashes, so you can more easily see the
result size of that and estimate what the in-memory use is. But
pointing to pg_get_dsm_registry_allocations seems easier.

> > In practice for a good amount of our user base these days the question
> > will be "Does my cloud provider give me access to create stash
> > entries", so its maybe worth thinking about if we could also allow
> > pg_maintain to manage entries by default?
>
> Wouldn't it make more sense for the cloud provider to grant execute
> permissions on these functions to pg_maintain if they are so inclined?
> This is a brand-new facility, so I think we had better be conservative
> in terms of default permissions.

I guess. I'm always worried that providers get that wrong and forget
to give end users the permissions - but I suppose end users can
complain to their providers if that's the case.

I've done another look over pg_set_stashed_advice and I think its in
good shape. The only trailing thought I have is that we could consider
running a fuzzer against the pg_set_advice function in particular,
just to see if anything pops up (beyond having the ability to make a
very large memory allocation through a large advice string, which is
maybe a problem?).

Thanks,
Lukas

-- 
Lukas Fittl


Attachments:

  [application/octet-stream] vnocfbot-2-0001-Make-pg_stash_advice-dump-advice-to-disk-.patch (18.0K, 2-vnocfbot-2-0001-Make-pg_stash_advice-dump-advice-to-disk-.patch)
  download | inline diff:
From b739ca004ccdede260c541ba8c8dc1aa4e3ea15d Mon Sep 17 00:00:00 2001
From: Lukas Fittl <[email protected]>
Date: Sun, 29 Mar 2026 11:47:27 -0700
Subject: [PATCH vnocfbot-2] Make pg_stash_advice dump advice to disk when
 changed, load after restart

---
 contrib/pg_stash_advice/pg_stash_advice.c | 533 +++++++++++++++++++++-
 doc/src/sgml/pgstashadvice.sgml           |  31 +-
 2 files changed, 559 insertions(+), 5 deletions(-)

diff --git a/contrib/pg_stash_advice/pg_stash_advice.c b/contrib/pg_stash_advice/pg_stash_advice.c
index 22122236694..61fb12a1888 100644
--- a/contrib/pg_stash_advice/pg_stash_advice.c
+++ b/contrib/pg_stash_advice/pg_stash_advice.c
@@ -18,22 +18,37 @@
  */
 #include "postgres.h"
 
+#include <unistd.h>
+
 #include "common/hashfn.h"
 #include "common/string.h"
 #include "fmgr.h"
 #include "funcapi.h"
+#include "miscadmin.h"
 #include "lib/dshash.h"
 #include "nodes/queryjumble.h"
 #include "pg_plan_advice.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/interrupt.h"
 #include "storage/dsm_registry.h"
+#include "storage/fd.h"
+#include "storage/ipc.h"
+#include "storage/latch.h"
 #include "storage/lwlock.h"
+#include "storage/proc.h"
+#include "storage/procsignal.h"
+#include "utils/backend_status.h"
 #include "utils/builtins.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/tuplestore.h"
 
+#define PGSA_DUMP_FILE		"pg_stash_advice.tsv"
+
 PG_MODULE_MAGIC;
 
+PGDLLEXPORT void pgsa_stash_advice_main(Datum main_arg);
+
 PG_FUNCTION_INFO_V1(pg_create_advice_stash);
 PG_FUNCTION_INFO_V1(pg_drop_advice_stash);
 PG_FUNCTION_INFO_V1(pg_get_advice_stash_contents);
@@ -50,6 +65,8 @@ typedef struct pgsa_shared_state
 	dsa_handle	area;
 	dshash_table_handle stash_hash;
 	dshash_table_handle entry_hash;
+	ProcNumber	bgworker_proc;
+	bool		dump_requested;
 } pgsa_shared_state;
 
 typedef struct pgsa_stash
@@ -131,8 +148,9 @@ static dshash_parameters pgsa_entry_dshash_parameters = {
 	LWTRANCHE_INVALID			/* gets set at runtime */
 };
 
-/* GUC variable */
+/* GUC variables */
 static char *pg_stash_advice_stash_name = "";
+static bool pg_stash_advice_save = true;
 
 /* Other global variables */
 static MemoryContext pg_stash_advice_mcxt;
@@ -154,6 +172,13 @@ static void pgsa_init_shared_state(void *ptr, void *arg);
 static uint64 pgsa_lookup_stash_id(char *stash_name);
 static void pgsa_set_advice_string(char *stash_name, int64 queryId,
 								   char *advice_string);
+static void pgsa_dump_to_file(void);
+static void pgsa_load_from_file(void);
+static void pgsa_request_dump(void);
+static void pgsa_detach_shmem(int code, Datum arg);
+static void pgsa_start_bgworker(void);
+static char *pgsa_escape_string(char *str);
+static char *pgsa_read_next_field(char **pp);
 
 /*
  * Initialize this module.
@@ -178,6 +203,28 @@ _PG_init(void)
 							   NULL,
 							   NULL);
 
+	if (process_shared_preload_libraries_in_progress)
+	{
+		/* can't define PGC_POSTMASTER variable after startup */
+		DefineCustomBoolVariable("pg_stash_advice.save",
+								 "Save and restore advice stash contents across restarts.",
+								 NULL,
+								 &pg_stash_advice_save,
+								 true,
+								 PGC_POSTMASTER,
+								 0,
+								 NULL,
+								 NULL,
+								 NULL);
+
+		/*
+		 * Register background worker for dumping entries to recover on
+		 * restart, if enabled.
+		 */
+		if (pg_stash_advice_save)
+			pgsa_start_bgworker();
+	}
+
 	MarkGUCPrefixReserved("pg_stash_advice");
 
 	/* Tell pg_plan_advice that we want to provide advice strings. */
@@ -201,6 +248,7 @@ pg_create_advice_stash(PG_FUNCTION_ARGS)
 	LWLockAcquire(&pgsa_state->lock, LW_EXCLUSIVE);
 	pgsa_create_stash(stash_name);
 	LWLockRelease(&pgsa_state->lock);
+	pgsa_request_dump();
 	PG_RETURN_VOID();
 }
 
@@ -218,6 +266,7 @@ pg_drop_advice_stash(PG_FUNCTION_ARGS)
 	LWLockAcquire(&pgsa_state->lock, LW_EXCLUSIVE);
 	pgsa_drop_stash(stash_name);
 	LWLockRelease(&pgsa_state->lock);
+	pgsa_request_dump();
 	PG_RETURN_VOID();
 }
 
@@ -441,6 +490,7 @@ pg_set_stashed_advice(PG_FUNCTION_ARGS)
 		pgsa_set_advice_string(stash_name, queryId, advice_string);
 	}
 
+	pgsa_request_dump();
 	PG_RETURN_VOID();
 }
 
@@ -796,6 +846,8 @@ pgsa_init_shared_state(void *ptr, void *arg)
 	state->area = DSA_HANDLE_INVALID;
 	state->stash_hash = DSHASH_HANDLE_INVALID;
 	state->entry_hash = DSHASH_HANDLE_INVALID;
+	state->bgworker_proc = INVALID_PROC_NUMBER;
+	state->dump_requested = false;
 }
 
 /*
@@ -898,3 +950,482 @@ pgsa_set_advice_string(char *stash_name, int64 queryId, char *advice_string)
 		dsa_free(pgsa_dsa_area, old_dp);
 	LWLockRelease(&pgsa_state->lock);
 }
+
+/*
+ * Background worker entry point.
+ *
+ * This worker loads the dump file on startup, then waits for dump requests
+ * from backends.  On shutdown, it performs a final dump.
+ */
+void
+pgsa_stash_advice_main(Datum main_arg)
+{
+	/* Establish signal handlers; once that's done, unblock signals. */
+	pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
+	pqsignal(SIGHUP, SignalHandlerForConfigReload);
+	pqsignal(SIGUSR1, procsignal_sigusr1_handler);
+	BackgroundWorkerUnblockSignals();
+
+	/* Set up a session user so pgstat_bestart_final() can report it. */
+	InitializeSessionUserIdStandalone();
+
+	/* Report this worker in pg_stat_activity. */
+	pgstat_beinit();
+	pgstat_bestart_initial();
+	pgstat_bestart_final();
+
+	/* Attach to shared memory structures. */
+	pgsa_attach();
+
+	/*
+	 * Set on-detach hook so that our PID will be cleared on exit.
+	 *
+	 * NB: pg_stash_advice's state is stored in a DSM segment, and DSM
+	 * segments are detached before calling the on_shmem_exit callbacks, so we
+	 * must put pgsa_detach_shmem in the before_shmem_exit callback list.
+	 */
+	before_shmem_exit(pgsa_detach_shmem, 0);
+
+	/*
+	 * Store our PID in shared memory, unless there's already another worker
+	 * running.
+	 */
+	LWLockAcquire(&pgsa_state->lock, LW_EXCLUSIVE);
+	if (pgsa_state->bgworker_proc != INVALID_PROC_NUMBER)
+	{
+		LWLockRelease(&pgsa_state->lock);
+		ereport(LOG,
+				(errmsg("pg_stash_advice worker is already running under PID %d",
+						(int) GetPGProcByNumber(pgsa_state->bgworker_proc)->pid)));
+		return;
+	}
+	pgsa_state->bgworker_proc = MyProcNumber;
+	LWLockRelease(&pgsa_state->lock);
+
+	/* Load previously saved stash data from disk. */
+	pgsa_load_from_file();
+
+	/* Dump when requested, until shutdown. */
+	while (!ShutdownRequestPending)
+	{
+		bool		dump_requested = false;
+
+		/* In case of a SIGHUP, just reload the configuration. */
+		if (ConfigReloadPending)
+		{
+			ConfigReloadPending = false;
+			ProcessConfigFile(PGC_SIGHUP);
+		}
+
+		/* Check whether a dump has been requested. */
+		LWLockAcquire(&pgsa_state->lock, LW_EXCLUSIVE);
+		if (pgsa_state->dump_requested)
+		{
+			pgsa_state->dump_requested = false;
+			dump_requested = true;
+		}
+		LWLockRelease(&pgsa_state->lock);
+
+		if (dump_requested)
+			pgsa_dump_to_file();
+
+		/*
+		 * Sleep for up to 60 seconds before checking again.  This ensures we
+		 * coalesce multiple rapid changes into a single dump.
+		 */
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 60000L,
+						 PG_WAIT_EXTENSION);
+
+		ResetLatch(MyLatch);
+	}
+
+	/* Perform a final dump before exiting. */
+	pgsa_dump_to_file();
+}
+
+/*
+ * Signal the background worker to dump stash data to disk.
+ */
+static void
+pgsa_request_dump(void)
+{
+	LWLockAcquire(&pgsa_state->lock, LW_EXCLUSIVE);
+	pgsa_state->dump_requested = true;
+	LWLockRelease(&pgsa_state->lock);
+}
+
+/*
+ * Clear our PID from shared memory on exit.
+ */
+static void
+pgsa_detach_shmem(int code, Datum arg)
+{
+	LWLockAcquire(&pgsa_state->lock, LW_EXCLUSIVE);
+	if (pgsa_state->bgworker_proc == MyProcNumber)
+		pgsa_state->bgworker_proc = INVALID_PROC_NUMBER;
+	LWLockRelease(&pgsa_state->lock);
+}
+
+/*
+ * Register the background worker.
+ */
+static void
+pgsa_start_bgworker(void)
+{
+	BackgroundWorker worker = {0};
+
+	worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
+	worker.bgw_start_time = BgWorkerStart_ConsistentState;
+	strcpy(worker.bgw_library_name, "pg_stash_advice");
+	strcpy(worker.bgw_function_name, "pgsa_stash_advice_main");
+	strcpy(worker.bgw_name, "pg_stash_advice worker");
+	strcpy(worker.bgw_type, "pg_stash_advice worker");
+
+	RegisterBackgroundWorker(&worker);
+}
+
+/*
+ * Dump all advice stash data to a file.
+ *
+ * The file format is a simple TSV with a line-type prefix:
+ *   stash\tstash_name
+ *   entry\tstash_name\tquery_id\tadvice_string
+ *
+ * Stash names and advice strings are backslash-escaped where needed.
+ */
+static void
+pgsa_dump_to_file(void)
+{
+	FILE	   *file;
+	char		transient_dump_file_path[MAXPGPATH];
+	dshash_seq_status iter;
+	pgsa_stash *stash;
+	pgsa_entry *entry;
+	pgsa_stash_name_table_hash *nhash;
+	int			ret = 0;
+	MemoryContext tmpcxt;
+	MemoryContext oldcxt;
+
+	Assert(pgsa_entry_dshash != NULL);
+
+	/* Use a temporary context so all allocations are freed at the end. */
+	tmpcxt = AllocSetContextCreate(CurrentMemoryContext,
+								   "pg_stash_advice dump",
+								   ALLOCSET_DEFAULT_SIZES);
+	oldcxt = MemoryContextSwitchTo(tmpcxt);
+
+	/* Open a temporary file for writing. */
+	snprintf(transient_dump_file_path, MAXPGPATH, "%s.tmp", PGSA_DUMP_FILE);
+	file = AllocateFile(transient_dump_file_path, "w");
+	if (!file)
+		ereport(ERROR,
+				(errcode_for_file_access(),
+				 errmsg("could not open file \"%s\": %m",
+						transient_dump_file_path)));
+
+	/* Build an ID->name lookup table for writing entry lines. */
+	nhash = pgsa_stash_name_table_create(tmpcxt, 64, NULL);
+
+	/* Write stash lines. */
+	dshash_seq_init(&iter, pgsa_stash_dshash, true);
+	while ((stash = dshash_seq_next(&iter)) != NULL)
+	{
+		pgsa_stash_name *n;
+		bool		found;
+
+		n = pgsa_stash_name_table_insert(nhash, stash->pgsa_stash_id, &found);
+		Assert(!found);
+		n->name = pstrdup(stash->name);
+		ret = fprintf(file, "stash\t%s\n", pgsa_escape_string(n->name));
+		if (ret < 0)
+			break;
+	}
+	dshash_seq_term(&iter);
+
+	/* Write entry lines. */
+	if (ret >= 0)
+	{
+		dshash_seq_init(&iter, pgsa_entry_dshash, true);
+		while ((entry = dshash_seq_next(&iter)) != NULL)
+		{
+			pgsa_stash_name *n;
+			char	   *advice_string;
+
+			if (entry->advice_string == InvalidDsaPointer)
+				continue;
+
+			n = pgsa_stash_name_table_lookup(nhash, entry->key.pgsa_stash_id);
+			if (n == NULL)
+				continue;		/* orphan entry, skip */
+
+			advice_string = dsa_get_address(pgsa_dsa_area,
+											entry->advice_string);
+			ret = fprintf(file, "entry\t%s\t%" PRId64 "\t%s\n",
+						  pgsa_escape_string(n->name),
+						  entry->key.queryId,
+						  pgsa_escape_string(advice_string));
+			if (ret < 0)
+				break;
+		}
+		dshash_seq_term(&iter);
+	}
+
+	/* Handle any write error. */
+	if (ret < 0)
+	{
+		int			save_errno = errno;
+
+		FreeFile(file);
+		unlink(transient_dump_file_path);
+		errno = save_errno;
+		ereport(ERROR,
+				(errcode_for_file_access(),
+				 errmsg("could not write to file \"%s\": %m",
+						transient_dump_file_path)));
+	}
+
+	/* Close the file and rename it into place atomically. */
+	ret = FreeFile(file);
+	if (ret != 0)
+	{
+		int			save_errno = errno;
+
+		unlink(transient_dump_file_path);
+		errno = save_errno;
+		ereport(ERROR,
+				(errcode_for_file_access(),
+				 errmsg("could not close file \"%s\": %m",
+						transient_dump_file_path)));
+	}
+
+	(void) durable_rename(transient_dump_file_path, PGSA_DUMP_FILE, ERROR);
+
+	MemoryContextSwitchTo(oldcxt);
+	MemoryContextDelete(tmpcxt);
+}
+
+/*
+ * Load advice stash data from the dump file.
+ *
+ * This is called once when the shared memory state is first initialized
+ * (i.e. after a server restart or crash recovery), to restore the previously
+ * saved stash contents.
+ *
+ * Errors during loading are reported as warnings so that a corrupt dump file
+ * does not prevent the server from starting.
+ */
+static void
+pgsa_load_from_file(void)
+{
+	FILE	   *file;
+	int			num_stashes = 0;
+	int			num_entries = 0;
+	int			num_malformed = 0;
+	char	   *line;
+
+	file = AllocateFile(PGSA_DUMP_FILE, "r");
+	if (!file)
+	{
+		if (errno == ENOENT)
+			return;				/* no dump file, nothing to load */
+		ereport(WARNING,
+				(errcode_for_file_access(),
+				 errmsg("could not open file \"%s\": %m", PGSA_DUMP_FILE)));
+		return;
+	}
+
+	/* Read lines until EOF. */
+	while ((line = pg_get_line(file, NULL)) != NULL)
+	{
+		char	   *p = line;
+		char	   *line_type;
+
+		/* Strip the trailing newline. */
+		pg_strip_crlf(line);
+
+		/* Split off the line type prefix (unescaped, plain keyword). */
+		line_type = pgsa_read_next_field(&p);
+		if (line_type == NULL)
+		{
+			num_malformed++;
+			pfree(line);
+			continue;
+		}
+
+		if (strcmp(line_type, "stash") == 0)
+		{
+			char	   *name = pgsa_read_next_field(&p);
+
+			if (name != NULL)
+			{
+				/*
+				 * Skip duplicates rather than ERRORing like
+				 * pgsa_create_stash.
+				 */
+				if (pgsa_lookup_stash_id(name) == 0)
+				{
+					LWLockAcquire(&pgsa_state->lock, LW_EXCLUSIVE);
+					pgsa_create_stash(name);
+					LWLockRelease(&pgsa_state->lock);
+				}
+				num_stashes++;
+				pfree(name);
+			}
+			else
+				num_malformed++;
+		}
+		else if (strcmp(line_type, "entry") == 0)
+		{
+			char	   *stash_name;
+			char	   *queryid_str;
+			char	   *advice_string;
+			int64		queryId;
+			ErrorSaveContext escontext = {T_ErrorSaveContext};
+
+			stash_name = pgsa_read_next_field(&p);
+			queryid_str = pgsa_read_next_field(&p);
+			advice_string = pgsa_read_next_field(&p);
+
+			if (stash_name == NULL || queryid_str == NULL ||
+				advice_string == NULL)
+			{
+				num_malformed++;
+				if (stash_name)
+					pfree(stash_name);
+				if (queryid_str)
+					pfree(queryid_str);
+				if (advice_string)
+					pfree(advice_string);
+				pfree(line_type);
+				pfree(line);
+				continue;
+			}
+
+			queryId = pg_strtoint64_safe(queryid_str, (Node *) &escontext);
+			if (!SOFT_ERROR_OCCURRED(&escontext))
+			{
+				pgsa_set_advice_string(stash_name, queryId, advice_string);
+				num_entries++;
+			}
+			else
+				num_malformed++;
+
+			pfree(stash_name);
+			pfree(queryid_str);
+			pfree(advice_string);
+		}
+		else
+		{
+			num_malformed++;
+		}
+		pfree(line_type);
+		pfree(line);
+	}
+
+	FreeFile(file);
+
+	if (num_malformed > 0)
+		ereport(WARNING,
+				errmsg("skipped %d malformed advice lines on load",
+					   num_malformed));
+
+	ereport(LOG,
+			errmsg("loaded %d advice stashes with %d entries",
+				   num_stashes, num_entries));
+}
+
+/*
+ * Backslash-escape the string so it can be written to a tab-separated file.
+ *
+ * The escaped characters are backslash, tab, and newline.
+ */
+static char *
+pgsa_escape_string(char *str)
+{
+	StringInfoData buf;
+
+	if (!strpbrk(str, "\\\t\n"))
+		return str;
+
+	initStringInfo(&buf);
+	for (const char *p = str; *p; p++)
+	{
+		switch (*p)
+		{
+			case '\\':
+				appendStringInfoString(&buf, "\\\\");
+				break;
+			case '\t':
+				appendStringInfoString(&buf, "\\t");
+				break;
+			case '\n':
+				appendStringInfoString(&buf, "\\n");
+				break;
+			case '\r':
+				appendStringInfoString(&buf, "\\r");
+				break;
+			default:
+				appendStringInfoChar(&buf, *p);
+				break;
+		}
+	}
+
+	return buf.data;
+}
+
+/*
+ * Read the next tab-delimited field from *pp, unescaping backslash sequences
+ * as we go.  Advances *pp past the tab delimiter (or to end of string).
+ *
+ * Returns a palloc'd string with the unescaped field value, or NULL if there
+ * are no more fields (i.e. *pp already points to '\0').
+ */
+static char *
+pgsa_read_next_field(char **pp)
+{
+	StringInfoData buf;
+	const char *p = *pp;
+
+	if (*p == '\0')
+		return NULL;
+
+	initStringInfo(&buf);
+	while (*p != '\0' && *p != '\t')
+	{
+		if (*p == '\\' && p[1] != '\0')
+		{
+			p++;
+			switch (*p)
+			{
+				case '\\':
+					appendStringInfoChar(&buf, '\\');
+					break;
+				case 't':
+					appendStringInfoChar(&buf, '\t');
+					break;
+				case 'n':
+					appendStringInfoChar(&buf, '\n');
+					break;
+				case 'r':
+					appendStringInfoChar(&buf, '\r');
+					break;
+				default:
+					/* Unrecognized escape; keep as-is. */
+					appendStringInfoChar(&buf, *p);
+					break;
+			}
+		}
+		else
+			appendStringInfoChar(&buf, *p);
+		p++;
+	}
+
+	/* Skip the tab delimiter if present. */
+	if (*p == '\t')
+		p++;
+
+	*pp = (char *) p;
+	return buf.data;
+}
diff --git a/doc/src/sgml/pgstashadvice.sgml b/doc/src/sgml/pgstashadvice.sgml
index 089fc66446f..937d31e557b 100644
--- a/doc/src/sgml/pgstashadvice.sgml
+++ b/doc/src/sgml/pgstashadvice.sgml
@@ -15,10 +15,12 @@
   <link linkend="guc-compute-query-id">query identifiers</link> to plan advice
   strings. Whenever a session is asked to plan a query whose query ID appears
   in the relevant advice stash, the plan advice string is automatically applied
-  to guide planning. Note that advice stashes exist purely in memory. This
-  means both that it is important to be mindful of memory consumption when
-  deciding how much plan advice to stash, and also that advice stashes must
-  be recreated and repopulated whenever the server is restarted.
+  to guide planning. Advice stashes are held in memory, so it is important
+  to be mindful of memory consumption when deciding how much plan advice to
+  stash. The contents are automatically saved to a file called
+  <filename>pg_stash_advice.tsv</filename> whenever they are modified,
+  and restored when the first session attaches after a server restart
+  (including after a crash).
  </para>
 
  <para>
@@ -203,6 +205,27 @@
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <varname>pg_stash_advice.save</varname> (<type>boolean</type>)
+     <indexterm>
+      <primary><varname>pg_stash_advice.save</varname> configuration parameter</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      Specifies whether to save advice stash contents to disk so that they
+      can be restored after a server restart (including after a crash).
+      When enabled, a background worker checks every 60 seconds for changes
+      and writes stash contents to a file called
+      <filename>pg_stash_advice.tsv</filename> in the data directory.
+      The default value is <literal>on</literal>.  This parameter can only
+      be set at server start.
+     </para>
+    </listitem>
+   </varlistentry>
+
   </variablelist>
 
  </sect2>
-- 
2.47.1



view thread (184+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: pg_plan_advice
  In-Reply-To: <CAP53PkxvjckT=1CokN7-PMcthr0OvWNMrAXTH7dStaXrKi6q2A@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox