public inbox for [email protected]
help / color / mirror / Atom feedFrom: Robert Haas <[email protected]>
To: Tom Lane <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: pg_plan_advice
Date: Tue, 24 Mar 2026 17:09:51 -0400
Message-ID: <CA+TgmoaK=4w7-qknUo3QhUJ53pXZq=c=KgZmRyD+k7ytqfmgSg@mail.gmail.com> (raw)
In-Reply-To: <CA+TgmoYh2-kM+tscOz=jVYq9Tf4SRPVqzPojs3KLZcW6E9m1BQ@mail.gmail.com>
References: <CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com>
<CAKZiRmxtJAFG7e1+Vs9B8ngON=AOzJbuws+1ZeH4LsbJh5AzoQ@mail.gmail.com>
<CA+TgmoY9Ne_Sh10u6LSPc3wvOQPLp3kF9nBp3nqJEG2JGF2QiA@mail.gmail.com>
<CA+Tgmoa57S6mP=aTOXH2-gDAL4TMO1WbGgrHSg0s6J4zUH=04g@mail.gmail.com>
<[email protected]>
<CA+Tgmoaf__2B0BUL+vrg28P+3buX=Ti-kybqkHiLTtFrrCfzuA@mail.gmail.com>
<CA+TgmoYpcLNOuypOTdgCSLW7FuA=t6BtB3meTARHX2-Dj_81xQ@mail.gmail.com>
<[email protected]>
<CA+TgmoZjv9OyFu1Gkt78w0vWEti8S33w8trYHmErf-GMmGSi=w@mail.gmail.com>
<[email protected]>
<CA+TgmoaOSBQD9Ux4eG40w723ZN=c0J7p-+oX4+J8urUeyLMo5w@mail.gmail.com>
<CAOYmi+=g+MMoOpWkk2weXWKJcKH0eKey8gKHHdH0dF4Tiawrhw@mail.gmail.com>
<CA+TgmobwaT=PXPDDrgDup+jA8KHBbkxigtziD-zNzAKKkQYVgQ@mail.gmail.com>
<CAOYmi+mOmEW=amDRQMfw6-Fb3ZmDEQFaJzwk8Bc8W8DzaP85XQ@mail.gmail.com>
<CA+TgmoaX2AMW4cdFM3OngBJxmxpkdmzF33R7-CWhvRLfucbFMg@mail.gmail.com>
<CA+TgmoZ0x3ym_oueXRWzbM_=6ucKoPZVGj3rRMLBDC_FnetXDw@mail.gmail.com>
<CAP53Pkycc=7N2bLzVT3x+qE1JamvRZWev5tFjdLJ1+-AV3Di+Q@mail.gmail.com>
<CA+TgmoaKhuD91RnazbRyGkmP7--JdNq8oNDC3UcgTZSWbMxC7w@mail.gmail.com>
<CAP53Pkw5-wMEeDJXFmqo_RTyL_spzCXb7HHKrbSnQqokVoZcNQ@mail.gmail.com>
<CA+Tgmob-69bzbdi3U_QtebqAf6u1y8js=5=oNK639csVe1VbhA@mail.gmail.com>
<CA+TgmoaZMOikxK=LqS+Jn+835h9S139JLGk-3LyETVXw5W5j=w@mail.gmail.com>
<[email protected]>
<CAP53PkwZ1ZTMARKg6iEfAw9qzBhkjBitj-9gr_Jvy7k2AwGgWA@mail.gmail.com>
<CAMbWs4--NuEUFE_xTo991TRXaZryE29jarJPDnVxoaQOYdt7tA@mail.gmail.com>
<CA+TgmobzR+XMGbRosVPbjHbSo4+cgJn=qZK6w05aF1sbj=C+9Q@mail.gmail.com>
<CA+TgmoawzvCoZAwFS85tE5+c8vBkqgcS8ZstQ_ohjXQ9wGT9sw@mail.gmail.com>
<CA+TgmoYS4ZCVAF2jTce=bMP0Oq_db_srocR4cZyO0OBp9oUoGg@mail.gmail.com>
<CAK98qZ2RzbgCHrSg4zLkvpzyBam_X6te-KF8w1+_vON9BAVMEw@mail.gmail.com>
<CA+TgmoaCdsuvNn6T6SfQ_0YD2Hh2+hgTXh9fTGHQhPg1zvy2rQ@mail.gmail.com>
<CA+Tgmob7ozJAs5SU6bD2RfAt4w_AmsMGz-aaVA6WeLXHkBypOg@mail.gmail.com>
<CAK98qZ1J42RoAsHnYWGPPmHziZmzmqE=Lp_O6WJ-9aKK2qjikA@mail.gmail.com>
<CA+TgmoYjcBA6dw3nwiyfDzPXTCrxTZPXDMrc2TrDJcL1cPK6iA@mail.gmail.com>
<CA+TgmoYru-vxoTKfwjQby30r2OkTXfb18Km_=VLs6qk8Akr0-g@mail.gmail.com>
<CA+Tgmoau7yJtvbeH-0kPt1Q=Gt_ezRdgM35Q1=LT665U_86Etg@mail.gmail.com>
<[email protected]>
<CA+TgmobOLrMn5jEinWNPL5SrDH1DPpo3a4j+S6-4yhsZwWgzLg@mail.gmail.com>
<CA+TgmoZUN8FT1Ah=m6Uis5bHa4FUa+_hMDWtcABG17toEfpiUg@mail.gmail.com>
<CA+TgmoYh2-kM+tscOz=jVYq9Tf4SRPVqzPojs3KLZcW6E9m1BQ@mail.gmail.com>
On Sat, Mar 21, 2026 at 9:13 AM Robert Haas <[email protected]> wrote:
> So I'm left with the idea that to get test_plan_advice to be fully
> stable on these slower machines, it will probably be necessary to make
> it control which AlternativeSubPlan is chosen and whether a
> MinMaxAggPath is chosen or not. I have some ideas about how to
> accomplish that in a reasonably elegant fashion without adding too
> much new machinery, but I need to spend some more time validating
> those ideas before committing to a precise course of action. More
> soon.
Here is v22. There are four new patches.
0001 adds a disabled_nodes fields to SubPlan, to fix the bug that I
identified in the email to which this is a reply.
0002-0004 are an attempt to fix the remaining buildfarm failures not
already addressed (or attempted to be addressed, anyway) by other
commits. The basic idea, implemented by 0004, is to add a
DO_NOT_SCAN() advice tag. This advice is generated when we consider a
MinMaxAggPath or a hashed SubPlan. In either case, all relations which
are part of the non-selected alternative are marked DO_NOT_SCAN(),
which works like scan type advice but disables every possible scan
type rather than still allowing exactly one of them. Unless I've
missed something, this should be sufficient to make pg_plan_advice
stabilize which of two alternative SubPlans we pick and whether or not
a min/max aggregate is chosen. 0002 does some preliminary refactoring
to provide a more centralized way of tracking per-PlannerInfo details
within pg_plan_advice. 0003 makes the necessary change to
src/backend/optimizer, which consists of adding an alternative_root
field to each PlannerInfo and setting it appropriately. 0004 then
updates pg_plan_advice to implement DO_NOT_SCAN().
0005 is the pg_collect_advice module from previous versions of the
patch set. The main change here is that I completely rewrote the TAP
test, which previously was running the entire regression test suite
yet another time. That's been replaced with something that is much
faster and much better targeted at properly testing the shared advice
collector. Aside from that, I added one more check for
InvalidDsaPointer where the code was previously lacking one.
0006 is the pg_stash_advice module from previous versions of the patch
set. I have adjusted this to be much safer against permanent DSA
leaks. It now uses dshash_find_or_insert_extended instead of relying
on the ability to dshash_find a just-inserted entry without error. It
now also holds an LWLock while inserting or updating an entry in the
dshash table, for reasons explained in the comments. On the other
hand, it no longer unnecessarily holds the LWLock in exclusive mode
when looking up advice strings for automatic application, which was a
rather silly mistake in the previous version. A few additional tests
have been added. Alphabetization in contrib/Makefile has been fixed.
--
Robert Haas
EDB: http://www.enterprisedb.com
Attachments:
[application/octet-stream] v22-0003-Add-an-alternative_root-field-to-PlannerInfo.patch (8.8K, 2-v22-0003-Add-an-alternative_root-field-to-PlannerInfo.patch)
download | inline diff:
From 9bc42f4dbb4afbe6f69439b929f9ec0c7501277e Mon Sep 17 00:00:00 2001
From: Robert Haas <[email protected]>
Date: Mon, 23 Mar 2026 08:49:44 -0400
Subject: [PATCH v22 3/6] Add an alternative_root field to PlannerInfo.
Typically, we have only one PlannerInfo for any given subquery, but
when we are considering a MinMaxAggPath or a hashed subplan, we end
up creating a second PlannerInfo for the same portion of the query,
with a clone of the original range table. In fact, in the MinMaxAggPath
case, we might end up creating several clones, one per aggregate.
At present, there's no easy way for a plugin, such as pg_plan_advice,
to understand the relationships between the original range table and
the copies of it that are created in these cases. To fix, add an
alternative_root field to PlannerInfo. For a hashed subplan, this
points back to the PlannerInfo for the non-hashed alternative; for
minmax aggregates, this points back to the parent PlannerInfo; in
other cases, it's just NULL.
---
src/backend/optimizer/path/allpaths.c | 2 +-
src/backend/optimizer/plan/planagg.c | 1 +
src/backend/optimizer/plan/planner.c | 12 ++++++++----
src/backend/optimizer/plan/subselect.c | 6 +++---
src/backend/optimizer/prep/prepjointree.c | 1 +
src/backend/optimizer/prep/prepunion.c | 2 +-
src/include/nodes/pathnodes.h | 11 +++++++++++
src/include/optimizer/planner.h | 1 +
8 files changed, 27 insertions(+), 9 deletions(-)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index c26f48edfa0..61093f222a1 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2833,7 +2833,7 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
/* Generate a subroot and Paths for the subquery */
plan_name = choose_plan_name(root->glob, rte->eref->aliasname, false);
rel->subroot = subquery_planner(root->glob, subquery, plan_name,
- root, false, tuple_fraction, NULL);
+ root, NULL, false, tuple_fraction, NULL);
/* Isolate the params needed by this specific subplan */
rel->subplan_params = root->plan_params;
diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c
index 09b38b2c378..559a8c14e88 100644
--- a/src/backend/optimizer/plan/planagg.c
+++ b/src/backend/optimizer/plan/planagg.c
@@ -340,6 +340,7 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
memcpy(subroot, root, sizeof(PlannerInfo));
subroot->query_level++;
subroot->parent_root = root;
+ subroot->alternative_root = root;
subroot->plan_name = choose_plan_name(root->glob, "minmax", true);
/* reset subplan-related stuff */
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 42604a0f75c..9dcf49c0055 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -515,8 +515,8 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
&tuple_fraction, es);
/* primary planning entry point (may recurse for subqueries) */
- root = subquery_planner(glob, parse, NULL, NULL, false, tuple_fraction,
- NULL);
+ root = subquery_planner(glob, parse, NULL, NULL, NULL, false,
+ tuple_fraction, NULL);
/* Select best Path and turn it into a Plan */
final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL);
@@ -715,6 +715,8 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
* parse is the querytree produced by the parser & rewriter.
* plan_name is the name to assign to this subplan (NULL at the top level).
* parent_root is the immediate parent Query's info (NULL at the top level).
+ * alternative_root is a previously created PlannerInfo for which this query
+ * level is an alternative implementation, or else NULL.
* hasRecursion is true if this is a recursive WITH query.
* tuple_fraction is the fraction of tuples we expect will be retrieved.
* tuple_fraction is interpreted as explained for grouping_planner, below.
@@ -741,8 +743,9 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
*/
PlannerInfo *
subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
- PlannerInfo *parent_root, bool hasRecursion,
- double tuple_fraction, SetOperationStmt *setops)
+ PlannerInfo *parent_root, PlannerInfo *alternative_root,
+ bool hasRecursion, double tuple_fraction,
+ SetOperationStmt *setops)
{
PlannerInfo *root;
List *newWithCheckOptions;
@@ -759,6 +762,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
root->query_level = parent_root ? parent_root->query_level + 1 : 1;
root->plan_name = plan_name;
root->parent_root = parent_root;
+ root->alternative_root = alternative_root;
root->plan_params = NIL;
root->outer_params = NULL;
root->planner_cxt = CurrentMemoryContext;
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 0d31861da7f..ccec1eaa7fe 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -224,7 +224,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
/* Generate Paths for the subquery */
subroot = subquery_planner(root->glob, subquery,
choose_plan_name(root->glob, sublinkstr, true),
- root, false, tuple_fraction, NULL);
+ root, NULL, false, tuple_fraction, NULL);
/* Isolate the params needed by this specific subplan */
plan_params = root->plan_params;
@@ -274,7 +274,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
/* Generate Paths for the ANY subquery; we'll need all rows */
plan_name = choose_plan_name(root->glob, sublinkstr, true);
subroot = subquery_planner(root->glob, subquery, plan_name,
- root, false, 0.0, NULL);
+ root, subroot, false, 0.0, NULL);
/* Isolate the params needed by this specific subplan */
plan_params = root->plan_params;
@@ -971,7 +971,7 @@ SS_process_ctes(PlannerInfo *root)
*/
subroot = subquery_planner(root->glob, subquery,
choose_plan_name(root->glob, cte->ctename, false),
- root, cte->cterecursive, 0.0, NULL);
+ root, NULL, cte->cterecursive, 0.0, NULL);
/*
* Since the current query level doesn't yet contain any RTEs, it
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index d5e1041ffa3..8c0e6a61c96 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -1419,6 +1419,7 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
subroot->query_level = root->query_level;
subroot->plan_name = root->plan_name;
subroot->parent_root = root->parent_root;
+ subroot->alternative_root = root->alternative_root;
subroot->plan_params = NIL;
subroot->outer_params = NULL;
subroot->planner_cxt = CurrentMemoryContext;
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 583cb0b7a25..d1f022c5bfd 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -250,7 +250,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
*/
plan_name = choose_plan_name(root->glob, "setop", true);
subroot = rel->subroot = subquery_planner(root->glob, subquery,
- plan_name, root,
+ plan_name, root, NULL,
false, root->tuple_fraction,
parentOp);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 27758ec16fe..58ce19d1d21 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -317,6 +317,17 @@ struct PlannerInfo
/* NULL at outermost Query */
PlannerInfo *parent_root pg_node_attr(read_write_ignore);
+ /*
+ * If this PlannerInfo exists to consider an alternative implementation
+ * strategy for a portion of the query that could also be implemented by
+ * some other PlannerInfo, this points to that other PlannerInfo. When
+ * we are considering the first or only alternative, it is NULL.
+ *
+ * Currently, we use this when considering a MinMaxAggPath or a hashed
+ * SubPlan.
+ */
+ PlannerInfo *alternative_root pg_node_attr(read_write_ignore);
+
/* Subplan name for EXPLAIN and debugging purposes (NULL at top level) */
char *plan_name;
diff --git a/src/include/optimizer/planner.h b/src/include/optimizer/planner.h
index 80509773c01..9c4950b340f 100644
--- a/src/include/optimizer/planner.h
+++ b/src/include/optimizer/planner.h
@@ -63,6 +63,7 @@ extern PlannedStmt *standard_planner(Query *parse, const char *query_string,
extern PlannerInfo *subquery_planner(PlannerGlobal *glob, Query *parse,
char *plan_name,
PlannerInfo *parent_root,
+ PlannerInfo *alternative_root,
bool hasRecursion, double tuple_fraction,
SetOperationStmt *setops);
--
2.51.0
[application/octet-stream] v22-0002-pg_plan_advice-Refactor-to-invent-pgpa_planner_i.patch (20.0K, 3-v22-0002-pg_plan_advice-Refactor-to-invent-pgpa_planner_i.patch)
download | inline diff:
From 9436e9c94891e00c7db06bf9918fb68a4c451b66 Mon Sep 17 00:00:00 2001
From: Robert Haas <[email protected]>
Date: Tue, 24 Mar 2026 07:19:04 -0400
Subject: [PATCH v22 2/6] pg_plan_advice: Refactor to invent pgpa_planner_info
pg_plan_advice tracks two pieces of per-PlannerInfo data: (1) for each
RTI, the corresponding relation identifier, for purposes of
cross-checking those calculations against the final plan; and (2) the
set of semijoins seen during planning for which the strategy of making
one side unique was considered. The former is tracked using a hash
table that uses <plan_name, RTI> as the key, and the latter is
tracked using a List of <plan_name, relids>.
It seems better to track both of these things in the same way and
to try to reuse some code instead of having everything be completely
separate, so invent pgpa_planner_info; we'll create one every time we
see a new PlannerInfo and need to associate some data with it, and
we'll use the plan_name field to distinguish between PlannerInfo
objects, as it should always be unique. Then, refactor the two
systems mentioned above to use this new infrastructure.
(Note that the adjustment in pgpa_plan_walker is necessary in order
to avoid spuriously triggering the sanity check in that function,
in the case where a pgpa_planner_info is created for a purpose not
related to sj_unique_rels.)
---
contrib/pg_plan_advice/pgpa_planner.c | 252 ++++++++++++--------------
contrib/pg_plan_advice/pgpa_planner.h | 38 ++++
contrib/pg_plan_advice/pgpa_walker.c | 36 ++--
contrib/pg_plan_advice/pgpa_walker.h | 20 +-
src/tools/pgindent/typedefs.list | 4 +-
5 files changed, 173 insertions(+), 177 deletions(-)
diff --git a/contrib/pg_plan_advice/pgpa_planner.c b/contrib/pg_plan_advice/pgpa_planner.c
index fee88904760..70139ff42be 100644
--- a/contrib/pg_plan_advice/pgpa_planner.c
+++ b/contrib/pg_plan_advice/pgpa_planner.c
@@ -34,53 +34,6 @@
#include "parser/parsetree.h"
#include "utils/lsyscache.h"
-#ifdef USE_ASSERT_CHECKING
-
-/*
- * When assertions are enabled, we try generating relation identifiers during
- * planning, saving them in a hash table, and then cross-checking them against
- * the ones generated after planning is complete.
- */
-typedef struct pgpa_ri_checker_key
-{
- char *plan_name;
- Index rti;
-} pgpa_ri_checker_key;
-
-typedef struct pgpa_ri_checker
-{
- pgpa_ri_checker_key key;
- uint32 status;
- const char *rid_string;
-} pgpa_ri_checker;
-
-static uint32 pgpa_ri_checker_hash_key(pgpa_ri_checker_key key);
-
-static inline bool
-pgpa_ri_checker_compare_key(pgpa_ri_checker_key a, pgpa_ri_checker_key b)
-{
- if (a.rti != b.rti)
- return false;
- if (a.plan_name == NULL)
- return (b.plan_name == NULL);
- if (b.plan_name == NULL)
- return false;
- return strcmp(a.plan_name, b.plan_name) == 0;
-}
-
-#define SH_PREFIX pgpa_ri_check
-#define SH_ELEMENT_TYPE pgpa_ri_checker
-#define SH_KEY_TYPE pgpa_ri_checker_key
-#define SH_KEY key
-#define SH_HASH_KEY(tb, key) pgpa_ri_checker_hash_key(key)
-#define SH_EQUAL(tb, a, b) pgpa_ri_checker_compare_key(a, b)
-#define SH_SCOPE static inline
-#define SH_DECLARE
-#define SH_DEFINE
-#include "lib/simplehash.h"
-
-#endif
-
typedef enum pgpa_jo_outcome
{
PGPA_JO_PERMITTED, /* permit this join order */
@@ -94,11 +47,8 @@ typedef struct pgpa_planner_state
bool generate_advice_feedback;
bool generate_advice_string;
pgpa_trove *trove;
- List *sj_unique_rels;
-
-#ifdef USE_ASSERT_CHECKING
- pgpa_ri_check_hash *ri_check_hash;
-#endif
+ List *proots;
+ pgpa_planner_info *last_proot;
} pgpa_planner_state;
typedef struct pgpa_join_state
@@ -211,6 +161,9 @@ static List *pgpa_planner_append_feedback(List *list, pgpa_trove *trove,
pgpa_plan_walker_context *walker);
static void pgpa_planner_feedback_warning(List *feedback);
+static pgpa_planner_info *pgpa_planner_get_proot(pgpa_planner_state *pps,
+ PlannerInfo *root);
+
static inline void pgpa_ri_checker_save(pgpa_planner_state *pps,
PlannerInfo *root,
RelOptInfo *rel);
@@ -340,10 +293,6 @@ pgpa_planner_setup(PlannerGlobal *glob, Query *parse, const char *query_string,
pps->generate_advice_feedback = generate_advice_feedback;
pps->generate_advice_string = generate_advice_string;
pps->trove = trove;
-#ifdef USE_ASSERT_CHECKING
- pps->ri_check_hash =
- pgpa_ri_check_create(CurrentMemoryContext, 1024, NULL);
-#endif
SetPlannerGlobalExtensionState(glob, planner_extension_id, pps);
}
@@ -384,7 +333,7 @@ pgpa_planner_shutdown(PlannerGlobal *glob, Query *parse,
*/
if (generate_advice_string || generate_advice_feedback)
{
- pgpa_plan_walker(&walker, pstmt, pps->sj_unique_rels);
+ pgpa_plan_walker(&walker, pstmt, pps->proots);
rt_identifiers = pgpa_create_identifiers_for_planned_stmt(pstmt);
}
@@ -592,8 +541,7 @@ pgpa_join_path_setup(PlannerInfo *root, RelOptInfo *joinrel,
/*
* If we're considering implementing a semijoin by making one side unique,
- * make a note of it in the pgpa_planner_state. See comments for
- * pgpa_sj_unique_rel for why we do this.
+ * make a note of it in the pgpa_planner_state.
*/
if (jointype == JOIN_UNIQUE_OUTER || jointype == JOIN_UNIQUE_INNER)
{
@@ -605,34 +553,18 @@ pgpa_join_path_setup(PlannerInfo *root, RelOptInfo *joinrel,
if (pps != NULL &&
(pps->generate_advice_string || pps->generate_advice_feedback))
{
- bool found = false;
+ pgpa_planner_info *proot;
/* Avoid adding duplicates. */
- foreach_ptr(pgpa_sj_unique_rel, ur, pps->sj_unique_rels)
+ proot = pgpa_planner_get_proot(pps, root);
+ if (!list_member(proot->sj_unique_rels, uniquerel->relids))
{
- /*
- * We should always use the same pointer for the same plan
- * name, so we need not use strcmp() here.
- */
- if (root->plan_name == ur->plan_name &&
- bms_equal(uniquerel->relids, ur->relids))
- {
- found = true;
- break;
- }
- }
-
- /* If not a duplicate, append to the list. */
- if (!found)
- {
- pgpa_sj_unique_rel *ur;
MemoryContext oldcontext;
+ /* Make sure to use a sufficiently long-lived context. */
oldcontext = MemoryContextSwitchTo(pps->mcxt);
- ur = palloc_object(pgpa_sj_unique_rel);
- ur->plan_name = root->plan_name;
- ur->relids = bms_copy(uniquerel->relids);
- pps->sj_unique_rels = lappend(pps->sj_unique_rels, ur);
+ proot->sj_unique_rels = lappend(proot->sj_unique_rels,
+ bms_copy(uniquerel->relids));
MemoryContextSwitchTo(oldcontext);
}
}
@@ -2017,34 +1949,64 @@ pgpa_planner_feedback_warning(List *feedback)
errdetail("%s", detailbuf.data));
}
-#ifdef USE_ASSERT_CHECKING
-
/*
- * Fast hash function for a key consisting of an RTI and plan name.
+ * Get or create the pgpa_planner_info for the subroot with the given
+ * plan_name.
*/
-static uint32
-pgpa_ri_checker_hash_key(pgpa_ri_checker_key key)
+static pgpa_planner_info *
+pgpa_planner_get_proot(pgpa_planner_state *pps, PlannerInfo *root)
{
- fasthash_state hs;
- int sp_len;
+ pgpa_planner_info *new_proot;
+
+ /*
+ * If pps->last_proot isn't populated, there are no pgpa_planner_info
+ * objects yet, so we can drop through and create a new one. Otherwise,
+ * search for an object with a matching name, and drop through only if
+ * none is found.
+ */
+ if (pps->last_proot != NULL)
+ {
+ if (root->plan_name == NULL)
+ {
+ if (pps->last_proot->plan_name == NULL)
+ return pps->last_proot;
- fasthash_init(&hs, 0);
+ foreach_ptr(pgpa_planner_info, proot, pps->proots)
+ {
+ if (proot->plan_name == NULL)
+ {
+ pps->last_proot = proot;
+ return proot;
+ }
+ }
+ }
+ else
+ {
+ if (pps->last_proot->plan_name != NULL &&
+ strcmp(pps->last_proot->plan_name, root->plan_name) == 0)
+ return pps->last_proot;
- hs.accum = key.rti;
- fasthash_combine(&hs);
+ foreach_ptr(pgpa_planner_info, proot, pps->proots)
+ {
+ if (proot->plan_name != NULL &&
+ strcmp(proot->plan_name, root->plan_name) == 0)
+ {
+ pps->last_proot = proot;
+ return proot;
+ }
+ }
+ }
+ }
- /* plan_name can be NULL */
- if (key.plan_name == NULL)
- sp_len = 0;
- else
- sp_len = fasthash_accum_cstring(&hs, key.plan_name);
+ /* Create new object, add to list, and make it most recently used. */
+ new_proot = palloc0_object(pgpa_planner_info);
+ new_proot->plan_name = root->plan_name;
+ pps->proots = lappend(pps->proots, new_proot);
+ pps->last_proot = new_proot;
- /* hashfn_unstable.h recommends using string length as tweak */
- return fasthash_final32(&hs, sp_len);
+ return new_proot;
}
-#endif
-
/*
* Save the range table identifier for one relation for future cross-checking.
*/
@@ -2053,19 +2015,34 @@ pgpa_ri_checker_save(pgpa_planner_state *pps, PlannerInfo *root,
RelOptInfo *rel)
{
#ifdef USE_ASSERT_CHECKING
- pgpa_ri_checker_key key;
- pgpa_ri_checker *check;
- pgpa_identifier rid;
- const char *rid_string;
- bool found;
-
- key.rti = bms_singleton_member(rel->relids);
- key.plan_name = root->plan_name;
- pgpa_compute_identifier_by_rti(root, key.rti, &rid);
- rid_string = pgpa_identifier_string(&rid);
- check = pgpa_ri_check_insert(pps->ri_check_hash, key, &found);
- Assert(!found || strcmp(check->rid_string, rid_string) == 0);
- check->rid_string = rid_string;
+ pgpa_planner_info *proot;
+ pgpa_identifier *rid;
+
+ /* Get the pgpa_planner_info for this PlannerInfo. */
+ proot = pgpa_planner_get_proot(pps, root);
+
+ /* Allocate or extend the proot's rid_array as necessary. */
+ if (proot->rid_array_size <= rel->relid)
+ {
+ int new_size = Max(proot->rid_array_size, 8);
+
+ while (new_size < rel->relid)
+ new_size *= 2;
+
+ if (proot->rid_array_size == 0)
+ proot->rid_array = palloc0_array(pgpa_identifier, new_size);
+ else
+ proot->rid_array = repalloc0_array(proot->rid_array,
+ pgpa_identifier,
+ proot->rid_array_size,
+ new_size);
+ proot->rid_array_size = new_size;
+ }
+
+ /* Save relation identifier details for this RTI if not already done. */
+ rid = &proot->rid_array[rel->relid - 1];
+ if (rid->alias_name == NULL)
+ pgpa_compute_identifier_by_rti(root, rel->relid, rid);
#endif
}
@@ -2078,26 +2055,22 @@ pgpa_ri_checker_validate(pgpa_planner_state *pps, PlannedStmt *pstmt)
{
#ifdef USE_ASSERT_CHECKING
pgpa_identifier *rt_identifiers;
- pgpa_ri_check_iterator it;
- pgpa_ri_checker *check;
+ Index rtable_length = list_length(pstmt->rtable);
/* Create identifiers from the planned statement. */
rt_identifiers = pgpa_create_identifiers_for_planned_stmt(pstmt);
/* Iterate over identifiers created during planning, so we can compare. */
- pgpa_ri_check_start_iterate(pps->ri_check_hash, &it);
- while ((check = pgpa_ri_check_iterate(pps->ri_check_hash, &it)) != NULL)
+ foreach_ptr(pgpa_planner_info, proot, pps->proots)
{
int rtoffset = 0;
- const char *rid_string;
- Index flat_rti;
/*
* If there's no plan name associated with this entry, then the
* rtoffset is 0. Otherwise, we can search the SubPlanRTInfo list to
* find the rtoffset.
*/
- if (check->key.plan_name != NULL)
+ if (proot->plan_name != NULL)
{
foreach_node(SubPlanRTInfo, rtinfo, pstmt->subrtinfos)
{
@@ -2109,18 +2082,8 @@ pgpa_ri_checker_validate(pgpa_planner_state *pps, PlannedStmt *pstmt)
* will be copied, as per add_rtes_to_flat_rtable. Therefore,
* there's no fixed rtoffset that we can apply to the RTIs
* used during planning to locate the corresponding relations
- * in the final rtable.
- *
- * With more complex logic, we could work around that problem
- * by remembering the whole contents of the subquery's rtable
- * during planning, determining which of those would have been
- * copied to the final rtable, and matching them up. But it
- * doesn't seem like a worthwhile endeavor for right now,
- * because RTIs from such subqueries won't appear in the plan
- * tree itself, just in the range table. Hence, we can neither
- * generate nor accept advice for them.
*/
- if (strcmp(check->key.plan_name, rtinfo->plan_name) == 0
+ if (strcmp(proot->plan_name, rtinfo->plan_name) == 0
&& !rtinfo->dummy)
{
rtoffset = rtinfo->rtoffset;
@@ -2139,17 +2102,24 @@ pgpa_ri_checker_validate(pgpa_planner_state *pps, PlannedStmt *pstmt)
continue;
}
- /*
- * check->key.rti is the RTI that we saw prior to range-table
- * flattening, so we must add the appropriate RT offset to get the
- * final RTI.
- */
- flat_rti = check->key.rti + rtoffset;
- Assert(flat_rti <= list_length(pstmt->rtable));
+ for (int rti = 1; rti <= proot->rid_array_size; ++rti)
+ {
+ Index flat_rti = rtoffset + rti;
+ pgpa_identifier *rid1 = &proot->rid_array[rti - 1];
+ pgpa_identifier *rid2;
+
+ if (rid1->alias_name == NULL)
+ continue;
- /* Assert that the string we compute now matches the previous one. */
- rid_string = pgpa_identifier_string(&rt_identifiers[flat_rti - 1]);
- Assert(strcmp(rid_string, check->rid_string) == 0);
+ Assert(flat_rti <= rtable_length);
+ rid2 = &rt_identifiers[flat_rti - 1];
+ Assert(strcmp(rid1->alias_name, rid2->alias_name) == 0);
+ Assert(rid1->occurrence == rid2->occurrence);
+ Assert(strings_equal_or_both_null(rid1->partnsp, rid2->partnsp));
+ Assert(strings_equal_or_both_null(rid1->partrel, rid2->partrel));
+ Assert(strings_equal_or_both_null(rid1->plan_name,
+ rid2->plan_name));
+ }
}
#endif
}
diff --git a/contrib/pg_plan_advice/pgpa_planner.h b/contrib/pg_plan_advice/pgpa_planner.h
index c70e486a7f3..e9045f69bca 100644
--- a/contrib/pg_plan_advice/pgpa_planner.h
+++ b/contrib/pg_plan_advice/pgpa_planner.h
@@ -12,8 +12,46 @@
#ifndef PGPA_PLANNER_H
#define PGPA_PLANNER_H
+#include "pgpa_identifier.h"
+
extern void pgpa_planner_install_hooks(void);
+/*
+ * Per-PlannerInfo information that we gather during query planning.
+ */
+typedef struct pgpa_planner_info
+{
+ /* Plan name taken from the corresponding PlannerInfo; NULL at top level. */
+ char *plan_name;
+
+#ifdef USE_ASSERT_CHECKING
+ /* Relation identifiers computed for baserels at this query level. */
+ pgpa_identifier *rid_array;
+ int rid_array_size;
+#endif
+
+ /*
+ * List of Bitmapset objects. Each represents the relid set of a relation
+ * that the planner considers making unique during semijoin planning.
+ *
+ * When generating advice, we should emit either SEMIJOIN_UNIQUE advice or
+ * SEMIJOIN_NON_UNIQUE advice for each semijoin depending on whether we
+ * chose to implement it as a semijoin or whether we instead chose to make
+ * the nullable side unique and then perform an inner join. When the
+ * make-unique strategy is not chosen, it's not easy to tell from the
+ * final plan tree whether it was considered. That's awkward, because we
+ * don't want to emit useless SEMIJOIN_NON_UNIQUE advice when there was no
+ * decision to be made. This list lets the plan tree walker know in which
+ * cases that approach was considered, so that it doesn't have to guess.
+ */
+ List *sj_unique_rels;
+} pgpa_planner_info;
+
+/*
+ * When set to a value greater than zero, indicates that advice should be
+ * generated during query planning even in the absence of obvious reasons to
+ * do so. See pg_plan_advice_request_advice_generation().
+ */
extern int pgpa_planner_generate_advice;
#endif
diff --git a/contrib/pg_plan_advice/pgpa_walker.c b/contrib/pg_plan_advice/pgpa_walker.c
index 7b86cc5e6f9..6fbc784bf54 100644
--- a/contrib/pg_plan_advice/pgpa_walker.c
+++ b/contrib/pg_plan_advice/pgpa_walker.c
@@ -12,6 +12,7 @@
#include "postgres.h"
#include "pgpa_join.h"
+#include "pgpa_planner.h"
#include "pgpa_scan.h"
#include "pgpa_walker.h"
@@ -64,12 +65,12 @@ static bool pgpa_walker_contains_no_gather(pgpa_plan_walker_context *walker,
*
* Populates walker based on a traversal of the Plan trees in pstmt.
*
- * sj_unique_rels is a list of pgpa_sj_unique_rel objects, one for each
- * relation we considered making unique as part of semijoin planning.
+ * proots is the list of pgpa_planner_info objects that were generated
+ * during planning.
*/
void
pgpa_plan_walker(pgpa_plan_walker_context *walker, PlannedStmt *pstmt,
- List *sj_unique_rels)
+ List *proots)
{
ListCell *lc;
List *sj_unique_rtis = NULL;
@@ -92,19 +93,21 @@ pgpa_plan_walker(pgpa_plan_walker_context *walker, PlannedStmt *pstmt,
}
/* Adjust RTIs from sj_unique_rels for the flattened range table. */
- foreach_ptr(pgpa_sj_unique_rel, ur, sj_unique_rels)
+ foreach_ptr(pgpa_planner_info, proot, proots)
{
- int rtindex = -1;
int rtoffset = 0;
bool dummy = false;
- Bitmapset *relids = NULL;
+
+ /* If there are no sj_unique_rels for this proot, we can skip it. */
+ if (proot->sj_unique_rels == NIL)
+ continue;
/* If this is a subplan, find the range table offset. */
- if (ur->plan_name != NULL)
+ if (proot->plan_name != NULL)
{
foreach_node(SubPlanRTInfo, rtinfo, pstmt->subrtinfos)
{
- if (strcmp(ur->plan_name, rtinfo->plan_name) == 0)
+ if (strcmp(proot->plan_name, rtinfo->plan_name) == 0)
{
rtoffset = rtinfo->rtoffset;
dummy = rtinfo->dummy;
@@ -113,19 +116,24 @@ pgpa_plan_walker(pgpa_plan_walker_context *walker, PlannedStmt *pstmt,
}
if (rtoffset == 0)
- elog(ERROR, "no rtoffset for plan %s", ur->plan_name);
+ elog(ERROR, "no rtoffset for plan %s", proot->plan_name);
}
/* If this entry pertains to a dummy subquery, ignore it. */
if (dummy)
continue;
- /* Offset each entry from the original set. */
- while ((rtindex = bms_next_member(ur->relids, rtindex)) >= 0)
- relids = bms_add_member(relids, rtindex + rtoffset);
+ /* Offset each relid set by the rtoffset we just computed. */
+ foreach_node(Bitmapset, relids, proot->sj_unique_rels)
+ {
+ int rtindex = -1;
+ Bitmapset *flat_relids = NULL;
- /* Store the resulting set. */
- sj_unique_rtis = lappend(sj_unique_rtis, relids);
+ while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+ flat_relids = bms_add_member(flat_relids, rtindex + rtoffset);
+
+ sj_unique_rtis = lappend(sj_unique_rtis, flat_relids);
+ }
}
/*
diff --git a/contrib/pg_plan_advice/pgpa_walker.h b/contrib/pg_plan_advice/pgpa_walker.h
index 4890d554dd3..9b74cd3ba55 100644
--- a/contrib/pg_plan_advice/pgpa_walker.h
+++ b/contrib/pg_plan_advice/pgpa_walker.h
@@ -16,24 +16,6 @@
#include "pgpa_join.h"
#include "pgpa_scan.h"
-/*
- * When generating advice, we should emit either SEMIJOIN_UNIQUE advice or
- * SEMIJOIN_NON_UNIQUE advice for each semijoin depending on whether we chose
- * to implement it as a semijoin or whether we instead chose to make the
- * nullable side unique and then perform an inner join. When the make-unique
- * strategy is not chosen, it's not easy to tell from the final plan tree
- * whether it was considered. That's awkward, because we don't want to emit
- * useless SEMIJOIN_NON_UNIQUE advice when there was no decision to be made.
- *
- * To avoid that, during planning, we create a pgpa_sj_unique_rel for each
- * relation that we considered making unique for purposes of semijoin planning.
- */
-typedef struct pgpa_sj_unique_rel
-{
- char *plan_name;
- Bitmapset *relids;
-} pgpa_sj_unique_rel;
-
/*
* We use the term "query feature" to refer to plan nodes that are interesting
* in the following way: to generate advice, we'll need to know the set of
@@ -122,7 +104,7 @@ typedef struct pgpa_plan_walker_context
extern void pgpa_plan_walker(pgpa_plan_walker_context *walker,
PlannedStmt *pstmt,
- List *sj_unique_rels);
+ List *proots);
extern void pgpa_add_future_feature(pgpa_plan_walker_context *walker,
pgpa_qf_type type,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 8df23840e57..12f4b8a7bf8 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -4034,14 +4034,12 @@ pgpa_join_strategy
pgpa_join_unroller
pgpa_output_context
pgpa_plan_walker_context
+pgpa_planner_info
pgpa_planner_state
pgpa_qf_type
pgpa_query_feature
-pgpa_ri_checker
-pgpa_ri_checker_key
pgpa_scan
pgpa_scan_strategy
-pgpa_sj_unique_rel
pgpa_target_type
pgpa_trove
pgpa_trove_entry
--
2.51.0
[application/octet-stream] v22-0004-pg_plan_advice-Invent-DO_NOT_SCAN-relation_ident.patch (43.4K, 4-v22-0004-pg_plan_advice-Invent-DO_NOT_SCAN-relation_ident.patch)
download | inline diff:
From 0e58860cf50fdd08712bfd99326686c89a6f3312 Mon Sep 17 00:00:00 2001
From: Robert Haas <[email protected]>
Date: Tue, 24 Mar 2026 08:40:54 -0400
Subject: [PATCH v22 4/6] pg_plan_advice: Invent
DO_NOT_SCAN(relation_identifier).
The premise of src/test/modules/test_plan_advice is that if we plan
a query once, generate plan advice, and then replan it using that
same advice, all of that advice should apply cleanly, since the
settings and everything else are the same. Unfortunately, that's
not the case: the test suite is the main regression tests, and
concurrent activity can change the statistics on tables involved
in the query, especially system catalogs. That's OK as long as it
only affects costing, but in a few cases, it affects which relations
appear in the final plan at all.
In the buildfarm failures observed to date, this happens because
we consider alternative subplans for the same portion of the query;
in theory, MinMaxAggPath is vulnerable to a similar hazard. In both
cases, the planner clones an entire subquery, and the clone has a
different plan name, and therefore different range table identifiers,
than the original. If a cost change results in flipping between one
of these plans and the other, the test_plan_advice tests will fail,
because the range table identifiers to which advice was applied won't
even be present in the output of the second planning cycle.
To fix, invent a new DO_NOT_SCAN advice tag. When generating advice,
emit it for relations that should not appear in the final plan at
all, because some alternative version of that relation was used
instead. When DO_NOT_SCAN is supplied, disable all scan methods for
that relation.
To make this work, we reuse a bunch of the machinery that previously
existed for the purpose of ensuring that we build the same set of
relation identifiers during planning as we do from the final
PlannedStmt. In the process, this commit slightly weakens the
cross-check mechanism: before this commit, it would fire whenever
the pg_plan_advice module was loaded, even if pg_plan_advice wasn't
actually doing anything; now, it will only engage when we have some
other reason to create a pgpa_planner_state. The old way was complex
and didn't add much useful test coverage, so this seems like an
acceptable sacrifice.
---
contrib/pg_plan_advice/README | 7 +
.../pg_plan_advice/expected/alternatives.out | 158 +++++++++++++
contrib/pg_plan_advice/expected/scan.out | 17 +-
contrib/pg_plan_advice/meson.build | 1 +
contrib/pg_plan_advice/pgpa_ast.c | 6 +
contrib/pg_plan_advice/pgpa_ast.h | 1 +
contrib/pg_plan_advice/pgpa_output.c | 35 +++
contrib/pg_plan_advice/pgpa_planner.c | 216 ++++++++++--------
contrib/pg_plan_advice/pgpa_planner.h | 26 ++-
contrib/pg_plan_advice/pgpa_trove.c | 1 +
contrib/pg_plan_advice/pgpa_walker.c | 156 +++++++++++--
contrib/pg_plan_advice/pgpa_walker.h | 1 +
contrib/pg_plan_advice/sql/alternatives.sql | 58 +++++
contrib/pg_plan_advice/sql/scan.sql | 5 +-
doc/src/sgml/pgplanadvice.sgml | 14 +-
15 files changed, 584 insertions(+), 118 deletions(-)
create mode 100644 contrib/pg_plan_advice/expected/alternatives.out
create mode 100644 contrib/pg_plan_advice/sql/alternatives.sql
diff --git a/contrib/pg_plan_advice/README b/contrib/pg_plan_advice/README
index b0e4fd1d6e1..2ea61e9bc41 100644
--- a/contrib/pg_plan_advice/README
+++ b/contrib/pg_plan_advice/README
@@ -109,6 +109,13 @@ Bitmap heap scans currently do not allow for an index specification:
BITMAP_HEAP_SCAN(foo bar) simply means that each of foo and bar should use
some sort of bitmap heap scan.
+There is a special DO_NOT_SCAN() advice tag which says that a certain
+relation shouldn't be scanned at all. This is used to control which of
+two choices is selected when an AlternativeSubPlan is resolved, and
+whether or not a MinMaxAggPath is chosen. Control over upper planner
+behavior is generally out-of-scope at the moment, but these cases had
+to be handled to prevent test_plan_advice failures in the buildfarm.
+
Join Order Advice
=================
diff --git a/contrib/pg_plan_advice/expected/alternatives.out b/contrib/pg_plan_advice/expected/alternatives.out
new file mode 100644
index 00000000000..a6fb296d4b4
--- /dev/null
+++ b/contrib/pg_plan_advice/expected/alternatives.out
@@ -0,0 +1,158 @@
+LOAD 'pg_plan_advice';
+SET max_parallel_workers_per_gather = 0;
+CREATE TABLE alt_t1 (a int) WITH (autovacuum_enabled = false);
+CREATE TABLE alt_t2 (a int) WITH (autovacuum_enabled = false);
+CREATE INDEX ON alt_t2(a);
+INSERT INTO alt_t1 SELECT generate_series(1, 1000);
+INSERT INTO alt_t2 SELECT generate_series(1, 100000);
+VACUUM ANALYZE alt_t1;
+VACUUM ANALYZE alt_t2;
+-- This query uses an OR to prevent the EXISTS from being converted to a
+-- semi-join, forcing the planner through the AlternativeSubPlan path.
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT * FROM alt_t1
+WHERE EXISTS (SELECT 1 FROM alt_t2 WHERE alt_t2.a = alt_t1.a) OR alt_t1.a < 0;
+ QUERY PLAN
+-------------------------------------------------------------------
+ Seq Scan on alt_t1
+ Filter: ((ANY (a = (hashed SubPlan exists_2).col1)) OR (a < 0))
+ SubPlan exists_2
+ -> Seq Scan on alt_t2
+ Generated Plan Advice:
+ SEQ_SCAN(alt_t1 alt_t2@exists_2)
+ NO_GATHER(alt_t1 alt_t2@exists_2)
+ DO_NOT_SCAN(alt_t2@exists_1)
+(8 rows)
+
+-- We should be able to force either AlternativeSubPlan by advising against
+-- scanning the other relation.
+BEGIN;
+SET LOCAL pg_plan_advice.advice = 'DO_NOT_SCAN(alt_t2@exists_1)';
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT * FROM alt_t1
+WHERE EXISTS (SELECT 1 FROM alt_t2 WHERE alt_t2.a = alt_t1.a) OR alt_t1.a < 0;
+ QUERY PLAN
+-------------------------------------------------------------------
+ Seq Scan on alt_t1
+ Filter: ((ANY (a = (hashed SubPlan exists_2).col1)) OR (a < 0))
+ SubPlan exists_2
+ -> Seq Scan on alt_t2
+ Supplied Plan Advice:
+ DO_NOT_SCAN(alt_t2@exists_1) /* matched */
+ Generated Plan Advice:
+ SEQ_SCAN(alt_t1 alt_t2@exists_2)
+ NO_GATHER(alt_t1 alt_t2@exists_2)
+ DO_NOT_SCAN(alt_t2@exists_1)
+(10 rows)
+
+SET LOCAL pg_plan_advice.advice = 'DO_NOT_SCAN(alt_t2@exists_2)';
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT * FROM alt_t1
+WHERE EXISTS (SELECT 1 FROM alt_t2 WHERE alt_t2.a = alt_t1.a) OR alt_t1.a < 0;
+ QUERY PLAN
+--------------------------------------------------------
+ Seq Scan on alt_t1
+ Filter: (EXISTS(SubPlan exists_1) OR (a < 0))
+ SubPlan exists_1
+ -> Index Only Scan using alt_t2_a_idx on alt_t2
+ Index Cond: (a = alt_t1.a)
+ Supplied Plan Advice:
+ DO_NOT_SCAN(alt_t2@exists_2) /* matched */
+ Generated Plan Advice:
+ SEQ_SCAN(alt_t1)
+ INDEX_ONLY_SCAN(alt_t2@exists_1 public.alt_t2_a_idx)
+ NO_GATHER(alt_t1 alt_t2@exists_1)
+ DO_NOT_SCAN(alt_t2@exists_2)
+(12 rows)
+
+COMMIT;
+-- Now let's test a case involving MinMaxAggPath, which we treat similarly
+-- to the AlternativeSubPlan case.
+CREATE TABLE alt_minmax (a int) WITH (autovacuum_enabled = false);
+CREATE INDEX ON alt_minmax(a);
+INSERT INTO alt_minmax SELECT generate_series(1, 10000);
+VACUUM ANALYZE alt_minmax;
+-- Using an Index Scan inside of an InitPlan should win over a full table
+-- scan.
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT min(a), max(a) FROM alt_minmax;
+ QUERY PLAN
+------------------------------------------------------------------------------------------
+ Result
+ Replaces: MinMaxAggregate
+ InitPlan minmax_1
+ -> Limit
+ -> Index Only Scan using alt_minmax_a_idx on alt_minmax
+ Index Cond: (a IS NOT NULL)
+ InitPlan minmax_2
+ -> Limit
+ -> Index Only Scan Backward using alt_minmax_a_idx on alt_minmax alt_minmax_1
+ Index Cond: (a IS NOT NULL)
+ Generated Plan Advice:
+ INDEX_ONLY_SCAN(alt_minmax@minmax_1 public.alt_minmax_a_idx
+ alt_minmax@minmax_2 public.alt_minmax_a_idx)
+ NO_GATHER(alt_minmax@minmax_1 alt_minmax@minmax_2)
+ DO_NOT_SCAN(alt_minmax)
+(15 rows)
+
+-- Advising against the scan of alt_minmax at the root query level should
+-- change nothing, but if we say we don't want either of or both of the
+-- minmax-variant scans, the plan should switch to a full table scan.
+BEGIN;
+SET LOCAL pg_plan_advice.advice = 'DO_NOT_SCAN(alt_minmax)';
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT min(a), max(a) FROM alt_minmax;
+ QUERY PLAN
+------------------------------------------------------------------------------------------
+ Result
+ Replaces: MinMaxAggregate
+ InitPlan minmax_1
+ -> Limit
+ -> Index Only Scan using alt_minmax_a_idx on alt_minmax
+ Index Cond: (a IS NOT NULL)
+ InitPlan minmax_2
+ -> Limit
+ -> Index Only Scan Backward using alt_minmax_a_idx on alt_minmax alt_minmax_1
+ Index Cond: (a IS NOT NULL)
+ Supplied Plan Advice:
+ DO_NOT_SCAN(alt_minmax) /* matched */
+ Generated Plan Advice:
+ INDEX_ONLY_SCAN(alt_minmax@minmax_1 public.alt_minmax_a_idx
+ alt_minmax@minmax_2 public.alt_minmax_a_idx)
+ NO_GATHER(alt_minmax@minmax_1 alt_minmax@minmax_2)
+ DO_NOT_SCAN(alt_minmax)
+(17 rows)
+
+SET LOCAL pg_plan_advice.advice = 'DO_NOT_SCAN(alt_minmax@minmax_1)';
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT min(a), max(a) FROM alt_minmax;
+ QUERY PLAN
+--------------------------------------------------------
+ Aggregate
+ -> Seq Scan on alt_minmax
+ Supplied Plan Advice:
+ DO_NOT_SCAN(alt_minmax@minmax_1) /* matched */
+ Generated Plan Advice:
+ SEQ_SCAN(alt_minmax)
+ NO_GATHER(alt_minmax)
+ DO_NOT_SCAN(alt_minmax@minmax_1 alt_minmax@minmax_2)
+(8 rows)
+
+SET LOCAL pg_plan_advice.advice = 'DO_NOT_SCAN(alt_minmax@minmax_1) DO_NOT_SCAN(alt_minmax@minmax_2)';
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT min(a), max(a) FROM alt_minmax;
+ QUERY PLAN
+--------------------------------------------------------
+ Aggregate
+ -> Seq Scan on alt_minmax
+ Supplied Plan Advice:
+ DO_NOT_SCAN(alt_minmax@minmax_1) /* matched */
+ DO_NOT_SCAN(alt_minmax@minmax_2) /* matched */
+ Generated Plan Advice:
+ SEQ_SCAN(alt_minmax)
+ NO_GATHER(alt_minmax)
+ DO_NOT_SCAN(alt_minmax@minmax_1 alt_minmax@minmax_2)
+(9 rows)
+
+COMMIT;
+DROP TABLE alt_t1, alt_t2, alt_minmax;
diff --git a/contrib/pg_plan_advice/expected/scan.out b/contrib/pg_plan_advice/expected/scan.out
index 3f9e13b6d41..44ce40f33a6 100644
--- a/contrib/pg_plan_advice/expected/scan.out
+++ b/contrib/pg_plan_advice/expected/scan.out
@@ -270,7 +270,8 @@ EXPLAIN (COSTS OFF, PLAN_ADVICE) SELECT * FROM scan_table WHERE a > 0;
COMMIT;
-- We can force a primary key lookup to use a sequential scan, but we
-- can't force it to use an index-only scan (due to the column list)
--- or a TID scan (due to the absence of a TID qual).
+-- or a TID scan (due to the absence of a TID qual). If we apply DO_NOT_SCAN
+-- here, we should get a valid plan anyway, but with the scan disabled.
BEGIN;
SET LOCAL pg_plan_advice.advice = 'SEQ_SCAN(scan_table)';
EXPLAIN (COSTS OFF, PLAN_ADVICE) SELECT * FROM scan_table WHERE a = 1;
@@ -313,6 +314,20 @@ EXPLAIN (COSTS OFF, PLAN_ADVICE) SELECT * FROM scan_table WHERE a = 1;
NO_GATHER(scan_table)
(8 rows)
+SET LOCAL pg_plan_advice.advice = 'DO_NOT_SCAN(scan_table)';
+EXPLAIN (COSTS OFF, PLAN_ADVICE) SELECT * FROM scan_table WHERE a = 1;
+ QUERY PLAN
+-------------------------------------------------
+ Index Scan using scan_table_pkey on scan_table
+ Disabled: true
+ Index Cond: (a = 1)
+ Supplied Plan Advice:
+ DO_NOT_SCAN(scan_table) /* matched, failed */
+ Generated Plan Advice:
+ INDEX_SCAN(scan_table public.scan_table_pkey)
+ NO_GATHER(scan_table)
+(8 rows)
+
COMMIT;
-- We can forcibly downgrade an index-only scan to an index scan, but we can't
-- force the use of an index that the planner thinks is inapplicable.
diff --git a/contrib/pg_plan_advice/meson.build b/contrib/pg_plan_advice/meson.build
index 36bbc4e9826..f2098947b64 100644
--- a/contrib/pg_plan_advice/meson.build
+++ b/contrib/pg_plan_advice/meson.build
@@ -53,6 +53,7 @@ tests += {
'bd': meson.current_build_dir(),
'regress': {
'sql': [
+ 'alternatives',
'gather',
'join_order',
'join_strategy',
diff --git a/contrib/pg_plan_advice/pgpa_ast.c b/contrib/pg_plan_advice/pgpa_ast.c
index f4fa6a626d4..3c340c6ae7a 100644
--- a/contrib/pg_plan_advice/pgpa_ast.c
+++ b/contrib/pg_plan_advice/pgpa_ast.c
@@ -32,6 +32,8 @@ pgpa_cstring_advice_tag(pgpa_advice_tag_type advice_tag)
{
case PGPA_TAG_BITMAP_HEAP_SCAN:
return "BITMAP_HEAP_SCAN";
+ case PGPA_TAG_DO_NOT_SCAN:
+ return "DO_NOT_SCAN";
case PGPA_TAG_FOREIGN_JOIN:
return "FOREIGN_JOIN";
case PGPA_TAG_GATHER:
@@ -92,6 +94,10 @@ pgpa_parse_advice_tag(const char *tag, bool *fail)
if (strcmp(tag, "bitmap_heap_scan") == 0)
return PGPA_TAG_BITMAP_HEAP_SCAN;
break;
+ case 'd':
+ if (strcmp(tag, "do_not_scan") == 0)
+ return PGPA_TAG_DO_NOT_SCAN;
+ break;
case 'f':
if (strcmp(tag, "foreign_join") == 0)
return PGPA_TAG_FOREIGN_JOIN;
diff --git a/contrib/pg_plan_advice/pgpa_ast.h b/contrib/pg_plan_advice/pgpa_ast.h
index 3c3db801926..a89f1251929 100644
--- a/contrib/pg_plan_advice/pgpa_ast.h
+++ b/contrib/pg_plan_advice/pgpa_ast.h
@@ -80,6 +80,7 @@ typedef struct pgpa_advice_target
typedef enum pgpa_advice_tag_type
{
PGPA_TAG_BITMAP_HEAP_SCAN,
+ PGPA_TAG_DO_NOT_SCAN,
PGPA_TAG_FOREIGN_JOIN,
PGPA_TAG_GATHER,
PGPA_TAG_GATHER_MERGE,
diff --git a/contrib/pg_plan_advice/pgpa_output.c b/contrib/pg_plan_advice/pgpa_output.c
index 28d2839ce1a..cd4411f350c 100644
--- a/contrib/pg_plan_advice/pgpa_output.c
+++ b/contrib/pg_plan_advice/pgpa_output.c
@@ -54,6 +54,8 @@ static void pgpa_output_simple_strategy(pgpa_output_context *context,
List *relid_sets);
static void pgpa_output_no_gather(pgpa_output_context *context,
Bitmapset *relids);
+static void pgpa_output_do_not_scan(pgpa_output_context *context,
+ List *identifiers);
static void pgpa_output_relations(pgpa_output_context *context, StringInfo buf,
Bitmapset *relids);
@@ -156,6 +158,9 @@ pgpa_output_advice(StringInfo buf, pgpa_plan_walker_context *walker,
/* Emit NO_GATHER advice. */
pgpa_output_no_gather(&context, walker->no_gather_scans);
+
+ /* Emit DO_NOT_SCAN advice. */
+ pgpa_output_do_not_scan(&context, walker->do_not_scan_identifiers);
}
/*
@@ -395,6 +400,36 @@ pgpa_output_no_gather(pgpa_output_context *context, Bitmapset *relids)
appendStringInfoChar(context->buf, ')');
}
+/*
+ * Output DO_NOT_SCAN advice for all relations in the provided list of
+ * identifiers.
+ */
+static void
+pgpa_output_do_not_scan(pgpa_output_context *context, List *identifiers)
+{
+ bool first = true;
+
+ if (identifiers == NIL)
+ return;
+ if (context->buf->len > 0)
+ appendStringInfoChar(context->buf, '\n');
+ appendStringInfoString(context->buf, "DO_NOT_SCAN(");
+
+ foreach_ptr(pgpa_identifier, rid, identifiers)
+ {
+ if (first)
+ first = false;
+ else
+ {
+ pgpa_maybe_linebreak(context->buf, context->wrap_column);
+ appendStringInfoChar(context->buf, ' ');
+ }
+ appendStringInfoString(context->buf, pgpa_identifier_string(rid));
+ }
+
+ appendStringInfoChar(context->buf, ')');
+}
+
/*
* Output the identifiers for each RTI in the provided set.
*
diff --git a/contrib/pg_plan_advice/pgpa_planner.c b/contrib/pg_plan_advice/pgpa_planner.c
index 70139ff42be..751a255615a 100644
--- a/contrib/pg_plan_advice/pgpa_planner.c
+++ b/contrib/pg_plan_advice/pgpa_planner.c
@@ -164,11 +164,13 @@ static void pgpa_planner_feedback_warning(List *feedback);
static pgpa_planner_info *pgpa_planner_get_proot(pgpa_planner_state *pps,
PlannerInfo *root);
-static inline void pgpa_ri_checker_save(pgpa_planner_state *pps,
- PlannerInfo *root,
- RelOptInfo *rel);
-static void pgpa_ri_checker_validate(pgpa_planner_state *pps,
- PlannedStmt *pstmt);
+static inline void pgpa_compute_rt_identifier(pgpa_planner_info *proot,
+ PlannerInfo *root,
+ RelOptInfo *rel);
+static void pgpa_compute_rt_offsets(pgpa_planner_state *pps,
+ PlannedStmt *pstmt);
+static void pgpa_validate_rt_identifiers(pgpa_planner_state *pps,
+ PlannedStmt *pstmt);
static char *pgpa_bms_to_cstring(Bitmapset *bms);
static const char *pgpa_jointype_to_cstring(JoinType jointype);
@@ -264,20 +266,10 @@ pgpa_planner_setup(PlannerGlobal *glob, Query *parse, const char *query_string,
}
}
-#ifdef USE_ASSERT_CHECKING
-
- /*
- * If asserts are enabled, always build a private state object for
- * cross-checks.
- */
- needs_pps = true;
-#endif
-
/*
* We only create and initialize a private state object if it's needed for
* some purpose. That could be (1) recording that we will need to generate
- * an advice string, (2) storing a trove of supplied advice, or (3)
- * facilitating debugging cross-checks when asserts are enabled.
+ * an advice string or (2) storing a trove of supplied advice.
*
* Currently, the active memory context should be one that will last for
* the entire duration of query planning, but if GEQO is in use, it's
@@ -321,9 +313,16 @@ pgpa_planner_shutdown(PlannerGlobal *glob, Query *parse,
pps = GetPlannerGlobalExtensionState(glob, planner_extension_id);
if (pps != NULL)
{
+ /* Set up some local variables. */
trove = pps->trove;
generate_advice_feedback = pps->generate_advice_feedback;
generate_advice_string = pps->generate_advice_string;
+
+ /* Compute range table offsets. */
+ pgpa_compute_rt_offsets(pps, pstmt);
+
+ /* Cross-check range table identifiers. */
+ pgpa_validate_rt_identifiers(pps, pstmt);
}
/*
@@ -394,13 +393,6 @@ pgpa_planner_shutdown(PlannerGlobal *glob, Query *parse,
lappend(pstmt->extension_state,
makeDefElem("pg_plan_advice", (Node *) pgpa_items, -1));
- /*
- * If assertions are enabled, cross-check the generated range table
- * identifiers.
- */
- if (pps != NULL)
- pgpa_ri_checker_validate(pps, pstmt);
-
/* Pass call to previous hook. */
if (prev_planner_shutdown)
(*prev_planner_shutdown) (glob, parse, query_string, pstmt);
@@ -408,35 +400,38 @@ pgpa_planner_shutdown(PlannerGlobal *glob, Query *parse,
/*
* Hook function for build_simple_rel().
- *
- * We can apply scan advice at this point, and we also use this as an
- * opportunity to do range-table identifier cross-checking in assert-enabled
- * builds.
*/
static void
pgpa_build_simple_rel(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
{
pgpa_planner_state *pps;
+ pgpa_planner_info *proot = NULL;
/* Fetch our private state, set up by pgpa_planner_setup(). */
pps = GetPlannerGlobalExtensionState(root->glob, planner_extension_id);
- /* Save details needed for range table identifier cross-checking. */
+ /*
+ * Look up the pgpa_planner_info for this subquery, and make sure we've
+ * saved a range table identifier.
+ */
if (pps != NULL)
- pgpa_ri_checker_save(pps, root, rel);
+ {
+ proot = pgpa_planner_get_proot(pps, root);
+ pgpa_compute_rt_identifier(proot, root, rel);
+ }
/* If query advice was provided, search for relevant entries. */
if (pps != NULL && pps->trove != NULL)
{
- pgpa_identifier rid;
+ pgpa_identifier *rid;
pgpa_trove_result tresult_scan;
pgpa_trove_result tresult_rel;
/* Search for scan advice and general rel advice. */
- pgpa_compute_identifier_by_rti(root, rel->relid, &rid);
- pgpa_trove_lookup(pps->trove, PGPA_TROVE_LOOKUP_SCAN, 1, &rid,
+ rid = &proot->rid_array[rel->relid - 1];
+ pgpa_trove_lookup(pps->trove, PGPA_TROVE_LOOKUP_SCAN, 1, rid,
&tresult_scan);
- pgpa_trove_lookup(pps->trove, PGPA_TROVE_LOOKUP_REL, 1, &rid,
+ pgpa_trove_lookup(pps->trove, PGPA_TROVE_LOOKUP_REL, 1, rid,
&tresult_rel);
/* If relevant entries were found, apply them. */
@@ -1622,6 +1617,8 @@ pgpa_planner_apply_scan_advice(RelOptInfo *rel,
pgpa_trove_entry *rel_entries,
Bitmapset *rel_indexes)
{
+ const uint64 all_scan_mask = PGS_SCAN_ANY | PGS_APPEND |
+ PGS_MERGE_APPEND | PGS_CONSIDER_INDEXONLY;
bool gather_conflict = false;
Bitmapset *gather_partial_match = NULL;
Bitmapset *gather_full_match = NULL;
@@ -1632,16 +1629,18 @@ pgpa_planner_apply_scan_advice(RelOptInfo *rel,
Bitmapset *scan_type_indexes = NULL;
Bitmapset *scan_type_rel_indexes = NULL;
uint64 gather_mask = 0;
- uint64 scan_type = 0;
+ uint64 scan_type = all_scan_mask; /* sentinel: no advice yet */
/* Scrutinize available scan advice. */
while ((i = bms_next_member(scan_indexes, i)) >= 0)
{
pgpa_trove_entry *my_entry = &scan_entries[i];
- uint64 my_scan_type = 0;
+ uint64 my_scan_type = all_scan_mask;
/* Translate our advice tags to a scan strategy advice value. */
- if (my_entry->tag == PGPA_TAG_BITMAP_HEAP_SCAN)
+ if (my_entry->tag == PGPA_TAG_DO_NOT_SCAN)
+ my_scan_type = 0;
+ else if (my_entry->tag == PGPA_TAG_BITMAP_HEAP_SCAN)
{
/*
* Currently, PGS_CONSIDER_INDEXONLY can suppress Bitmap Heap
@@ -1675,9 +1674,9 @@ pgpa_planner_apply_scan_advice(RelOptInfo *rel,
* INDEX_SCAN(a b.c) as non-conflicting if it happens that the only
* index named c is in schema b, but it doesn't seem worth the code.
*/
- if (my_scan_type != 0)
+ if (my_scan_type != all_scan_mask)
{
- if (scan_type != 0 && scan_type != my_scan_type)
+ if (scan_type != all_scan_mask && scan_type != my_scan_type)
scan_type_conflict = true;
if (!scan_type_conflict && scan_entry != NULL &&
my_entry->target->itarget != NULL &&
@@ -1712,7 +1711,7 @@ pgpa_planner_apply_scan_advice(RelOptInfo *rel,
{
const uint64 my_scan_type = PGS_APPEND | PGS_MERGE_APPEND;
- if (scan_type != 0 && scan_type != my_scan_type)
+ if (scan_type != all_scan_mask && scan_type != my_scan_type)
scan_type_conflict = true;
scan_entry = my_entry;
scan_type = my_scan_type;
@@ -1791,7 +1790,7 @@ pgpa_planner_apply_scan_advice(RelOptInfo *rel,
if (matched_index == NULL)
{
/* Don't force the scan type if the index doesn't exist. */
- scan_type = 0;
+ scan_type = all_scan_mask;
/* Mark advice as inapplicable. */
pgpa_trove_set_flags(scan_entries, scan_type_indexes,
@@ -1835,14 +1834,8 @@ pgpa_planner_apply_scan_advice(RelOptInfo *rel,
* Only clear bits here, so that we still respect the enable_* GUCs. Do
* nothing in cases where the advice on a single topic conflicts.
*/
- if (scan_type != 0 && !scan_type_conflict)
- {
- uint64 all_scan_mask;
-
- all_scan_mask = PGS_SCAN_ANY | PGS_APPEND | PGS_MERGE_APPEND |
- PGS_CONSIDER_INDEXONLY;
+ if (scan_type != all_scan_mask && !scan_type_conflict)
rel->pgs_mask &= ~(all_scan_mask & ~scan_type);
- }
if (gather_mask != 0 && !gather_conflict)
{
uint64 all_gather_mask;
@@ -1998,9 +1991,41 @@ pgpa_planner_get_proot(pgpa_planner_state *pps, PlannerInfo *root)
}
}
- /* Create new object, add to list, and make it most recently used. */
+ /* Create new object. */
new_proot = palloc0_object(pgpa_planner_info);
+
+ /* Set plan name and alternative plan name. */
new_proot->plan_name = root->plan_name;
+ if (root->alternative_root == NULL)
+ new_proot->alternative_plan_name = root->plan_name;
+ else
+ new_proot->alternative_plan_name = root->alternative_root->plan_name;
+
+ /*
+ * If the newly-created proot shares an alternative_plan_name with one or
+ * more others, all should have the is_alternative_plan flag set.
+ */
+ foreach_ptr(pgpa_planner_info, other_proot, pps->proots)
+ {
+ if (strings_equal_or_both_null(new_proot->alternative_plan_name,
+ other_proot->alternative_plan_name))
+ {
+ new_proot->is_alternative_plan = true;
+ other_proot->is_alternative_plan = true;
+ }
+ }
+
+ /*
+ * Outermost query level always has rtoffset 0; other rtoffset values are
+ * computed later.
+ */
+ if (root->plan_name == NULL)
+ {
+ new_proot->has_rtoffset = true;
+ new_proot->rtoffset = 0;
+ }
+
+ /* Add to list and make it most recently used. */
pps->proots = lappend(pps->proots, new_proot);
pps->last_proot = new_proot;
@@ -2008,19 +2033,15 @@ pgpa_planner_get_proot(pgpa_planner_state *pps, PlannerInfo *root)
}
/*
- * Save the range table identifier for one relation for future cross-checking.
+ * Compute the range table identifier for one relation and save it for future
+ * use.
*/
static void
-pgpa_ri_checker_save(pgpa_planner_state *pps, PlannerInfo *root,
- RelOptInfo *rel)
+pgpa_compute_rt_identifier(pgpa_planner_info *proot, PlannerInfo *root,
+ RelOptInfo *rel)
{
-#ifdef USE_ASSERT_CHECKING
- pgpa_planner_info *proot;
pgpa_identifier *rid;
- /* Get the pgpa_planner_info for this PlannerInfo. */
- proot = pgpa_planner_get_proot(pps, root);
-
/* Allocate or extend the proot's rid_array as necessary. */
if (proot->rid_array_size <= rel->relid)
{
@@ -2043,36 +2064,32 @@ pgpa_ri_checker_save(pgpa_planner_state *pps, PlannerInfo *root,
rid = &proot->rid_array[rel->relid - 1];
if (rid->alias_name == NULL)
pgpa_compute_identifier_by_rti(root, rel->relid, rid);
-#endif
}
/*
- * Validate that the range table identifiers we were able to generate during
- * planning match the ones we generated from the final plan.
+ * Compute the range table offset for each pgpa_planner_state for which it
+ * is possible to meaningfully do so.
*/
static void
-pgpa_ri_checker_validate(pgpa_planner_state *pps, PlannedStmt *pstmt)
+pgpa_compute_rt_offsets(pgpa_planner_state *pps, PlannedStmt *pstmt)
{
-#ifdef USE_ASSERT_CHECKING
- pgpa_identifier *rt_identifiers;
- Index rtable_length = list_length(pstmt->rtable);
-
- /* Create identifiers from the planned statement. */
- rt_identifiers = pgpa_create_identifiers_for_planned_stmt(pstmt);
-
- /* Iterate over identifiers created during planning, so we can compare. */
foreach_ptr(pgpa_planner_info, proot, pps->proots)
{
- int rtoffset = 0;
+ /* For the top query level, we've previously set rtoffset 0. */
+ if (proot->plan_name == NULL)
+ {
+ Assert(proot->has_rtoffset);
+ continue;
+ }
/*
- * If there's no plan name associated with this entry, then the
- * rtoffset is 0. Otherwise, we can search the SubPlanRTInfo list to
- * find the rtoffset.
+ * It's not guaranteed that every plan name we saw during planning has
+ * a SubPlanInfo, but any that do not certainly don't appear in the
+ * final range table.
*/
- if (proot->plan_name != NULL)
+ foreach_node(SubPlanRTInfo, rtinfo, pstmt->subrtinfos)
{
- foreach_node(SubPlanRTInfo, rtinfo, pstmt->subrtinfos)
+ if (strcmp(proot->plan_name, rtinfo->plan_name) == 0)
{
/*
* If rtinfo->dummy is set, then the subquery's range table
@@ -2081,30 +2098,51 @@ pgpa_ri_checker_validate(pgpa_planner_state *pps, PlannedStmt *pstmt)
* RTE_SUBQUERY entries that were once RTE_RELATION entries
* will be copied, as per add_rtes_to_flat_rtable. Therefore,
* there's no fixed rtoffset that we can apply to the RTIs
- * used during planning to locate the corresponding relations
+ * used during planning to locate the corresponding relations.
*/
- if (strcmp(proot->plan_name, rtinfo->plan_name) == 0
- && !rtinfo->dummy)
+ if (rtinfo->dummy)
{
- rtoffset = rtinfo->rtoffset;
- Assert(rtoffset > 0);
+ /*
+ * It will not be possible to make any effective use of the
+ * sj_unique_rels list in this case, and it also won't be
+ * important to do so. So just throw the list away to avoid
+ * confusing pgpa_plan_walker.
+ */
+ proot->sj_unique_rels = NIL;
break;
}
+ Assert(!proot->has_rtoffset);
+ proot->has_rtoffset = true;
+ proot->rtoffset = rtinfo->rtoffset;
+ break;
}
-
- /*
- * It's not an error if we don't find the plan name: that just
- * means that we planned a subplan by this name but it ended up
- * being a dummy subplan and so wasn't included in the final plan
- * tree.
- */
- if (rtoffset == 0)
- continue;
}
+ }
+}
+
+/*
+ * Validate that the range table identifiers we were able to generate during
+ * planning match the ones we generated from the final plan.
+ */
+static void
+pgpa_validate_rt_identifiers(pgpa_planner_state *pps, PlannedStmt *pstmt)
+{
+#ifdef USE_ASSERT_CHECKING
+ pgpa_identifier *rt_identifiers;
+ Index rtable_length = list_length(pstmt->rtable);
+
+ /* Create identifiers from the planned statement. */
+ rt_identifiers = pgpa_create_identifiers_for_planned_stmt(pstmt);
+
+ /* Iterate over identifiers created during planning, so we can compare. */
+ foreach_ptr(pgpa_planner_info, proot, pps->proots)
+ {
+ if (!proot->has_rtoffset)
+ continue;
for (int rti = 1; rti <= proot->rid_array_size; ++rti)
{
- Index flat_rti = rtoffset + rti;
+ Index flat_rti = proot->rtoffset + rti;
pgpa_identifier *rid1 = &proot->rid_array[rti - 1];
pgpa_identifier *rid2;
diff --git a/contrib/pg_plan_advice/pgpa_planner.h b/contrib/pg_plan_advice/pgpa_planner.h
index e9045f69bca..93fda2055b2 100644
--- a/contrib/pg_plan_advice/pgpa_planner.h
+++ b/contrib/pg_plan_advice/pgpa_planner.h
@@ -24,11 +24,33 @@ typedef struct pgpa_planner_info
/* Plan name taken from the corresponding PlannerInfo; NULL at top level. */
char *plan_name;
-#ifdef USE_ASSERT_CHECKING
+ /*
+ * If the corresponding PlannerInfo has an alternative_root, then this is
+ * the plan name from that PlannerInfo; otherwise, it is the same as
+ * plan_name.
+ *
+ * is_alternative_plan is set to true for every pgpa_planner_info that
+ * shares an alternative_plan_name with at least one other, and to false
+ * otherwise.
+ */
+ char *alternative_plan_name;
+ bool is_alternative_plan;
+
/* Relation identifiers computed for baserels at this query level. */
pgpa_identifier *rid_array;
int rid_array_size;
-#endif
+
+ /*
+ * If has_rtoffset is true, then rtoffset is the offset required to align
+ * RTIs for this query level with RTIs from the final, flattened rangetable.
+ * If has_rtoffset is false, then this subquery's range table wasn't copied,
+ * or was only partially copied, into the final range table. (Note that
+ * we can't determine the rtoffset values until the final range table
+ * actually exists; before that time, has_rtoffset will be false everywhere
+ * except at the top level.)
+ */
+ bool has_rtoffset;
+ Index rtoffset;
/*
* List of Bitmapset objects. Each represents the relid set of a relation
diff --git a/contrib/pg_plan_advice/pgpa_trove.c b/contrib/pg_plan_advice/pgpa_trove.c
index 634ec5c4c6e..7ade0b5ca9c 100644
--- a/contrib/pg_plan_advice/pgpa_trove.c
+++ b/contrib/pg_plan_advice/pgpa_trove.c
@@ -162,6 +162,7 @@ pgpa_build_trove(List *advice_items)
break;
case PGPA_TAG_BITMAP_HEAP_SCAN:
+ case PGPA_TAG_DO_NOT_SCAN:
case PGPA_TAG_INDEX_ONLY_SCAN:
case PGPA_TAG_INDEX_SCAN:
case PGPA_TAG_SEQ_SCAN:
diff --git a/contrib/pg_plan_advice/pgpa_walker.c b/contrib/pg_plan_advice/pgpa_walker.c
index 6fbc784bf54..0a4512d4921 100644
--- a/contrib/pg_plan_advice/pgpa_walker.c
+++ b/contrib/pg_plan_advice/pgpa_walker.c
@@ -59,6 +59,10 @@ static bool pgpa_walker_contains_join(pgpa_plan_walker_context *walker,
Bitmapset *relids);
static bool pgpa_walker_contains_no_gather(pgpa_plan_walker_context *walker,
Bitmapset *relids);
+static void pgpa_classify_alternative_subplans(pgpa_plan_walker_context *walker,
+ List *proots,
+ List **chosen_proots,
+ List **discarded_proots);
/*
* Top-level entrypoint for the plan tree walk.
@@ -75,6 +79,8 @@ pgpa_plan_walker(pgpa_plan_walker_context *walker, PlannedStmt *pstmt,
ListCell *lc;
List *sj_unique_rtis = NULL;
List *sj_nonunique_qfs = NULL;
+ List *chosen_proots;
+ List *discarded_proots;
/* Initialization. */
memset(walker, 0, sizeof(pgpa_plan_walker_context));
@@ -95,42 +101,23 @@ pgpa_plan_walker(pgpa_plan_walker_context *walker, PlannedStmt *pstmt,
/* Adjust RTIs from sj_unique_rels for the flattened range table. */
foreach_ptr(pgpa_planner_info, proot, proots)
{
- int rtoffset = 0;
- bool dummy = false;
-
/* If there are no sj_unique_rels for this proot, we can skip it. */
if (proot->sj_unique_rels == NIL)
continue;
/* If this is a subplan, find the range table offset. */
- if (proot->plan_name != NULL)
- {
- foreach_node(SubPlanRTInfo, rtinfo, pstmt->subrtinfos)
- {
- if (strcmp(proot->plan_name, rtinfo->plan_name) == 0)
- {
- rtoffset = rtinfo->rtoffset;
- dummy = rtinfo->dummy;
- break;
- }
- }
-
- if (rtoffset == 0)
- elog(ERROR, "no rtoffset for plan %s", proot->plan_name);
- }
+ if (!proot->has_rtoffset)
+ elog(ERROR, "no rtoffset for plan %s", proot->plan_name);
- /* If this entry pertains to a dummy subquery, ignore it. */
- if (dummy)
- continue;
-
- /* Offset each relid set by the rtoffset we just computed. */
+ /* Offset each relid set by the proot's rtoffset. */
foreach_node(Bitmapset, relids, proot->sj_unique_rels)
{
int rtindex = -1;
Bitmapset *flat_relids = NULL;
while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
- flat_relids = bms_add_member(flat_relids, rtindex + rtoffset);
+ flat_relids = bms_add_member(flat_relids,
+ rtindex + proot->rtoffset);
sj_unique_rtis = lappend(sj_unique_rtis, flat_relids);
}
@@ -193,6 +180,42 @@ pgpa_plan_walker(pgpa_plan_walker_context *walker, PlannedStmt *pstmt,
walker->query_features[t] = query_features;
}
+
+ /* Classify alternative subplans. */
+ pgpa_classify_alternative_subplans(walker, proots,
+ &chosen_proots, &discarded_proots);
+
+ /*
+ * Figure out which of the discarded alternatives have a non-discarded
+ * alternative. Those are the ones for which we want to emit DO_NOT_SCAN
+ * advice. (If every alternative was discarded, then there's no point.)
+ */
+ foreach_ptr(pgpa_planner_info, discarded_proot, discarded_proots)
+ {
+ bool some_alternative_chosen = false;
+
+ foreach_ptr(pgpa_planner_info, chosen_proot, chosen_proots)
+ {
+ if (strings_equal_or_both_null(discarded_proot->alternative_plan_name,
+ chosen_proot->alternative_plan_name))
+ {
+ some_alternative_chosen = true;
+ break;
+ }
+ }
+
+ if (some_alternative_chosen)
+ {
+ for (int rti = 1; rti <= discarded_proot->rid_array_size; rti++)
+ {
+ pgpa_identifier *rid = &discarded_proot->rid_array[rti - 1];
+
+ if (rid->alias_name != NULL)
+ walker->do_not_scan_identifiers =
+ lappend(walker->do_not_scan_identifiers, rid);
+ }
+ }
+ }
}
/*
@@ -697,6 +720,30 @@ pgpa_walker_would_advise(pgpa_plan_walker_context *walker,
return false;
}
+ /*
+ * DO_NOT_SCAN advice targets rels that may not be in the flat range table
+ * (e.g. MinMaxAgg losers), so we can't use pgpa_compute_rti_from_identifier.
+ * Instead, check directly against the do_not_scan_identifiers list.
+ */
+ if (tag == PGPA_TAG_DO_NOT_SCAN)
+ {
+ if (target->ttype != PGPA_TARGET_IDENTIFIER)
+ return false;
+ foreach_ptr(pgpa_identifier, rid, walker->do_not_scan_identifiers)
+ {
+ if (strcmp(rid->alias_name, target->rid.alias_name) == 0 &&
+ rid->occurrence == target->rid.occurrence &&
+ strings_equal_or_both_null(rid->partnsp,
+ target->rid.partnsp) &&
+ strings_equal_or_both_null(rid->partrel,
+ target->rid.partrel) &&
+ strings_equal_or_both_null(rid->plan_name,
+ target->rid.plan_name))
+ return true;
+ }
+ return false;
+ }
+
if (target->ttype == PGPA_TARGET_IDENTIFIER)
{
Index rti;
@@ -730,6 +777,10 @@ pgpa_walker_would_advise(pgpa_plan_walker_context *walker,
/* should have been handled above */
pg_unreachable();
break;
+ case PGPA_TAG_DO_NOT_SCAN:
+ /* should have been handled above */
+ pg_unreachable();
+ break;
case PGPA_TAG_BITMAP_HEAP_SCAN:
return pgpa_walker_find_scan(walker,
PGPA_SCAN_BITMAP_HEAP,
@@ -1035,3 +1086,60 @@ pgpa_walker_contains_no_gather(pgpa_plan_walker_context *walker,
{
return bms_is_subset(relids, walker->no_gather_scans);
}
+
+/*
+ * Classify alternative subplans as chosen or discarded.
+ */
+static void
+pgpa_classify_alternative_subplans(pgpa_plan_walker_context *walker,
+ List *proots,
+ List **chosen_proots,
+ List **discarded_proots)
+{
+ Bitmapset *all_scan_rtis = NULL;
+
+ /* Initialize both output lists to empty. */
+ *chosen_proots = NIL;
+ *discarded_proots = NIL;
+
+ /* Collect all scan RTIs. */
+ for (int s = 0; s < NUM_PGPA_SCAN_STRATEGY; s++)
+ foreach_ptr(pgpa_scan, scan, walker->scans[s])
+ all_scan_rtis = bms_add_members(all_scan_rtis, scan->relids);
+
+ /* Now classify each subplan. */
+ foreach_ptr(pgpa_planner_info, proot, proots)
+ {
+ bool chosen = false;
+
+ /*
+ * We're only interested in classifying subplans for which there are
+ * alternatives.
+ */
+ if (!proot->is_alternative_plan)
+ continue;
+
+ /*
+ * A subplan has been chosen if any of its scan RTIs appear in the
+ * final plan. This cannot be the case if it has no RT offset.
+ */
+ if (proot->has_rtoffset)
+ {
+ for (int rti = 1; rti <= proot->rid_array_size; rti++)
+ {
+ if (proot->rid_array[rti - 1].alias_name != NULL &&
+ bms_is_member(proot->rtoffset + rti, all_scan_rtis))
+ {
+ chosen = true;
+ break;
+ }
+ }
+ }
+
+ /* Add it to the correct list. */
+ if (chosen)
+ *chosen_proots = lappend(*chosen_proots, proot);
+ else
+ *discarded_proots = lappend(*discarded_proots, proot);
+ }
+}
diff --git a/contrib/pg_plan_advice/pgpa_walker.h b/contrib/pg_plan_advice/pgpa_walker.h
index 9b74cd3ba55..47667c03374 100644
--- a/contrib/pg_plan_advice/pgpa_walker.h
+++ b/contrib/pg_plan_advice/pgpa_walker.h
@@ -100,6 +100,7 @@ typedef struct pgpa_plan_walker_context
List *join_strategies[NUM_PGPA_JOIN_STRATEGY];
List *query_features[NUM_PGPA_QF_TYPES];
List *future_query_features;
+ List *do_not_scan_identifiers;
} pgpa_plan_walker_context;
extern void pgpa_plan_walker(pgpa_plan_walker_context *walker,
diff --git a/contrib/pg_plan_advice/sql/alternatives.sql b/contrib/pg_plan_advice/sql/alternatives.sql
new file mode 100644
index 00000000000..16299edd196
--- /dev/null
+++ b/contrib/pg_plan_advice/sql/alternatives.sql
@@ -0,0 +1,58 @@
+LOAD 'pg_plan_advice';
+SET max_parallel_workers_per_gather = 0;
+
+CREATE TABLE alt_t1 (a int) WITH (autovacuum_enabled = false);
+CREATE TABLE alt_t2 (a int) WITH (autovacuum_enabled = false);
+CREATE INDEX ON alt_t2(a);
+INSERT INTO alt_t1 SELECT generate_series(1, 1000);
+INSERT INTO alt_t2 SELECT generate_series(1, 100000);
+VACUUM ANALYZE alt_t1;
+VACUUM ANALYZE alt_t2;
+
+-- This query uses an OR to prevent the EXISTS from being converted to a
+-- semi-join, forcing the planner through the AlternativeSubPlan path.
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT * FROM alt_t1
+WHERE EXISTS (SELECT 1 FROM alt_t2 WHERE alt_t2.a = alt_t1.a) OR alt_t1.a < 0;
+
+-- We should be able to force either AlternativeSubPlan by advising against
+-- scanning the other relation.
+BEGIN;
+SET LOCAL pg_plan_advice.advice = 'DO_NOT_SCAN(alt_t2@exists_1)';
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT * FROM alt_t1
+WHERE EXISTS (SELECT 1 FROM alt_t2 WHERE alt_t2.a = alt_t1.a) OR alt_t1.a < 0;
+SET LOCAL pg_plan_advice.advice = 'DO_NOT_SCAN(alt_t2@exists_2)';
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT * FROM alt_t1
+WHERE EXISTS (SELECT 1 FROM alt_t2 WHERE alt_t2.a = alt_t1.a) OR alt_t1.a < 0;
+COMMIT;
+
+-- Now let's test a case involving MinMaxAggPath, which we treat similarly
+-- to the AlternativeSubPlan case.
+CREATE TABLE alt_minmax (a int) WITH (autovacuum_enabled = false);
+CREATE INDEX ON alt_minmax(a);
+INSERT INTO alt_minmax SELECT generate_series(1, 10000);
+VACUUM ANALYZE alt_minmax;
+
+-- Using an Index Scan inside of an InitPlan should win over a full table
+-- scan.
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT min(a), max(a) FROM alt_minmax;
+
+-- Advising against the scan of alt_minmax at the root query level should
+-- change nothing, but if we say we don't want either of or both of the
+-- minmax-variant scans, the plan should switch to a full table scan.
+BEGIN;
+SET LOCAL pg_plan_advice.advice = 'DO_NOT_SCAN(alt_minmax)';
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT min(a), max(a) FROM alt_minmax;
+SET LOCAL pg_plan_advice.advice = 'DO_NOT_SCAN(alt_minmax@minmax_1)';
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT min(a), max(a) FROM alt_minmax;
+SET LOCAL pg_plan_advice.advice = 'DO_NOT_SCAN(alt_minmax@minmax_1) DO_NOT_SCAN(alt_minmax@minmax_2)';
+EXPLAIN (COSTS OFF, PLAN_ADVICE)
+SELECT min(a), max(a) FROM alt_minmax;
+COMMIT;
+
+DROP TABLE alt_t1, alt_t2, alt_minmax;
diff --git a/contrib/pg_plan_advice/sql/scan.sql b/contrib/pg_plan_advice/sql/scan.sql
index 4fc494c7d8e..800ff7a4622 100644
--- a/contrib/pg_plan_advice/sql/scan.sql
+++ b/contrib/pg_plan_advice/sql/scan.sql
@@ -79,7 +79,8 @@ COMMIT;
-- We can force a primary key lookup to use a sequential scan, but we
-- can't force it to use an index-only scan (due to the column list)
--- or a TID scan (due to the absence of a TID qual).
+-- or a TID scan (due to the absence of a TID qual). If we apply DO_NOT_SCAN
+-- here, we should get a valid plan anyway, but with the scan disabled.
BEGIN;
SET LOCAL pg_plan_advice.advice = 'SEQ_SCAN(scan_table)';
EXPLAIN (COSTS OFF, PLAN_ADVICE) SELECT * FROM scan_table WHERE a = 1;
@@ -87,6 +88,8 @@ SET LOCAL pg_plan_advice.advice = 'INDEX_ONLY_SCAN(scan_table scan_table_pkey)';
EXPLAIN (COSTS OFF, PLAN_ADVICE) SELECT * FROM scan_table WHERE a = 1;
SET LOCAL pg_plan_advice.advice = 'TID_SCAN(scan_table)';
EXPLAIN (COSTS OFF, PLAN_ADVICE) SELECT * FROM scan_table WHERE a = 1;
+SET LOCAL pg_plan_advice.advice = 'DO_NOT_SCAN(scan_table)';
+EXPLAIN (COSTS OFF, PLAN_ADVICE) SELECT * FROM scan_table WHERE a = 1;
COMMIT;
-- We can forcibly downgrade an index-only scan to an index scan, but we can't
diff --git a/doc/src/sgml/pgplanadvice.sgml b/doc/src/sgml/pgplanadvice.sgml
index 8df8a978ecf..c3e1ccb60a2 100644
--- a/doc/src/sgml/pgplanadvice.sgml
+++ b/doc/src/sgml/pgplanadvice.sgml
@@ -267,7 +267,8 @@ TID_SCAN(<replaceable>target</replaceable> [ ... ])
INDEX_SCAN(<replaceable>target</replaceable> <replaceable>index_name</replaceable> [ ... ])
INDEX_ONLY_SCAN(<replaceable>target</replaceable> <replaceable>index_name</replaceable> [ ... ])
FOREIGN_SCAN((<replaceable>target</replaceable> [ ... ]) [ ... ])
-BITMAP_HEAP_SCAN(<replaceable>target</replaceable> [ ... ])</synopsis>
+BITMAP_HEAP_SCAN(<replaceable>target</replaceable> [ ... ])
+DO_NOT_SCAN(<replaceable>target</replaceable> [ ... ])</synopsis>
<para>
<literal>SEQ_SCAN</literal> specifies that each target should be
@@ -297,6 +298,17 @@ BITMAP_HEAP_SCAN(<replaceable>target</replaceable> [ ... ])</synopsis>
that purpose.
</para>
+ <para>
+ <literal>DO_NOT_SCAN</literal> specifies that a particular target
+ should not appear in the final plan at all. In most cases, this is
+ impossible, and will simply cause the scan of the target relation to
+ be marked disabled. However, in certain cases, the planner considers
+ optimizations where a portion of the plan tree is copied and mutated,
+ and then considered as an alternative to the original. In those cases,
+ <literal>DO_NOT_SCAN</literal> can be used to exclude the non-preferred
+ alternative.
+ </para>
+
<para>
The planner supports many types of scans other than those listed here;
however, in most of those cases, there is no meaningful decision to be
--
2.51.0
[application/octet-stream] v22-0001-Respect-disabled_nodes-in-fix_alternative_subpla.patch (3.4K, 5-v22-0001-Respect-disabled_nodes-in-fix_alternative_subpla.patch)
download | inline diff:
From e892484828a09ce6ca6f8943c7b3e650a53050ce Mon Sep 17 00:00:00 2001
From: Robert Haas <[email protected]>
Date: Fri, 20 Mar 2026 14:04:41 -0400
Subject: [PATCH v22 1/6] Respect disabled_nodes in fix_alternative_subplan.
When my commit 12444183e40187a9fb6002a3912053f302725f0a added the
concept of disabled_nodes, it failed to add a disabled_nodes field
to SubPlan. This is a regression: before that commit, when
fix_alternative_subplan compared the costs of two plans, the number
of disabled nodes affected the result, because it was just a
component of the total cost. After that commit, it no longer did,
making it possible for a disabled path to win on cost over one that
is not disabled. Fix that.
---
src/backend/optimizer/path/costsize.c | 1 +
src/backend/optimizer/plan/setrefs.c | 14 ++++++++++----
src/include/nodes/primnodes.h | 1 +
3 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 56d45287c89..1c575e56ff6 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -4761,6 +4761,7 @@ cost_subplan(PlannerInfo *root, SubPlan *subplan, Plan *plan)
sp_cost.per_tuple += plan->startup_cost;
}
+ subplan->disabled_nodes = plan->disabled_nodes;
subplan->startup_cost = sp_cost.startup;
subplan->per_call_cost = sp_cost.per_tuple;
}
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1b5b9b5ed9c..ff0e875f2a2 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -2234,9 +2234,12 @@ fix_alternative_subplan(PlannerInfo *root, AlternativeSubPlan *asplan,
/*
* Compute the estimated cost of each subplan assuming num_exec
- * executions, and keep the cheapest one. In event of exact equality of
- * estimates, we prefer the later plan; this is a bit arbitrary, but in
- * current usage it biases us to break ties against fast-start subplans.
+ * executions, and keep the cheapest one. If one subplan has more
+ * disabled nodes than another, choose the one with fewer disabled nodes
+ * regardless of cost; this parallels compare_path_costs. In event of
+ * exact equality of estimates, we prefer the later plan; this is a bit
+ * arbitrary, but in current usage it biases us to break ties against
+ * fast-start subplans.
*/
Assert(asplan->subplans != NIL);
@@ -2246,7 +2249,10 @@ fix_alternative_subplan(PlannerInfo *root, AlternativeSubPlan *asplan,
Cost curcost;
curcost = curplan->startup_cost + num_exec * curplan->per_call_cost;
- if (bestplan == NULL || curcost <= bestcost)
+ if (bestplan == NULL ||
+ curplan->disabled_nodes < bestplan->disabled_nodes ||
+ (curplan->disabled_nodes == bestplan->disabled_nodes &&
+ curcost <= bestcost))
{
bestplan = curplan;
bestcost = curcost;
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index b67e56e6c5a..f5b6b45664a 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -1124,6 +1124,7 @@ typedef struct SubPlan
List *parParam; /* indices of input Params from parent plan */
List *args; /* exprs to pass as parParam values */
/* Estimated execution costs: */
+ int disabled_nodes; /* count of disabled nodes in the plan */
Cost startup_cost; /* one-time setup cost */
Cost per_call_cost; /* cost for each subplan evaluation */
} SubPlan;
--
2.51.0
[application/octet-stream] v22-0005-Add-pg_collect_advice-contrib-module.patch (56.5K, 6-v22-0005-Add-pg_collect_advice-contrib-module.patch)
download | inline diff:
From bdee22713c7956fc09af3083b05c26954ae6a97f Mon Sep 17 00:00:00 2001
From: Robert Haas <[email protected]>
Date: Thu, 26 Feb 2026 16:51:16 -0500
Subject: [PATCH v22 5/6] Add pg_collect_advice contrib module.
This module allows for bulk collection of queries and the associated
plan advice strings using either backend-local memory or dynamic
shared memory. In either case, memory usage can be limited by
restriction the maximum number of queries and advice strings stored.
Care should be taken with these values, and with the use of this
module in general, because it's easy to chew up an unreasonably large
amount of memory. Unlike pg_stat_statements, this module does not
provide for query normalization or even deduplication; it simply makes
a record for every query planned.
It can be useful to enable query ID computaton before using the
module, but it's not required. If not done, all queries will simply
show a query ID of zero.
Reviewed-by: Alexandra Wang <[email protected]>
---
contrib/Makefile | 1 +
contrib/meson.build | 1 +
contrib/pg_collect_advice/Makefile | 26 +
contrib/pg_collect_advice/collector.c | 649 ++++++++++++++++++
.../expected/local_collector.out | 69 ++
contrib/pg_collect_advice/interface.c | 303 ++++++++
contrib/pg_collect_advice/meson.build | 41 ++
.../pg_collect_advice--1.0.sql | 43 ++
.../pg_collect_advice.control | 5 +
contrib/pg_collect_advice/pg_collect_advice.h | 39 ++
.../pg_collect_advice/sql/local_collector.sql | 46 ++
.../t/001_shared_collector.pl | 154 +++++
doc/src/sgml/contrib.sgml | 1 +
doc/src/sgml/filelist.sgml | 1 +
doc/src/sgml/pgcollectadvice.sgml | 244 +++++++
src/tools/pgindent/typedefs.list | 6 +
16 files changed, 1629 insertions(+)
create mode 100644 contrib/pg_collect_advice/Makefile
create mode 100644 contrib/pg_collect_advice/collector.c
create mode 100644 contrib/pg_collect_advice/expected/local_collector.out
create mode 100644 contrib/pg_collect_advice/interface.c
create mode 100644 contrib/pg_collect_advice/meson.build
create mode 100644 contrib/pg_collect_advice/pg_collect_advice--1.0.sql
create mode 100644 contrib/pg_collect_advice/pg_collect_advice.control
create mode 100644 contrib/pg_collect_advice/pg_collect_advice.h
create mode 100644 contrib/pg_collect_advice/sql/local_collector.sql
create mode 100644 contrib/pg_collect_advice/t/001_shared_collector.pl
create mode 100644 doc/src/sgml/pgcollectadvice.sgml
diff --git a/contrib/Makefile b/contrib/Makefile
index dd04c20acd2..22071034e51 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -31,6 +31,7 @@ SUBDIRS = \
pageinspect \
passwordcheck \
pg_buffercache \
+ pg_collect_advice \
pg_freespacemap \
pg_logicalinspect \
pg_overexplain \
diff --git a/contrib/meson.build b/contrib/meson.build
index 5a752eac347..ff422d9b7fc 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -45,6 +45,7 @@ subdir('pageinspect')
subdir('passwordcheck')
subdir('pg_buffercache')
subdir('pgcrypto')
+subdir('pg_collect_advice')
subdir('pg_freespacemap')
subdir('pg_logicalinspect')
subdir('pg_overexplain')
diff --git a/contrib/pg_collect_advice/Makefile b/contrib/pg_collect_advice/Makefile
new file mode 100644
index 00000000000..594c1bf82b2
--- /dev/null
+++ b/contrib/pg_collect_advice/Makefile
@@ -0,0 +1,26 @@
+# contrib/pg_collect_advice/Makefile
+
+MODULE_big = pg_collect_advice
+OBJS = \
+ $(WIN32RES) \
+ collector.o \
+ interface.o
+
+EXTENSION = pg_collect_advice
+DATA = pg_collect_advice--1.0.sql
+PGFILEDESC = "pg_collect_advice - collect queries and their plan advice strings"
+
+REGRESS = local_collector
+EXTRA_INSTALL = contrib/pg_plan_advice
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/pg_collect_advice
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/pg_collect_advice/collector.c b/contrib/pg_collect_advice/collector.c
new file mode 100644
index 00000000000..d9fc3238fbd
--- /dev/null
+++ b/contrib/pg_collect_advice/collector.c
@@ -0,0 +1,649 @@
+/*-------------------------------------------------------------------------
+ *
+ * collector.c
+ * workhorse for saving plan advice in backend-local or shared memory
+ *
+ * Copyright (c) 2016-2026, PostgreSQL Global Development Group
+ *
+ * contrib/pg_collect_advice/collector.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "pg_collect_advice.h"
+
+#include "datatype/timestamp.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "nodes/pg_list.h"
+#include "utils/acl.h"
+#include "utils/builtins.h"
+#include "utils/timestamp.h"
+#include "utils/tuplestore.h"
+
+PG_FUNCTION_INFO_V1(pg_clear_collected_local_advice);
+PG_FUNCTION_INFO_V1(pg_clear_collected_shared_advice);
+PG_FUNCTION_INFO_V1(pg_get_collected_local_advice);
+PG_FUNCTION_INFO_V1(pg_get_collected_shared_advice);
+
+#define ADVICE_CHUNK_SIZE 1024
+#define ADVICE_CHUNK_ARRAY_SIZE 64
+
+#define PG_GET_ADVICE_COLUMNS 7
+
+/*
+ * Advice extracted from one query plan, together with the query string
+ * and various other identifying details.
+ */
+typedef struct pgca_collected_advice
+{
+ Oid userid; /* user OID */
+ Oid dbid; /* database OID */
+ uint64 queryid; /* query identifier */
+ TimestampTz timestamp; /* query timestamp */
+ int advice_offset; /* start of advice in textual data */
+ char textual_data[FLEXIBLE_ARRAY_MEMBER];
+} pgca_collected_advice;
+
+/*
+ * A bunch of pointers to pgca_collected_advice objects, stored in
+ * backend-local memory.
+ */
+typedef struct pgca_local_advice_chunk
+{
+ pgca_collected_advice *entries[ADVICE_CHUNK_SIZE];
+} pgca_local_advice_chunk;
+
+/*
+ * Information about all of the pgca_collected_advice objects that we're
+ * storing in local memory.
+ *
+ * We assign consecutive IDs, starting from 0, to each pgca_collected_advice
+ * object that we store. The actual storage is an array of chunks, which
+ * helps keep memcpy() overhead low when we start discarding older data.
+ */
+typedef struct pgca_local_advice
+{
+ uint64 next_id;
+ uint64 oldest_id;
+ uint64 base_id;
+ int chunk_array_allocated_size;
+ pgca_local_advice_chunk **chunks;
+} pgca_local_advice;
+
+/*
+ * Just like pgca_local_advice_chunk, but stored in a dynamic shared area,
+ * so we must use dsa_pointer instead of native pointers.
+ */
+typedef struct pgca_shared_advice_chunk
+{
+ dsa_pointer entries[ADVICE_CHUNK_SIZE];
+} pgca_shared_advice_chunk;
+
+/*
+ * Just like pgca_local_advice, but stored in a dynamic shared area, so
+ * we must use dsa_pointer instead of native pointers.
+ */
+typedef struct pgca_shared_advice
+{
+ uint64 next_id;
+ uint64 oldest_id;
+ uint64 base_id;
+ int chunk_array_allocated_size;
+ dsa_pointer chunks;
+} pgca_shared_advice;
+
+/* Pointers to local and shared collectors */
+static pgca_local_advice *local_collector = NULL;
+static pgca_shared_advice *shared_collector = NULL;
+
+/* Static functions */
+static pgca_collected_advice *make_collected_advice(Oid userid,
+ Oid dbid,
+ uint64 queryId,
+ TimestampTz timestamp,
+ const char *query_string,
+ const char *advice_string,
+ dsa_area *area,
+ dsa_pointer *result);
+static void store_local_advice(pgca_collected_advice *ca);
+static void trim_local_advice(int limit);
+static void store_shared_advice(dsa_pointer ca_pointer);
+static void trim_shared_advice(dsa_area *area, int limit);
+
+/* Helper function to extract the query string from pgca_collected_advice */
+static inline const char *
+query_string(pgca_collected_advice *ca)
+{
+ return ca->textual_data;
+}
+
+/* Helper function to extract the advice string from pgca_collected_advice */
+static inline const char *
+advice_string(pgca_collected_advice *ca)
+{
+ return ca->textual_data + ca->advice_offset;
+}
+
+/*
+ * Store collected query advice into the local or shared advice collector,
+ * as appropriate.
+ */
+void
+pg_collect_advice_save(uint64 queryId, const char *query_string,
+ const char *advice_string)
+{
+ Oid userid = GetUserId();
+ Oid dbid = MyDatabaseId;
+ TimestampTz now = GetCurrentTimestamp();
+
+ if (pg_collect_advice_local_collector &&
+ pg_collect_advice_local_collection_limit > 0)
+ {
+ pgca_collected_advice *ca;
+ MemoryContext oldcontext;
+
+ oldcontext = MemoryContextSwitchTo(pg_collect_advice_get_mcxt());
+ ca = make_collected_advice(userid, dbid, queryId, now,
+ query_string, advice_string,
+ NULL, NULL);
+ store_local_advice(ca);
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ if (pg_collect_advice_shared_collector &&
+ pg_collect_advice_shared_collection_limit > 0)
+ {
+ dsa_area *area = pg_collect_advice_dsa_area();
+ dsa_pointer ca_pointer = InvalidDsaPointer; /* placate compiler */
+
+ make_collected_advice(userid, dbid, queryId, now,
+ query_string, advice_string, area,
+ &ca_pointer);
+ store_shared_advice(ca_pointer);
+ }
+}
+
+/*
+ * Allocate and fill a new pgca_collected_advice object.
+ *
+ * If area != NULL, it is used to allocate the new object, and the resulting
+ * dsa_pointer is returned via *result.
+ *
+ * If area == NULL, the new object is allocated in the current memory context,
+ * and result is not examined or modified.
+ */
+static pgca_collected_advice *
+make_collected_advice(Oid userid, Oid dbid, uint64 queryId,
+ TimestampTz timestamp,
+ const char *query_string,
+ const char *advice_string,
+ dsa_area *area, dsa_pointer *result)
+{
+ size_t query_string_length = strlen(query_string) + 1;
+ size_t advice_string_length = strlen(advice_string) + 1;
+ size_t total_length;
+ pgca_collected_advice *ca;
+
+ total_length = offsetof(pgca_collected_advice, textual_data)
+ + query_string_length + advice_string_length;
+
+ if (area == NULL)
+ ca = palloc(total_length);
+ else
+ {
+ *result = dsa_allocate(area, total_length);
+ ca = dsa_get_address(area, *result);
+ }
+
+ ca->userid = userid;
+ ca->dbid = dbid;
+ ca->queryid = queryId;
+ ca->timestamp = timestamp;
+ ca->advice_offset = query_string_length;
+
+ memcpy(ca->textual_data, query_string, query_string_length);
+ memcpy(&ca->textual_data[ca->advice_offset],
+ advice_string, advice_string_length);
+
+ return ca;
+}
+
+/*
+ * Add a pgca_collected_advice object to our backend-local advice collection.
+ *
+ * Caller is responsible for switching to the appropriate memory context;
+ * the provided object should have been allocated in that same context.
+ */
+static void
+store_local_advice(pgca_collected_advice *ca)
+{
+ uint64 chunk_number;
+ uint64 chunk_offset;
+ pgca_local_advice *la = local_collector;
+
+ /* If the local advice collector isn't initialized yet, do that now. */
+ if (la == NULL)
+ {
+ la = palloc0(sizeof(pgca_local_advice));
+ la->chunk_array_allocated_size = ADVICE_CHUNK_ARRAY_SIZE;
+ la->chunks = palloc0_array(pgca_local_advice_chunk *,
+ la->chunk_array_allocated_size);
+ local_collector = la;
+ }
+
+ /* Compute chunk and offset at which to store this advice. */
+ chunk_number = (la->next_id - la->base_id) / ADVICE_CHUNK_SIZE;
+ chunk_offset = (la->next_id - la->base_id) % ADVICE_CHUNK_SIZE;
+
+ /* Extend chunk array, if needed. */
+ if (chunk_number >= la->chunk_array_allocated_size)
+ {
+ int new_size;
+
+ new_size = la->chunk_array_allocated_size + ADVICE_CHUNK_ARRAY_SIZE;
+ la->chunks = repalloc0_array(la->chunks,
+ pgca_local_advice_chunk *,
+ la->chunk_array_allocated_size,
+ new_size);
+ la->chunk_array_allocated_size = new_size;
+ }
+
+ /* Allocate new chunk, if needed. */
+ if (la->chunks[chunk_number] == NULL)
+ la->chunks[chunk_number] = palloc0_object(pgca_local_advice_chunk);
+
+ /* Save pointer and bump next-id counter. */
+ Assert(la->chunks[chunk_number]->entries[chunk_offset] == NULL);
+ la->chunks[chunk_number]->entries[chunk_offset] = ca;
+ ++la->next_id;
+
+ /* If we've exceeded the storage limit, discard old data. */
+ trim_local_advice(pg_collect_advice_local_collection_limit);
+}
+
+/*
+ * Add a pgca_collected_advice object to the shared advice collection.
+ *
+ * 'ca_pointer' should have been allocated from the pg_collect_advice DSA area
+ * and should point to an object of type pgca_collected_advice.
+ */
+static void
+store_shared_advice(dsa_pointer ca_pointer)
+{
+ uint64 chunk_number;
+ uint64 chunk_offset;
+ pgca_shared_state *state = pg_collect_advice_attach();
+ dsa_area *area = pg_collect_advice_dsa_area();
+ pgca_shared_advice *sa = shared_collector;
+ dsa_pointer *chunk_array;
+ pgca_shared_advice_chunk *chunk;
+
+ /* Lock the shared state. */
+ LWLockAcquire(&state->lock, LW_EXCLUSIVE);
+
+ /*
+ * If we're not attached to the shared advice collector yet, fix that now.
+ * If we're the first ones to attach, we may need to create the object.
+ */
+ if (sa == NULL)
+ {
+ if (state->shared_collector == InvalidDsaPointer)
+ state->shared_collector =
+ dsa_allocate0(area, sizeof(pgca_shared_advice));
+ shared_collector = sa = dsa_get_address(area, state->shared_collector);
+ }
+
+ /*
+ * It's possible that some other backend may have succeeded in creating
+ * the main collector object but failed to allocate an initial chunk
+ * array, so we must be prepared to allocate the chunk array here whether
+ * or not we created the collector object.
+ */
+ if (shared_collector->chunk_array_allocated_size == 0)
+ {
+ sa->chunks =
+ dsa_allocate0(area,
+ sizeof(dsa_pointer) * ADVICE_CHUNK_ARRAY_SIZE);
+ sa->chunk_array_allocated_size = ADVICE_CHUNK_ARRAY_SIZE;
+ }
+
+ /* Compute chunk and offset at which to store this advice. */
+ chunk_number = (sa->next_id - sa->base_id) / ADVICE_CHUNK_SIZE;
+ chunk_offset = (sa->next_id - sa->base_id) % ADVICE_CHUNK_SIZE;
+
+ /* Get the address of the chunk array and, if needed, extend it. */
+ if (chunk_number >= sa->chunk_array_allocated_size)
+ {
+ int new_size;
+ dsa_pointer new_chunks;
+
+ /*
+ * DSA can't enlarge an existing allocation, so we must make a new
+ * allocation and copy data over.
+ */
+ new_size = sa->chunk_array_allocated_size + ADVICE_CHUNK_ARRAY_SIZE;
+ new_chunks = dsa_allocate0(area, sizeof(dsa_pointer) * new_size);
+ chunk_array = dsa_get_address(area, new_chunks);
+ memcpy(chunk_array, dsa_get_address(area, sa->chunks),
+ sizeof(dsa_pointer) * sa->chunk_array_allocated_size);
+ dsa_free(area, sa->chunks);
+ sa->chunks = new_chunks;
+ sa->chunk_array_allocated_size = new_size;
+ }
+ else
+ chunk_array = dsa_get_address(area, sa->chunks);
+
+ /* Get the address of the desired chunk, allocating it if needed. */
+ if (chunk_array[chunk_number] == InvalidDsaPointer)
+ chunk_array[chunk_number] =
+ dsa_allocate0(area, sizeof(pgca_shared_advice_chunk));
+ chunk = dsa_get_address(area, chunk_array[chunk_number]);
+
+ /* Save pointer and bump next-id counter. */
+ Assert(chunk->entries[chunk_offset] == InvalidDsaPointer);
+ chunk->entries[chunk_offset] = ca_pointer;
+ ++sa->next_id;
+
+ /* If we've exceeded the storage limit, discard old data. */
+ trim_shared_advice(area, pg_collect_advice_shared_collection_limit);
+
+ /* Release lock on shared state. */
+ LWLockRelease(&state->lock);
+}
+
+/*
+ * Discard collected advice stored in backend-local memory in excess of the
+ * specified limit.
+ */
+static void
+trim_local_advice(int limit)
+{
+ pgca_local_advice *la = local_collector;
+ uint64 current_count;
+ uint64 trim_count;
+ uint64 total_chunk_count;
+ uint64 trim_chunk_count;
+ uint64 remaining_chunk_count;
+
+ /* If we haven't yet reached the limit, there's nothing to do. */
+ current_count = la->next_id - la->oldest_id;
+ if (current_count <= limit)
+ return;
+
+ /* Free enough entries to get us back down to the limit. */
+ trim_count = current_count - limit;
+ while (trim_count > 0)
+ {
+ uint64 chunk_number;
+ uint64 chunk_offset;
+
+ chunk_number = (la->oldest_id - la->base_id) / ADVICE_CHUNK_SIZE;
+ chunk_offset = (la->oldest_id - la->base_id) % ADVICE_CHUNK_SIZE;
+
+ Assert(la->chunks[chunk_number]->entries[chunk_offset] != NULL);
+ pfree(la->chunks[chunk_number]->entries[chunk_offset]);
+ la->chunks[chunk_number]->entries[chunk_offset] = NULL;
+ ++la->oldest_id;
+ --trim_count;
+ }
+
+ /* Free any chunks that are now entirely unused. */
+ trim_chunk_count = (la->oldest_id - la->base_id) / ADVICE_CHUNK_SIZE;
+ for (uint64 n = 0; n < trim_chunk_count; ++n)
+ pfree(la->chunks[n]);
+
+ /* Slide remaining chunk pointers back toward the base of the array. */
+ total_chunk_count = (la->next_id - la->base_id +
+ ADVICE_CHUNK_SIZE - 1) / ADVICE_CHUNK_SIZE;
+ remaining_chunk_count = total_chunk_count - trim_chunk_count;
+ if (remaining_chunk_count > 0)
+ memmove(&la->chunks[0], &la->chunks[trim_chunk_count],
+ sizeof(pgca_local_advice_chunk *) * remaining_chunk_count);
+
+ /* Don't leave stale pointers around. */
+ memset(&la->chunks[remaining_chunk_count], 0,
+ sizeof(pgca_local_advice_chunk *)
+ * (total_chunk_count - remaining_chunk_count));
+
+ /* Adjust base ID value accordingly. */
+ la->base_id += trim_chunk_count * ADVICE_CHUNK_SIZE;
+}
+
+/*
+ * Discard collected advice stored in shared memory in excess of the
+ * specified limit.
+ */
+static void
+trim_shared_advice(dsa_area *area, int limit)
+{
+ pgca_shared_advice *sa = shared_collector;
+ uint64 current_count;
+ uint64 trim_count;
+ uint64 total_chunk_count;
+ uint64 trim_chunk_count;
+ uint64 remaining_chunk_count;
+ dsa_pointer *chunk_array;
+
+ /* If we haven't yet reached the limit, there's nothing to do. */
+ current_count = sa->next_id - sa->oldest_id;
+ if (current_count <= limit)
+ return;
+
+ /* Get a pointer to the chunk array. */
+ chunk_array = dsa_get_address(area, sa->chunks);
+
+ /* Free enough entries to get us back down to the limit. */
+ trim_count = current_count - limit;
+ while (trim_count > 0)
+ {
+ uint64 chunk_number;
+ uint64 chunk_offset;
+ pgca_shared_advice_chunk *chunk;
+
+ chunk_number = (sa->oldest_id - sa->base_id) / ADVICE_CHUNK_SIZE;
+ chunk_offset = (sa->oldest_id - sa->base_id) % ADVICE_CHUNK_SIZE;
+
+ chunk = dsa_get_address(area, chunk_array[chunk_number]);
+ Assert(chunk->entries[chunk_offset] != InvalidDsaPointer);
+ dsa_free(area, chunk->entries[chunk_offset]);
+ chunk->entries[chunk_offset] = InvalidDsaPointer;
+ ++sa->oldest_id;
+ --trim_count;
+ }
+
+ /* Free any chunks that are now entirely unused. */
+ trim_chunk_count = (sa->oldest_id - sa->base_id) / ADVICE_CHUNK_SIZE;
+ for (uint64 n = 0; n < trim_chunk_count; ++n)
+ dsa_free(area, chunk_array[n]);
+
+ /* Slide remaining chunk pointers back toward the base of the array. */
+ total_chunk_count = (sa->next_id - sa->base_id +
+ ADVICE_CHUNK_SIZE - 1) / ADVICE_CHUNK_SIZE;
+ remaining_chunk_count = total_chunk_count - trim_chunk_count;
+ if (remaining_chunk_count > 0)
+ memmove(&chunk_array[0], &chunk_array[trim_chunk_count],
+ sizeof(dsa_pointer) * remaining_chunk_count);
+
+ /* Don't leave stale pointers around. */
+ memset(&chunk_array[remaining_chunk_count], 0,
+ sizeof(dsa_pointer) * (total_chunk_count - remaining_chunk_count));
+
+ /* Adjust base ID value accordingly. */
+ sa->base_id += trim_chunk_count * ADVICE_CHUNK_SIZE;
+}
+
+/*
+ * SQL-callable function to discard advice collected in backend-local memory
+ */
+Datum
+pg_clear_collected_local_advice(PG_FUNCTION_ARGS)
+{
+ if (local_collector != NULL)
+ trim_local_advice(0);
+
+ PG_RETURN_VOID();
+}
+
+/*
+ * SQL-callable function to discard advice collected in shared memory
+ */
+Datum
+pg_clear_collected_shared_advice(PG_FUNCTION_ARGS)
+{
+ pgca_shared_state *state = pg_collect_advice_attach();
+ dsa_area *area = pg_collect_advice_dsa_area();
+
+ LWLockAcquire(&state->lock, LW_EXCLUSIVE);
+
+ /*
+ * If we're not attached to the shared advice collector yet, fix that now;
+ * but if the collector doesn't even exist, we can return without doing
+ * anything else.
+ */
+ if (shared_collector == NULL)
+ {
+ if (state->shared_collector == InvalidDsaPointer)
+ {
+ LWLockRelease(&state->lock);
+ return (Datum) 0;
+ }
+ shared_collector = dsa_get_address(area, state->shared_collector);
+ }
+
+ /* Do the real work */
+ trim_shared_advice(area, 0);
+
+ LWLockRelease(&state->lock);
+
+ PG_RETURN_VOID();
+}
+
+/*
+ * SQL-callable SRF to return advice collected in backend-local memory
+ */
+Datum
+pg_get_collected_local_advice(PG_FUNCTION_ARGS)
+{
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ pgca_local_advice *la = local_collector;
+ Oid userid = GetUserId();
+
+ InitMaterializedSRF(fcinfo, 0);
+
+ if (la == NULL)
+ return (Datum) 0;
+
+ /* Loop over all entries. */
+ for (uint64 id = la->oldest_id; id < la->next_id; ++id)
+ {
+ uint64 chunk_number;
+ uint64 chunk_offset;
+ pgca_collected_advice *ca;
+ Datum values[PG_GET_ADVICE_COLUMNS];
+ bool nulls[PG_GET_ADVICE_COLUMNS] = {0};
+
+ chunk_number = (id - la->base_id) / ADVICE_CHUNK_SIZE;
+ chunk_offset = (id - la->base_id) % ADVICE_CHUNK_SIZE;
+
+ ca = la->chunks[chunk_number]->entries[chunk_offset];
+
+ if (!member_can_set_role(userid, ca->userid))
+ continue;
+
+ values[0] = UInt64GetDatum(id);
+ values[1] = ObjectIdGetDatum(ca->userid);
+ values[2] = ObjectIdGetDatum(ca->dbid);
+ values[3] = UInt64GetDatum(ca->queryid);
+ values[4] = TimestampTzGetDatum(ca->timestamp);
+ values[5] = CStringGetTextDatum(query_string(ca));
+ values[6] = CStringGetTextDatum(advice_string(ca));
+
+ tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
+ values, nulls);
+ }
+
+ return (Datum) 0;
+}
+
+/*
+ * SQL-callable SRF to return advice collected in shared memory
+ */
+Datum
+pg_get_collected_shared_advice(PG_FUNCTION_ARGS)
+{
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ pgca_shared_state *state = pg_collect_advice_attach();
+ dsa_area *area = pg_collect_advice_dsa_area();
+ dsa_pointer *chunk_array;
+ pgca_shared_advice *sa = shared_collector;
+ Oid userid = GetUserId();
+
+ InitMaterializedSRF(fcinfo, 0);
+
+ /* Lock the shared state. */
+ LWLockAcquire(&state->lock, LW_SHARED);
+
+ /*
+ * If we're not attached to the shared advice collector yet, fix that now;
+ * but if the collector doesn't even exist, we can return without doing
+ * anything else.
+ */
+ if (sa == NULL)
+ {
+ if (state->shared_collector == InvalidDsaPointer)
+ {
+ LWLockRelease(&state->lock);
+ return (Datum) 0;
+ }
+ shared_collector = sa = dsa_get_address(area, state->shared_collector);
+ }
+
+ /* If there's no chunk array yet, there's nothing to do. */
+ if (sa->chunks == InvalidDsaPointer)
+ {
+ LWLockRelease(&state->lock);
+ return (Datum) 0;
+ }
+
+ /* Get a pointer to the chunk array. */
+ chunk_array = dsa_get_address(area, sa->chunks);
+
+ /* Loop over all entries. */
+ for (uint64 id = sa->oldest_id; id < sa->next_id; ++id)
+ {
+ uint64 chunk_number;
+ uint64 chunk_offset;
+ pgca_shared_advice_chunk *chunk;
+ pgca_collected_advice *ca;
+ Datum values[PG_GET_ADVICE_COLUMNS];
+ bool nulls[PG_GET_ADVICE_COLUMNS] = {0};
+
+ chunk_number = (id - sa->base_id) / ADVICE_CHUNK_SIZE;
+ chunk_offset = (id - sa->base_id) % ADVICE_CHUNK_SIZE;
+
+ chunk = dsa_get_address(area, chunk_array[chunk_number]);
+ ca = dsa_get_address(area, chunk->entries[chunk_offset]);
+
+ if (!member_can_set_role(userid, ca->userid))
+ continue;
+
+ values[0] = UInt64GetDatum(id);
+ values[1] = ObjectIdGetDatum(ca->userid);
+ values[2] = ObjectIdGetDatum(ca->dbid);
+ values[3] = UInt64GetDatum(ca->queryid);
+ values[4] = TimestampTzGetDatum(ca->timestamp);
+ values[5] = CStringGetTextDatum(query_string(ca));
+ values[6] = CStringGetTextDatum(advice_string(ca));
+
+ tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
+ values, nulls);
+ }
+
+ /* Release lock on shared state. */
+ LWLockRelease(&state->lock);
+
+ return (Datum) 0;
+}
diff --git a/contrib/pg_collect_advice/expected/local_collector.out b/contrib/pg_collect_advice/expected/local_collector.out
new file mode 100644
index 00000000000..f57b96ee835
--- /dev/null
+++ b/contrib/pg_collect_advice/expected/local_collector.out
@@ -0,0 +1,69 @@
+CREATE EXTENSION pg_collect_advice;
+SET debug_parallel_query = off;
+-- Try clearing advice before we've collected any.
+SELECT pg_clear_collected_local_advice();
+ pg_clear_collected_local_advice
+---------------------------------
+
+(1 row)
+
+-- Set a small advice collection limit so that we'll exceed it.
+SET pg_collect_advice.local_collection_limit = 2;
+-- Enable the collector.
+SET pg_collect_advice.local_collector = on;
+-- Set up a dummy table.
+CREATE TABLE dummy_table (a int primary key, b text)
+ WITH (autovacuum_enabled = false, parallel_workers = 0);
+-- Test queries.
+SELECT * FROM dummy_table a, dummy_table b;
+ a | b | a | b
+---+---+---+---
+(0 rows)
+
+SELECT * FROM dummy_table;
+ a | b
+---+---
+(0 rows)
+
+-- Should return the advice from the second test query.
+SET pg_collect_advice.local_collector = off;
+SELECT advice FROM pg_get_collected_local_advice() ORDER BY id DESC LIMIT 1;
+ advice
+------------------------
+ SEQ_SCAN(dummy_table) +
+ NO_GATHER(dummy_table)
+(1 row)
+
+-- Now try clearing advice again.
+SELECT pg_clear_collected_local_advice();
+ pg_clear_collected_local_advice
+---------------------------------
+
+(1 row)
+
+-- Raise the collection limit so that the collector uses multiple chunks.
+SET pg_collect_advice.local_collection_limit = 2000;
+SET pg_collect_advice.local_collector = on;
+-- Push a bunch of queries through the collector.
+DO $$
+BEGIN
+ FOR x IN 1..2000 LOOP
+ EXECUTE 'SELECT * FROM dummy_table';
+ END LOOP;
+END
+$$;
+-- Check that the collector worked.
+SELECT COUNT(*) FROM pg_get_collected_local_advice();
+ count
+-------
+ 2000
+(1 row)
+
+-- And clear one more time, to verify that this doesn't cause a problem
+-- even with a larger number of entries.
+SELECT pg_clear_collected_local_advice();
+ pg_clear_collected_local_advice
+---------------------------------
+
+(1 row)
+
diff --git a/contrib/pg_collect_advice/interface.c b/contrib/pg_collect_advice/interface.c
new file mode 100644
index 00000000000..feb11974152
--- /dev/null
+++ b/contrib/pg_collect_advice/interface.c
@@ -0,0 +1,303 @@
+/*-------------------------------------------------------------------------
+ *
+ * interface.c
+ * interface routines for the plan advice collector
+ *
+ * Copyright (c) 2016-2026, PostgreSQL Global Development Group
+ *
+ * contrib/pg_collect_advice/interface.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "pg_collect_advice.h"
+
+#include "funcapi.h"
+#include "optimizer/planner.h"
+#include "storage/dsm_registry.h"
+#include "utils/guc.h"
+
+PG_MODULE_MAGIC;
+
+/* Shared memory pointers */
+static pgca_shared_state *pgca_state = NULL;
+static dsa_area *pgca_dsa_area = NULL;
+
+/* GUC variables */
+bool pg_collect_advice_local_collector = false;
+int pg_collect_advice_local_collection_limit = 0;
+bool pg_collect_advice_shared_collector = false;
+int pg_collect_advice_shared_collection_limit = 0;
+
+/* Shadow variables for GUC assign hooks */
+static bool pg_collect_advice_local_collector_as_assigned = false;
+static bool pg_collect_advice_shared_collector_as_assigned = false;
+
+/* Other file-level globals */
+static void (*request_advice_generation_fn) (bool activate) = NULL;
+static planner_shutdown_hook_type prev_planner_shutdown = NULL;
+static MemoryContext pgca_memory_context = NULL;
+
+/* Function prototypes */
+static void pgca_init_shared_state(void *ptr, void *arg);
+static void pgca_planner_shutdown(PlannerGlobal *glob, Query *parse,
+ const char *query_string,
+ PlannedStmt *pstmt);
+static void pg_collect_advice_local_collector_assign_hook(bool newval,
+ void *extra);
+static void pg_collect_advice_shared_collector_assign_hook(bool newval,
+ void *extra);
+static DefElem *find_defelem_by_defname(List *deflist, char *defname);
+
+/*
+ * Initialize this module.
+ */
+void
+_PG_init(void)
+{
+ /*
+ * Get a pointer so we can call pg_plan_advice_request_advice_generation.
+ *
+ * We need to do this before defining custom GUCs; otherwise, our assign
+ * hook will try to use this function pointer before it's initialized.
+ *
+ * We also need to do this before installing our own hooks, so that if
+ * pg_plan_advice is not yet loaded, it will install its hooks before we
+ * install ours. (See comments in pgca_planner_shutdown.)
+ */
+ request_advice_generation_fn =
+ load_external_function("pg_plan_advice",
+ "pg_plan_advice_request_advice_generation",
+ true, NULL);
+
+ /* Define our GUCs. */
+ DefineCustomBoolVariable("pg_collect_advice.local_collector",
+ "Enable the local advice collector.",
+ NULL,
+ &pg_collect_advice_local_collector,
+ false,
+ PGC_USERSET,
+ 0,
+ NULL,
+ pg_collect_advice_local_collector_assign_hook,
+ NULL);
+
+ DefineCustomIntVariable("pg_collect_advice.local_collection_limit",
+ "# of advice entries to retain in per-backend memory",
+ NULL,
+ &pg_collect_advice_local_collection_limit,
+ 0,
+ 0, INT_MAX,
+ PGC_USERSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
+
+ DefineCustomBoolVariable("pg_collect_advice.shared_collector",
+ "Enable the shared advice collector.",
+ NULL,
+ &pg_collect_advice_shared_collector,
+ false,
+ PGC_SUSET,
+ 0,
+ NULL,
+ pg_collect_advice_shared_collector_assign_hook,
+ NULL);
+
+ DefineCustomIntVariable("pg_collect_advice.shared_collection_limit",
+ "# of advice entries to retain in shared memory",
+ NULL,
+ &pg_collect_advice_shared_collection_limit,
+ 0,
+ 0, INT_MAX,
+ PGC_SUSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
+
+ MarkGUCPrefixReserved("pg_collect_advice");
+
+ /* Install hooks */
+ prev_planner_shutdown = planner_shutdown_hook;
+ planner_shutdown_hook = pgca_planner_shutdown;
+}
+
+/*
+ * Initialize shared state when first created.
+ */
+static void
+pgca_init_shared_state(void *ptr, void *arg)
+{
+ pgca_shared_state *state = (pgca_shared_state *) ptr;
+
+ LWLockInitialize(&state->lock,
+ LWLockNewTrancheId("pg_collect_advice_lock"));
+ state->dsa_tranche = LWLockNewTrancheId("pg_collect_advice_dsa");
+ state->area = DSA_HANDLE_INVALID;
+ state->shared_collector = InvalidDsaPointer;
+}
+
+/*
+ * Return a pointer to a memory context where long-lived data managed by this
+ * module can be stored.
+ */
+MemoryContext
+pg_collect_advice_get_mcxt(void)
+{
+ if (pgca_memory_context == NULL)
+ pgca_memory_context = AllocSetContextCreate(TopMemoryContext,
+ "pg_collect_advice",
+ ALLOCSET_DEFAULT_SIZES);
+
+ return pgca_memory_context;
+}
+
+/*
+ * Get a pointer to our shared state.
+ *
+ * If no shared state exists, create and initialize it. If it does exist but
+ * this backend has not yet accessed it, attach to it. Otherwise, just return
+ * our cached pointer.
+ */
+pgca_shared_state *
+pg_collect_advice_attach(void)
+{
+ if (pgca_state == NULL)
+ {
+ bool found;
+
+ pgca_state =
+ GetNamedDSMSegment("pg_collect_advice", sizeof(pgca_shared_state),
+ pgca_init_shared_state, &found, NULL);
+ }
+
+ return pgca_state;
+}
+
+/*
+ * Return a pointer to pg_collect_advice's DSA area, creating it if needed.
+ */
+dsa_area *
+pg_collect_advice_dsa_area(void)
+{
+ if (pgca_dsa_area == NULL)
+ {
+ pgca_shared_state *state = pg_collect_advice_attach();
+ dsa_handle area_handle;
+ MemoryContext oldcontext;
+
+ oldcontext = MemoryContextSwitchTo(pg_collect_advice_get_mcxt());
+
+ LWLockAcquire(&state->lock, LW_EXCLUSIVE);
+ area_handle = state->area;
+ if (area_handle == DSA_HANDLE_INVALID)
+ {
+ pgca_dsa_area = dsa_create(state->dsa_tranche);
+ dsa_pin(pgca_dsa_area);
+ state->area = dsa_get_handle(pgca_dsa_area);
+ LWLockRelease(&state->lock);
+ }
+ else
+ {
+ LWLockRelease(&state->lock);
+ pgca_dsa_area = dsa_attach(area_handle);
+ }
+
+ dsa_pin_mapping(pgca_dsa_area);
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ return pgca_dsa_area;
+}
+
+/*
+ * After planning is complete, retrieve the advice string, if present, and
+ * pass it through to the collector.
+ */
+static void
+pgca_planner_shutdown(PlannerGlobal *glob, Query *parse,
+ const char *query_string, PlannedStmt *pstmt)
+{
+ DefElem *pgpa_item;
+ DefElem *advice_string_item;
+ char *advice_string;
+
+ /*
+ * Pass call to previous hook.
+ *
+ * We want to be called after pg_plan_advice's shutdown hook has already
+ * executed. Our _PG_init() makes sure that pg_plan_advice's hooks are
+ * always loaded before ours, and here we pass the hook call down first,
+ * before doing our own work. The combination of those two things should
+ * be good enough to ensure that the advice string is already present when
+ * we go looking for it.
+ */
+ if (prev_planner_shutdown)
+ (*prev_planner_shutdown) (glob, parse, query_string, pstmt);
+
+ /* Fish out the advice string. If not found, do nothing. */
+ pgpa_item = find_defelem_by_defname(pstmt->extension_state,
+ "pg_plan_advice");
+ if (pgpa_item == NULL)
+ return;
+ advice_string_item = find_defelem_by_defname((List *) pgpa_item->arg,
+ "advice_string");
+ if (advice_string_item == NULL)
+ return;
+ advice_string = strVal(advice_string_item->arg);
+
+ /*
+ * Pass it through to the actual collector. But, if it's the empty string,
+ * we assume that collecting it is uninteresting.
+ */
+ if (advice_string[0] != '\0')
+ pg_collect_advice_save(pstmt->queryId, query_string, advice_string);
+}
+
+/*
+ * pgca_planner_shutdown won't find any advice to collect unless we've
+ * requested that it be generated. So, whenever the effective value of
+ * pg_collect_advice.local_collector changes, either make or
+ * revoke a request for advice generation.
+ */
+static void
+pg_collect_advice_local_collector_assign_hook(bool newval, void *extra)
+{
+ if (pg_collect_advice_local_collector_as_assigned && !newval)
+ (*request_advice_generation_fn) (false);
+ if (!pg_collect_advice_local_collector_as_assigned && newval)
+ (*request_advice_generation_fn) (true);
+ pg_collect_advice_local_collector_as_assigned = newval;
+}
+
+/*
+ * Same as above, but for pg_collect_advice.shared_collector
+ */
+static void
+pg_collect_advice_shared_collector_assign_hook(bool newval, void *extra)
+{
+ if (pg_collect_advice_shared_collector_as_assigned && !newval)
+ (*request_advice_generation_fn) (false);
+ if (!pg_collect_advice_shared_collector_as_assigned && newval)
+ (*request_advice_generation_fn) (true);
+ pg_collect_advice_shared_collector_as_assigned = newval;
+}
+
+/*
+ * Search a list of DefElem objects for a given defname.
+ */
+static DefElem *
+find_defelem_by_defname(List *deflist, char *defname)
+{
+ foreach_node(DefElem, item, deflist)
+ {
+ if (strcmp(item->defname, defname) == 0)
+ return item;
+ }
+
+ return NULL;
+}
diff --git a/contrib/pg_collect_advice/meson.build b/contrib/pg_collect_advice/meson.build
new file mode 100644
index 00000000000..102dc65d260
--- /dev/null
+++ b/contrib/pg_collect_advice/meson.build
@@ -0,0 +1,41 @@
+# Copyright (c) 2022-2026, PostgreSQL Global Development Group
+
+pg_collect_advice_sources = files(
+ 'collector.c',
+ 'interface.c',
+)
+
+if host_system == 'windows'
+ pg_collect_advice_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'pg_collect_advice',
+ '--FILEDESC', 'pg_collect_advice - collect queries and their plan advice strings',])
+endif
+
+pg_collect_advice = shared_module('pg_collect_advice',
+ pg_collect_advice_sources,
+ include_directories: include_directories('.'),
+ kwargs: contrib_mod_args,
+)
+contrib_targets += pg_collect_advice
+
+install_data(
+ 'pg_collect_advice--1.0.sql',
+ 'pg_collect_advice.control',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'pg_collect_advice',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'local_collector',
+ ],
+ },
+ 'tap': {
+ 'tests': [
+ 't/001_shared_collector.pl',
+ ],
+ },
+}
diff --git a/contrib/pg_collect_advice/pg_collect_advice--1.0.sql b/contrib/pg_collect_advice/pg_collect_advice--1.0.sql
new file mode 100644
index 00000000000..0be86c54fc1
--- /dev/null
+++ b/contrib/pg_collect_advice/pg_collect_advice--1.0.sql
@@ -0,0 +1,43 @@
+/* contrib/pg_collect_advice/pg_collect_advice--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_collect_advice" to load this file. \quit
+
+CREATE FUNCTION pg_clear_collected_local_advice()
+RETURNS void
+AS 'MODULE_PATHNAME', 'pg_clear_collected_local_advice'
+LANGUAGE C STRICT;
+
+CREATE FUNCTION pg_clear_collected_shared_advice()
+RETURNS void
+AS 'MODULE_PATHNAME', 'pg_clear_collected_shared_advice'
+LANGUAGE C STRICT;
+
+CREATE FUNCTION pg_get_collected_local_advice(
+ OUT id bigint,
+ OUT userid oid,
+ OUT dbid oid,
+ OUT queryid bigint,
+ OUT collection_time timestamptz,
+ OUT query text,
+ OUT advice text
+)
+RETURNS SETOF record
+AS 'MODULE_PATHNAME', 'pg_get_collected_local_advice'
+LANGUAGE C STRICT;
+
+CREATE FUNCTION pg_get_collected_shared_advice(
+ OUT id bigint,
+ OUT userid oid,
+ OUT dbid oid,
+ OUT queryid bigint,
+ OUT collection_time timestamptz,
+ OUT query text,
+ OUT advice text
+)
+RETURNS SETOF record
+AS 'MODULE_PATHNAME', 'pg_get_collected_shared_advice'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION pg_clear_collected_shared_advice() FROM PUBLIC;
+REVOKE ALL ON FUNCTION pg_get_collected_shared_advice() FROM PUBLIC;
diff --git a/contrib/pg_collect_advice/pg_collect_advice.control b/contrib/pg_collect_advice/pg_collect_advice.control
new file mode 100644
index 00000000000..601e5e24ea1
--- /dev/null
+++ b/contrib/pg_collect_advice/pg_collect_advice.control
@@ -0,0 +1,5 @@
+# pg_collect_advice extension
+comment = 'collect queries and the associated plan advice'
+default_version = '1.0'
+module_pathname = '$libdir/pg_collect_advice'
+relocatable = true
diff --git a/contrib/pg_collect_advice/pg_collect_advice.h b/contrib/pg_collect_advice/pg_collect_advice.h
new file mode 100644
index 00000000000..480c2c633c4
--- /dev/null
+++ b/contrib/pg_collect_advice/pg_collect_advice.h
@@ -0,0 +1,39 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_collect_advice.h
+ * definitions and declarations for pg_collect_advice module
+ *
+ * Copyright (c) 2016-2026, PostgreSQL Global Development Group
+ *
+ * contrib/pg_collect_advice/pg_collect_advice.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_COLLECT_ADVICE_H
+#define PG_COLLECT_ADVICE_H
+
+#include "storage/lwlock.h"
+#include "utils/dsa.h"
+
+typedef struct pgca_shared_state
+{
+ LWLock lock;
+ int dsa_tranche;
+ dsa_handle area;
+ dsa_pointer shared_collector;
+} pgca_shared_state;
+
+/* GUC variables */
+extern bool pg_collect_advice_local_collector;
+extern int pg_collect_advice_local_collection_limit;
+extern bool pg_collect_advice_shared_collector;
+extern int pg_collect_advice_shared_collection_limit;
+
+/* Function prototypes */
+extern MemoryContext pg_collect_advice_get_mcxt(void);
+extern pgca_shared_state *pg_collect_advice_attach(void);
+extern dsa_area *pg_collect_advice_dsa_area(void);
+extern void pg_collect_advice_save(uint64 queryId, const char *query_string,
+ const char *advice_string);
+
+#endif
diff --git a/contrib/pg_collect_advice/sql/local_collector.sql b/contrib/pg_collect_advice/sql/local_collector.sql
new file mode 100644
index 00000000000..41b187c5375
--- /dev/null
+++ b/contrib/pg_collect_advice/sql/local_collector.sql
@@ -0,0 +1,46 @@
+CREATE EXTENSION pg_collect_advice;
+SET debug_parallel_query = off;
+
+-- Try clearing advice before we've collected any.
+SELECT pg_clear_collected_local_advice();
+
+-- Set a small advice collection limit so that we'll exceed it.
+SET pg_collect_advice.local_collection_limit = 2;
+
+-- Enable the collector.
+SET pg_collect_advice.local_collector = on;
+
+-- Set up a dummy table.
+CREATE TABLE dummy_table (a int primary key, b text)
+ WITH (autovacuum_enabled = false, parallel_workers = 0);
+
+-- Test queries.
+SELECT * FROM dummy_table a, dummy_table b;
+SELECT * FROM dummy_table;
+
+-- Should return the advice from the second test query.
+SET pg_collect_advice.local_collector = off;
+SELECT advice FROM pg_get_collected_local_advice() ORDER BY id DESC LIMIT 1;
+
+-- Now try clearing advice again.
+SELECT pg_clear_collected_local_advice();
+
+-- Raise the collection limit so that the collector uses multiple chunks.
+SET pg_collect_advice.local_collection_limit = 2000;
+SET pg_collect_advice.local_collector = on;
+
+-- Push a bunch of queries through the collector.
+DO $$
+BEGIN
+ FOR x IN 1..2000 LOOP
+ EXECUTE 'SELECT * FROM dummy_table';
+ END LOOP;
+END
+$$;
+
+-- Check that the collector worked.
+SELECT COUNT(*) FROM pg_get_collected_local_advice();
+
+-- And clear one more time, to verify that this doesn't cause a problem
+-- even with a larger number of entries.
+SELECT pg_clear_collected_local_advice();
diff --git a/contrib/pg_collect_advice/t/001_shared_collector.pl b/contrib/pg_collect_advice/t/001_shared_collector.pl
new file mode 100644
index 00000000000..bba0c883e5a
--- /dev/null
+++ b/contrib/pg_collect_advice/t/001_shared_collector.pl
@@ -0,0 +1,154 @@
+# Copyright (c) 2021-2026, PostgreSQL Global Development Group
+
+# Test the shared advice collector.
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Helper function, to avoid depending on exact line-break behavior.
+sub smash_whitespace
+{
+ my $s = shift;
+ $s =~ s/^\s+//;
+ $s =~ s/\s+$//;
+ $s =~ s/\s+/ /g;
+ return $s;
+}
+
+# Retrieve all collected shared advice as an array of whitespace-normalized
+# strings, ordered by id.
+sub get_collected_shared_advice
+{
+ my $psql = shift;
+ my $output = $psql->query_safe(
+ "SELECT string_agg(advice, '!SEPARATOR!' ORDER BY id) "
+ . "FROM pg_get_collected_shared_advice()");
+ return () if $output eq '';
+ return map { smash_whitespace($_) } split(/!SEPARATOR!/, $output);
+}
+
+# Initialize the primary node
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init();
+
+# Load pg_collect_advice and configure a shared collection limit of 5.
+$node->append_conf('postgresql.conf', <<EOM);
+shared_preload_libraries=pg_collect_advice
+pg_collect_advice.shared_collection_limit=5
+EOM
+$node->start;
+
+# Create the extension so we can access the collector
+my $test_db = 'collection_test';
+my $test_role = 'collection_role';
+$node->safe_psql('postgres', <<EOM);
+CREATE DATABASE $test_db;
+CREATE USER $test_role;
+ALTER ROLE $test_role SET pg_collect_advice.shared_collector = on;
+EOM
+$node->safe_psql($test_db, 'CREATE EXTENSION pg_collect_advice');
+
+# Set up two connections, one to control the testing process, and the other
+# to execute the queries under test.
+my $psql_control = $node->background_psql($test_db, on_error_stop => 1);
+my $psql_test =
+ $node->background_psql($test_db, on_error_stop => 1,
+ extra_params => [ '--username' => $test_role ]);
+
+# Initial setup.
+$psql_control->query_safe(<<EOM);
+GRANT CREATE ON SCHEMA public TO $test_role;
+GRANT SET ON PARAMETER pg_collect_advice.shared_collection_limit TO $test_role;
+SET ROLE $test_role;
+CREATE TABLE sac_dim (id serial primary key, dim text)
+ WITH (autovacuum_enabled = false);
+INSERT INTO sac_dim (dim) SELECT random()::text FROM generate_series(1,100) g;
+VACUUM ANALYZE sac_dim;
+
+CREATE TABLE sac_fact (
+ id int primary key,
+ dim_id integer not null references sac_dim (id)
+) WITH (autovacuum_enabled = false);
+INSERT INTO sac_fact
+SELECT g, (g%3)+1 FROM generate_series(1,100000) g;
+CREATE INDEX sac_fact_dim_id ON sac_fact (dim_id);
+VACUUM ANALYZE sac_fact;
+RESET ROLE;
+EOM
+
+# Run a few test queries.
+$psql_test->query_safe(<<'EOM');
+SELECT * FROM sac_fact WHERE id = 42;
+SELECT * FROM sac_dim d JOIN sac_fact f ON d.id = f.dim_id;
+SELECT * FROM sac_dim d
+ WHERE d.id IN (SELECT f.dim_id FROM sac_fact f);
+EOM
+
+# Check that we got three advice collections, and the right values for each.
+my @advice = get_collected_shared_advice($psql_control);
+is(scalar @advice, 3, "three advice entries collected");
+is($advice[0], 'INDEX_SCAN(sac_fact public.sac_fact_pkey) NO_GATHER(sac_fact)',
+ "correct advice for query 1");
+is($advice[1], 'JOIN_ORDER(f d) HASH_JOIN(d) SEQ_SCAN(f d) NO_GATHER(d f)',
+ "correct advice for query 2");
+is($advice[2], 'JOIN_ORDER(d f) NESTED_LOOP_PLAIN(f) SEQ_SCAN(d) INDEX_ONLY_SCAN(f public.sac_fact_dim_id) SEMIJOIN_NON_UNIQUE(f) NO_GATHER(d f)',
+ "correct advice for query 3");
+
+# Run a few more test queries, overrunning the limit. (SET and PREPARE don't
+# trigger planning, but EXECUTE does.)
+$psql_test->query_safe(<<'EOM');
+BEGIN;
+SET LOCAL min_parallel_table_scan_size = 0;
+SET LOCAL parallel_setup_cost = 0;
+SET LOCAL parallel_tuple_cost = 0;
+SELECT count(*) FROM sac_fact;
+COMMIT;
+EXPLAIN SELECT * FROM sac_dim;
+PREPARE test_stmt AS SELECT * FROM sac_fact WHERE id = $1;
+EXECUTE test_stmt(42);
+EOM
+
+# Check that advice collection was trimmed to the configured limit.
+@advice = get_collected_shared_advice($psql_control);
+is(scalar @advice, 5, "advice trimmed to collection limit");
+
+# Check the advice for queries 4, 5, and 6.
+is($advice[2], 'SEQ_SCAN(sac_fact) GATHER(sac_fact)',
+ "correct advice for query 4");
+is($advice[3], 'SEQ_SCAN(sac_dim) NO_GATHER(sac_dim)',
+ "correct advice for query 5");
+is($advice[4],
+ 'INDEX_SCAN(sac_fact public.sac_fact_pkey) NO_GATHER(sac_fact)',
+ "correct advice for query 6");
+
+# Raise the collection limit so that we can collect enough advice to need
+# multiple chunks, and then revert back to the old value, so that we try
+# to free an entire chunk.
+$psql_test->query_safe("SET pg_collect_advice.shared_collection_limit = 1500");
+$psql_test->query_safe(<<'EOM');
+DO $$
+BEGIN
+ FOR i IN 1..1500 LOOP
+ EXECUTE 'SELECT 1';
+ END LOOP;
+END $$;
+EOM
+@advice = get_collected_shared_advice($psql_control);
+is(scalar @advice, 1500, "increased collection limit reached");
+$psql_test->query_safe("RESET pg_collect_advice.shared_collection_limit");
+$psql_test->query_safe("SELECT * FROM sac_dim");
+@advice = get_collected_shared_advice($psql_control);
+is(scalar @advice, 5, "advice trimmed across chunk boundary");
+
+# Try clearing all the advice.
+$psql_control->query_safe("SELECT pg_clear_collected_shared_advice()");
+@advice = get_collected_shared_advice($psql_control);
+is(scalar @advice, 0, "all shared advice cleared");
+
+# Clean up.
+$psql_test->quit;
+$psql_control->quit;
+done_testing();
diff --git a/doc/src/sgml/contrib.sgml b/doc/src/sgml/contrib.sgml
index bdd4865f53f..2ab6fafbab1 100644
--- a/doc/src/sgml/contrib.sgml
+++ b/doc/src/sgml/contrib.sgml
@@ -152,6 +152,7 @@ CREATE EXTENSION <replaceable>extension_name</replaceable>;
&pageinspect;
&passwordcheck;
&pgbuffercache;
+ &pgcollectadvice;
&pgcrypto;
&pgfreespacemap;
&pglogicalinspect;
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index d90b4338d2a..407ff3abffe 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -145,6 +145,7 @@
<!ENTITY pageinspect SYSTEM "pageinspect.sgml">
<!ENTITY passwordcheck SYSTEM "passwordcheck.sgml">
<!ENTITY pgbuffercache SYSTEM "pgbuffercache.sgml">
+<!ENTITY pgcollectadvice SYSTEM "pgcollectadvice.sgml">
<!ENTITY pgcrypto SYSTEM "pgcrypto.sgml">
<!ENTITY pgfreespacemap SYSTEM "pgfreespacemap.sgml">
<!ENTITY pglogicalinspect SYSTEM "pglogicalinspect.sgml">
diff --git a/doc/src/sgml/pgcollectadvice.sgml b/doc/src/sgml/pgcollectadvice.sgml
new file mode 100644
index 00000000000..220aabe78c6
--- /dev/null
+++ b/doc/src/sgml/pgcollectadvice.sgml
@@ -0,0 +1,244 @@
+<!-- doc/src/sgml/pgcollectadvice.sgml -->
+
+<sect1 id="pgcollectadvice" xreflabel="pg_collect_advice">
+ <title>pg_collect_advice — collect queries and their plan advice strings</title>
+
+ <indexterm zone="pgcollectadvice">
+ <primary>pg_collect_advice</primary>
+ </indexterm>
+
+ <para>
+ The <filename>pg_collect_advice</filename> extension allows you to
+ automatically generate plan advice each time a query is planned and store
+ the query and the generated advice string either in local or shared memory.
+ Note that this extension requires the <xref linkend="pgplanadvice" /> module,
+ which performs the actual plan advice generation; this module only knows
+ how to store the generated advice for later examination. Whenever
+ <literal>pg_collect_advice</literal> is loaded, it will automatically load
+ <literal>pg_plan_advice</literal>.
+ </para>
+
+ <para>
+ In order to use this module, you will need to execute
+ <literal>CREATE EXTENSION pg_collect_advice</literal> in at least
+ one database, so that you have a way to examine the collected advice.
+ You will also need the <literal>pg_collect_advice</literal> module
+ to be loaded in all sessions where advice is to be collected. It will
+ usually be best to do this by adding <literal>pg_collect_advice</literal>
+ to <xref linkend="guc-shared-preload-libraries"/> and restarting the
+ server.
+ </para>
+
+ <para>
+ <literal>pg_collect_advice</literal> includes both a shared advice
+ collector and a local advice collector. The local advice collector makes
+ queries and their advice strings visible only to the session where those
+ queries were planned, while the shared advice collector collects data
+ on a system-wide basis, and authorized users can examine data from all
+ sessions.
+ </para>
+
+ <para>
+ To enable a collector, you must first set a collection limit. When the
+ number of queries for which advice has been stored exceeds the collection
+ limit, the oldest queries and the corresponding advice will be discarded.
+ Then, you must adjust a separate setting to actually enable advice
+ collection. For the local collector, set the collection limit by configuring
+ <literal>pg_collect_advice.local_collection_limit</literal> to a value
+ greater than zero, and then enable advice collection by setting
+ <literal>pg_collect_advice.local_collector = true</literal>. For the shared
+ collector, the procedure is the same, except that the names of the settings
+ are <literal>pg_collect_advice.shared_collection_limit</literal> and
+ <literal>pg_collect_advice.shared_collector</literal>. Note that in both
+ cases, query texts and advice strings are stored in memory, so
+ configuring large limits may result in considerable memory consumption.
+ </para>
+
+ <para>
+ Once the collector is enabled, you can run any queries for which you wish
+ to see the generated plan advice. Then, you can examine what has been
+ collected using whichever of
+ <literal>SELECT * FROM pg_get_collected_local_advice()</literal> or
+ <literal>SELECT * FROM pg_get_collected_shared_advice()</literal>
+ corresponds to the collector you enabled. To discard the collected advice
+ and release memory, you can call
+ <literal>pg_clear_collected_local_advice()</literal>
+ or <literal>pg_clear_collected_shared_advice()</literal>.
+ </para>
+
+ <para>
+ In addition to the query texts and advice strings, the advice collectors
+ will also store the OID of the role that caused the query to be planned,
+ the OID of the database in which the query was planned, the query ID,
+ and the time at which the collection occurred. This module does not
+ automatically enable query ID computation; therefore, if you want the
+ query ID value to be populated in collected advice, be sure to configure
+ <literal>compute_query_id = on</literal>. Otherwise, the query ID may
+ always show as <literal>0</literal>.
+ </para>
+
+ <sect2 id="pgcollectadvice-functions">
+ <title>Functions</title>
+
+ <variablelist>
+
+ <varlistentry>
+ <term>
+ <function>pg_clear_collected_local_advice() returns void</function>
+ <indexterm>
+ <primary>pg_clear_collected_local_advice</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Removes all collected query texts and advice strings from backend-local
+ memory.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <function>pg_get_collected_local_advice() returns setof (id bigint,
+ userid oid, dbid oid, queryid bigint, collection_time timestamptz,
+ query text, advice text)</function>
+ <indexterm>
+ <primary>pg_get_collected_local_advice</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Returns all query texts and advice strings stored in the local
+ advice collector.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <function>pg_clear_collected_shared_advice() returns void</function>
+ <indexterm>
+ <primary>pg_clear_collected_shared_advice</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Removes all collected query texts and advice strings from shared
+ memory.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <function>pg_get_collected_shared_advice() returns setof (id bigint,
+ userid oid, dbid oid, queryid bigint, collection_time timestamptz,
+ query text, advice text)</function>
+ <indexterm>
+ <primary>pg_get_collected_shared_advice</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Returns all query texts and advice strings stored in the shared
+ advice collector.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </sect2>
+
+ <sect2 id="pgcollectadvice-config-params">
+ <title>Configuration Parameters</title>
+
+ <variablelist>
+
+ <varlistentry>
+ <term>
+ <varname>pg_collect_advice.local_collector</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>pg_collect_advice.local_collector</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ <varname>pg_collect_advice.local_collector</varname> enables the
+ local advice collector. The default value is <literal>false</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <varname>pg_collect_advice.local_collection_limit</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>pg_collect_advice.local_collection_limit</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ <varname>pg_collect_advice.local_collection_limit</varname> sets the
+ maximum number of query texts and advice strings retained by the
+ local advice collector. The default value is <literal>0</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <varname>pg_collect_advice.shared_collector</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>pg_collect_advice.shared_collector</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ <varname>pg_collect_advice.shared_collector</varname> enables the
+ shared advice collector. The default value is <literal>false</literal>.
+ Only superusers and users with the appropriate <literal>SET</literal>
+ privilege can change this setting.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <varname>pg_collect_advice.shared_collection_limit</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>pg_collect_advice.shared_collection_limit</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ <varname>pg_collect_advice.shared_collection_limit</varname> sets the
+ maximum number of query texts and advice strings retained by the
+ shared advice collector. The default value is <literal>0</literal>.
+ Only superusers and users with the appropriate <literal>SET</literal>
+ privilege can change this setting.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </sect2>
+
+ <sect2 id="pgcollectadvice-author">
+ <title>Author</title>
+
+ <para>
+ Robert Haas <email>[email protected]</email>
+ </para>
+ </sect2>
+
+</sect1>
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 12f4b8a7bf8..7349c06a067 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -4019,6 +4019,12 @@ pg_uuid_t
pg_wchar
pg_wchar_tbl
pgp_armor_headers_state
+pgca_collected_advice
+pgca_local_advice
+pgca_local_advice_chunk
+pgca_shared_advice
+pgca_shared_advice_chunk
+pgca_shared_state
pgpa_advice_item
pgpa_advice_tag_type
pgpa_advice_target
--
2.51.0
[application/octet-stream] v22-0006-Add-pg_stash_advice-contrib-module.patch (58.7K, 7-v22-0006-Add-pg_stash_advice-contrib-module.patch)
download | inline diff:
From 8230c1b08dedb519e67d626d74cbab60f314bdc8 Mon Sep 17 00:00:00 2001
From: Robert Haas <[email protected]>
Date: Fri, 27 Feb 2026 16:58:14 -0500
Subject: [PATCH v22 6/6] Add pg_stash_advice contrib module.
This module allows plan advice strings to be provided automatically
from an in-memory advice stash. Advice stashes are stored in dynamic
shared memory and must be recreated and repopulated after a server
restart. If pg_stash_advice.stash_name is set to the name of an advice
stash, and if query identifiers are enabled, the query identifier
for each query will be looked up in the advice stash and the
associated advice string, if any, will be used each time that query
is planned.
---
contrib/Makefile | 1 +
contrib/meson.build | 1 +
contrib/pg_stash_advice/Makefile | 26 +
.../expected/pg_stash_advice.out | 328 +++++++
contrib/pg_stash_advice/meson.build | 35 +
.../pg_stash_advice/pg_stash_advice--1.0.sql | 43 +
contrib/pg_stash_advice/pg_stash_advice.c | 900 ++++++++++++++++++
.../pg_stash_advice/pg_stash_advice.control | 5 +
.../pg_stash_advice/sql/pg_stash_advice.sql | 147 +++
doc/src/sgml/contrib.sgml | 1 +
doc/src/sgml/filelist.sgml | 1 +
doc/src/sgml/pgstashadvice.sgml | 218 +++++
src/tools/pgindent/typedefs.list | 6 +
13 files changed, 1712 insertions(+)
create mode 100644 contrib/pg_stash_advice/Makefile
create mode 100644 contrib/pg_stash_advice/expected/pg_stash_advice.out
create mode 100644 contrib/pg_stash_advice/meson.build
create mode 100644 contrib/pg_stash_advice/pg_stash_advice--1.0.sql
create mode 100644 contrib/pg_stash_advice/pg_stash_advice.c
create mode 100644 contrib/pg_stash_advice/pg_stash_advice.control
create mode 100644 contrib/pg_stash_advice/sql/pg_stash_advice.sql
create mode 100644 doc/src/sgml/pgstashadvice.sgml
diff --git a/contrib/Makefile b/contrib/Makefile
index 22071034e51..06615e123f0 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -37,6 +37,7 @@ SUBDIRS = \
pg_overexplain \
pg_plan_advice \
pg_prewarm \
+ pg_stash_advice \
pg_stat_statements \
pg_surgery \
pg_trgm \
diff --git a/contrib/meson.build b/contrib/meson.build
index ff422d9b7fc..4862ba97ed1 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -52,6 +52,7 @@ subdir('pg_overexplain')
subdir('pg_plan_advice')
subdir('pg_prewarm')
subdir('pgrowlocks')
+subdir('pg_stash_advice')
subdir('pg_stat_statements')
subdir('pgstattuple')
subdir('pg_surgery')
diff --git a/contrib/pg_stash_advice/Makefile b/contrib/pg_stash_advice/Makefile
new file mode 100644
index 00000000000..cd9b7f30115
--- /dev/null
+++ b/contrib/pg_stash_advice/Makefile
@@ -0,0 +1,26 @@
+# contrib/pg_stash_advice/Makefile
+
+MODULE_big = pg_stash_advice
+OBJS = \
+ $(WIN32RES) \
+ pg_stash_advice.o
+
+EXTENSION = pg_stash_advice
+DATA = pg_stash_advice--1.0.sql
+PGFILEDESC = "pg_stash_advice - store and automatically apply plan advice"
+
+REGRESS = pg_stash_advice
+EXTRA_INSTALL = contrib/pg_plan_advice
+
+ifdef USE_PGXS
+PG_CPPFLAGS = -I$(includedir_server)/extension
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+PG_CPPFLAGS = -I$(top_srcdir)/contrib/pg_plan_advice
+subdir = contrib/pg_stash_advice
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/pg_stash_advice/expected/pg_stash_advice.out b/contrib/pg_stash_advice/expected/pg_stash_advice.out
new file mode 100644
index 00000000000..d1e93579d8a
--- /dev/null
+++ b/contrib/pg_stash_advice/expected/pg_stash_advice.out
@@ -0,0 +1,328 @@
+CREATE EXTENSION pg_stash_advice;
+SET compute_query_id = on;
+SET max_parallel_workers_per_gather = 0;
+-- Helper: extract query identifier from EXPLAIN VERBOSE output.
+CREATE OR REPLACE FUNCTION get_query_id(query_text text) RETURNS bigint
+LANGUAGE plpgsql AS $$
+DECLARE
+ line text;
+ qid bigint;
+BEGIN
+ FOR line IN EXECUTE 'EXPLAIN (VERBOSE, FORMAT TEXT) ' || query_text
+ LOOP
+ IF line ~ 'Query Identifier:' THEN
+ qid := regexp_replace(line, '.*Query Identifier:\s*(-?\d+).*', '\1')::bigint;
+ RETURN qid;
+ END IF;
+ END LOOP;
+ RAISE EXCEPTION 'Query Identifier not found in EXPLAIN output';
+END;
+$$;
+CREATE TABLE aa_dim1 (id integer primary key, dim1 text, val1 int)
+ WITH (autovacuum_enabled = false);
+INSERT INTO aa_dim1 (id, dim1, val1)
+ SELECT g, 'some filler text ' || g, (g % 3) + 1
+ FROM generate_series(1,100) g;
+VACUUM ANALYZE aa_dim1;
+CREATE TABLE aa_dim2 (id integer primary key, dim2 text, val2 int)
+ WITH (autovacuum_enabled = false);
+INSERT INTO aa_dim2 (id, dim2, val2)
+ SELECT g, 'some filler text ' || g, (g % 7) + 1
+ FROM generate_series(1,1000) g;
+VACUUM ANALYZE aa_dim2;
+CREATE TABLE aa_fact (
+ id int primary key,
+ dim1_id integer not null references aa_dim1 (id),
+ dim2_id integer not null references aa_dim2 (id)
+) WITH (autovacuum_enabled = false);
+INSERT INTO aa_fact
+ SELECT g, (g%100)+1, (g%100)+1 FROM generate_series(1,100000) g;
+VACUUM ANALYZE aa_fact;
+-- Get the query identifier.
+SELECT get_query_id($$
+SELECT * FROM aa_fact f LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+$$) AS qid \gset
+-- Create an advice stash and point pg_stash_advice at it.
+SELECT pg_create_advice_stash('regress_stash');
+ pg_create_advice_stash
+------------------------
+
+(1 row)
+
+SET pg_stash_advice.stash_name = 'regress_stash';
+-- Run our test query for the first time with no stashed advice.
+EXPLAIN (COSTS OFF)
+SELECT * FROM aa_fact f LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+ QUERY PLAN
+------------------------------------------
+ Hash Join
+ Hash Cond: (f.dim1_id = d1.id)
+ -> Hash Join
+ Hash Cond: (f.dim2_id = d2.id)
+ -> Seq Scan on aa_fact f
+ -> Hash
+ -> Seq Scan on aa_dim2 d2
+ Filter: (val2 = 1)
+ -> Hash
+ -> Seq Scan on aa_dim1 d1
+ Filter: (val1 = 1)
+(11 rows)
+
+-- Force an index scan on dim1
+SELECT pg_set_stashed_advice('regress_stash', :'qid',
+ 'INDEX_SCAN(d1 aa_dim1_pkey)');
+ pg_set_stashed_advice
+-----------------------
+
+(1 row)
+
+EXPLAIN (COSTS OFF) SELECT * FROM aa_fact f
+ LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+ QUERY PLAN
+---------------------------------------------------------
+ Hash Join
+ Hash Cond: (f.dim1_id = d1.id)
+ -> Hash Join
+ Hash Cond: (f.dim2_id = d2.id)
+ -> Seq Scan on aa_fact f
+ -> Hash
+ -> Seq Scan on aa_dim2 d2
+ Filter: (val2 = 1)
+ -> Hash
+ -> Index Scan using aa_dim1_pkey on aa_dim1 d1
+ Filter: (val1 = 1)
+ Supplied Plan Advice:
+ INDEX_SCAN(d1 aa_dim1_pkey) /* matched */
+(13 rows)
+
+-- Force an alternative join order
+SELECT pg_set_stashed_advice('regress_stash', :'qid',
+ 'join_order(f d1 d2)');
+ pg_set_stashed_advice
+-----------------------
+
+(1 row)
+
+EXPLAIN (COSTS OFF) SELECT * FROM aa_fact f
+ LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+ QUERY PLAN
+------------------------------------------
+ Hash Join
+ Hash Cond: (f.dim2_id = d2.id)
+ -> Hash Join
+ Hash Cond: (f.dim1_id = d1.id)
+ -> Seq Scan on aa_fact f
+ -> Hash
+ -> Seq Scan on aa_dim1 d1
+ Filter: (val1 = 1)
+ -> Hash
+ -> Seq Scan on aa_dim2 d2
+ Filter: (val2 = 1)
+ Supplied Plan Advice:
+ JOIN_ORDER(f d1 d2) /* matched */
+(13 rows)
+
+-- Force an alternative join strategy
+SELECT pg_set_stashed_advice('regress_stash', :'qid',
+ 'NESTED_LOOP_PLAIN(d1)');
+ pg_set_stashed_advice
+-----------------------
+
+(1 row)
+
+EXPLAIN (COSTS OFF) SELECT * FROM aa_fact f
+ LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+ QUERY PLAN
+---------------------------------------------------
+ Nested Loop
+ -> Hash Join
+ Hash Cond: (f.dim2_id = d2.id)
+ -> Seq Scan on aa_fact f
+ -> Hash
+ -> Seq Scan on aa_dim2 d2
+ Filter: (val2 = 1)
+ -> Index Scan using aa_dim1_pkey on aa_dim1 d1
+ Index Cond: (id = f.dim1_id)
+ Filter: (val1 = 1)
+ Supplied Plan Advice:
+ NESTED_LOOP_PLAIN(d1) /* matched */
+(12 rows)
+
+-- Add a useless extra entry to our test stash. Shouldn't change the result
+-- from the previous test.
+-- (If we're unlucky enough that this ever fails due to query ID actually
+-- being 1, then just put some other constant here. Seems unlikely.)
+SELECT pg_set_stashed_advice('regress_stash', 1, 'SEQ_SCAN(d1)');
+ pg_set_stashed_advice
+-----------------------
+
+(1 row)
+
+EXPLAIN (COSTS OFF) SELECT * FROM aa_fact f
+ LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+ QUERY PLAN
+---------------------------------------------------
+ Nested Loop
+ -> Hash Join
+ Hash Cond: (f.dim2_id = d2.id)
+ -> Seq Scan on aa_fact f
+ -> Hash
+ -> Seq Scan on aa_dim2 d2
+ Filter: (val2 = 1)
+ -> Index Scan using aa_dim1_pkey on aa_dim1 d1
+ Index Cond: (id = f.dim1_id)
+ Filter: (val1 = 1)
+ Supplied Plan Advice:
+ NESTED_LOOP_PLAIN(d1) /* matched */
+(12 rows)
+
+-- Try an empty stash to be sure it does nothing
+SELECT pg_create_advice_stash('regress_empty_stash');
+ pg_create_advice_stash
+------------------------
+
+(1 row)
+
+SET pg_stash_advice.stash_name = 'regress_empty_stash';
+EXPLAIN (COSTS OFF) SELECT * FROM aa_fact f
+ LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+ QUERY PLAN
+------------------------------------------
+ Hash Join
+ Hash Cond: (f.dim1_id = d1.id)
+ -> Hash Join
+ Hash Cond: (f.dim2_id = d2.id)
+ -> Seq Scan on aa_fact f
+ -> Hash
+ -> Seq Scan on aa_dim2 d2
+ Filter: (val2 = 1)
+ -> Hash
+ -> Seq Scan on aa_dim1 d1
+ Filter: (val1 = 1)
+(11 rows)
+
+-- Test that we can list each stash individually and all of them together,
+-- but not a nonexistent stash.
+SELECT * FROM pg_get_advice_stashes() ORDER BY stash_name;
+ stash_name | num_entries
+---------------------+-------------
+ regress_empty_stash | 0
+ regress_stash | 2
+(2 rows)
+
+SELECT stash_name, advice_string
+ FROM pg_get_advice_stash_contents('regress_stash') ORDER BY query_id;
+ stash_name | advice_string
+---------------+-----------------------
+ regress_stash | SEQ_SCAN(d1)
+ regress_stash | NESTED_LOOP_PLAIN(d1)
+(2 rows)
+
+SELECT stash_name, advice_string
+ FROM pg_get_advice_stash_contents('regress_empty_stash') ORDER BY query_id;
+ stash_name | advice_string
+------------+---------------
+(0 rows)
+
+SELECT stash_name, advice_string
+ FROM pg_get_advice_stash_contents(NULL) ORDER BY query_id;
+ stash_name | advice_string
+---------------+-----------------------
+ regress_stash | SEQ_SCAN(d1)
+ regress_stash | NESTED_LOOP_PLAIN(d1)
+(2 rows)
+
+SELECT stash_name, advice_string
+ FROM pg_get_advice_stash_contents('no_such_stash') ORDER BY query_id;
+ERROR: advice stash "no_such_stash" does not exist
+-- Test that we can remove advice.
+SELECT pg_set_stashed_advice('regress_stash', :'qid', null);
+ pg_set_stashed_advice
+-----------------------
+
+(1 row)
+
+SET pg_stash_advice.stash_name = 'regress_stash';
+EXPLAIN (COSTS OFF) SELECT * FROM aa_fact f
+ LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+ QUERY PLAN
+------------------------------------------
+ Hash Join
+ Hash Cond: (f.dim1_id = d1.id)
+ -> Hash Join
+ Hash Cond: (f.dim2_id = d2.id)
+ -> Seq Scan on aa_fact f
+ -> Hash
+ -> Seq Scan on aa_dim2 d2
+ Filter: (val2 = 1)
+ -> Hash
+ -> Seq Scan on aa_dim1 d1
+ Filter: (val1 = 1)
+(11 rows)
+
+SELECT * FROM pg_get_advice_stashes() ORDER BY stash_name;
+ stash_name | num_entries
+---------------------+-------------
+ regress_empty_stash | 0
+ regress_stash | 1
+(2 rows)
+
+SELECT stash_name, advice_string
+ FROM pg_get_advice_stash_contents('regress_stash') ORDER BY query_id;
+ stash_name | advice_string
+---------------+---------------
+ regress_stash | SEQ_SCAN(d1)
+(1 row)
+
+-- Can't create a stash that already exists, or drop one that doesn't.
+SELECT pg_create_advice_stash('regress_stash');
+ERROR: advice stash "regress_stash" already exists
+SELECT pg_drop_advice_stash('no_such_stash');
+ERROR: advice stash "no_such_stash" does not exist
+-- Can't add to or remove from a stash that does not exist.
+SELECT pg_set_stashed_advice('no_such_stash', 1, 'SEQ_SCAN(t)');
+ERROR: advice stash "no_such_stash" does not exist
+SELECT pg_set_stashed_advice('no_such_stash', 1, null);
+ERROR: advice stash "no_such_stash" does not exist
+-- Can't use query ID 0.
+SELECT pg_set_stashed_advice('regress_stash', 0, 'SEQ_SCAN(t)');
+ERROR: cannot set advice string for query ID 0
+-- Stash names must be non-empty, ASCII, and not too long.
+SELECT pg_create_advice_stash('');
+ERROR: advice stash name may not be zero length
+SELECT pg_create_advice_stash('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa');
+ERROR: advice stash names may not be longer than 63 bytes
+SELECT pg_create_advice_stash(E'caf\u00e9');
+ERROR: advice stash name must not contain non-ASCII characters
+SET pg_stash_advice.stash_name = 'café';
+ERROR: invalid value for parameter "pg_stash_advice.stash_name": "café"
+DETAIL: advice stash name must not contain non-ASCII characters
+-- Clean up state in dynamic shared memory.
+SELECT pg_drop_advice_stash('regress_stash');
+ pg_drop_advice_stash
+----------------------
+
+(1 row)
+
+SELECT pg_drop_advice_stash('regress_empty_stash');
+ pg_drop_advice_stash
+----------------------
+
+(1 row)
+
diff --git a/contrib/pg_stash_advice/meson.build b/contrib/pg_stash_advice/meson.build
new file mode 100644
index 00000000000..b666bcd0f1b
--- /dev/null
+++ b/contrib/pg_stash_advice/meson.build
@@ -0,0 +1,35 @@
+# Copyright (c) 2022-2026, PostgreSQL Global Development Group
+
+pg_stash_advice_sources = files(
+ 'pg_stash_advice.c'
+)
+
+if host_system == 'windows'
+ pg_stash_advice_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'pg_stash_advice',
+ '--FILEDESC', 'pg_stash_advice - store and automatically apply plan advice',])
+endif
+
+pg_stash_advice = shared_module('pg_stash_advice',
+ pg_stash_advice_sources,
+ include_directories: [pg_plan_advice_inc, include_directories('.')],
+ kwargs: contrib_mod_args,
+)
+contrib_targets += pg_stash_advice
+
+install_data(
+ 'pg_stash_advice--1.0.sql',
+ 'pg_stash_advice.control',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'pg_stash_advice',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'pg_stash_advice',
+ ],
+ },
+}
diff --git a/contrib/pg_stash_advice/pg_stash_advice--1.0.sql b/contrib/pg_stash_advice/pg_stash_advice--1.0.sql
new file mode 100644
index 00000000000..88dedd8ef1b
--- /dev/null
+++ b/contrib/pg_stash_advice/pg_stash_advice--1.0.sql
@@ -0,0 +1,43 @@
+/* contrib/pg_stash_advice/pg_stash_advice--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_stash_advice" to load this file. \quit
+
+CREATE FUNCTION pg_create_advice_stash(stash_name text)
+RETURNS void
+AS 'MODULE_PATHNAME', 'pg_create_advice_stash'
+LANGUAGE C STRICT;
+
+CREATE FUNCTION pg_drop_advice_stash(stash_name text)
+RETURNS void
+AS 'MODULE_PATHNAME', 'pg_drop_advice_stash'
+LANGUAGE C STRICT;
+
+CREATE FUNCTION pg_set_stashed_advice(stash_name text, query_id bigint,
+ advice_string text)
+RETURNS void
+AS 'MODULE_PATHNAME', 'pg_set_stashed_advice'
+LANGUAGE C;
+
+CREATE FUNCTION pg_get_advice_stashes(
+ OUT stash_name text,
+ OUT num_entries bigint
+)
+RETURNS SETOF record
+AS 'MODULE_PATHNAME', 'pg_get_advice_stashes'
+LANGUAGE C STRICT;
+
+CREATE FUNCTION pg_get_advice_stash_contents(
+ INOUT stash_name text,
+ OUT query_id bigint,
+ OUT advice_string text
+)
+RETURNS SETOF record
+AS 'MODULE_PATHNAME', 'pg_get_advice_stash_contents'
+LANGUAGE C;
+
+REVOKE ALL ON FUNCTION pg_create_advice_stash(text) FROM PUBLIC;
+REVOKE ALL ON FUNCTION pg_drop_advice_stash(text) FROM PUBLIC;
+REVOKE ALL ON FUNCTION pg_get_advice_stash_contents(text) FROM PUBLIC;
+REVOKE ALL ON FUNCTION pg_get_advice_stashes() FROM PUBLIC;
+REVOKE ALL ON FUNCTION pg_set_stashed_advice(text, bigint, text) FROM PUBLIC;
diff --git a/contrib/pg_stash_advice/pg_stash_advice.c b/contrib/pg_stash_advice/pg_stash_advice.c
new file mode 100644
index 00000000000..22122236694
--- /dev/null
+++ b/contrib/pg_stash_advice/pg_stash_advice.c
@@ -0,0 +1,900 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_stash_advice.c
+ * Apply plan advice automatically, without SQL modifications.
+ *
+ * This module allows plan advice strings (as used and generated by
+ * pg_plan_advice) to be "stashed" in dynamic shared memory and, from
+ * there, automatically be applied to queries as they are planned.
+ * You can create any number of advice stashes, each of which is
+ * identified by a human-readable, ASCII name, and each of them is
+ * essentially a query ID -> advice_string mapping.
+ *
+ * Copyright (c) 2016-2026, PostgreSQL Global Development Group
+ *
+ * contrib/pg_stash_advice/pg_stash_advice.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "common/hashfn.h"
+#include "common/string.h"
+#include "fmgr.h"
+#include "funcapi.h"
+#include "lib/dshash.h"
+#include "nodes/queryjumble.h"
+#include "pg_plan_advice.h"
+#include "storage/dsm_registry.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/guc.h"
+#include "utils/memutils.h"
+#include "utils/tuplestore.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(pg_create_advice_stash);
+PG_FUNCTION_INFO_V1(pg_drop_advice_stash);
+PG_FUNCTION_INFO_V1(pg_get_advice_stash_contents);
+PG_FUNCTION_INFO_V1(pg_get_advice_stashes);
+PG_FUNCTION_INFO_V1(pg_set_stashed_advice);
+
+typedef struct pgsa_shared_state
+{
+ LWLock lock;
+ int dsa_tranche;
+ int stash_tranche;
+ int entry_tranche;
+ uint64 next_stash_id;
+ dsa_handle area;
+ dshash_table_handle stash_hash;
+ dshash_table_handle entry_hash;
+} pgsa_shared_state;
+
+typedef struct pgsa_stash
+{
+ char name[NAMEDATALEN];
+ uint64 pgsa_stash_id;
+} pgsa_stash;
+
+typedef struct pgsa_entry_key
+{
+ uint64 pgsa_stash_id;
+ int64 queryId;
+} pgsa_entry_key;
+
+typedef struct pgsa_entry
+{
+ pgsa_entry_key key;
+ dsa_pointer advice_string;
+} pgsa_entry;
+
+typedef struct pgsa_stash_count
+{
+ uint32 status;
+ uint64 pgsa_stash_id;
+ int64 num_entries;
+} pgsa_stash_count;
+
+#define SH_PREFIX pgsa_stash_count_table
+#define SH_ELEMENT_TYPE pgsa_stash_count
+#define SH_KEY_TYPE uint64
+#define SH_KEY pgsa_stash_id
+#define SH_HASH_KEY(tb, key) hash_bytes((const unsigned char *) &(key), sizeof(uint64))
+#define SH_EQUAL(tb, a, b) (a == b)
+#define SH_SCOPE static inline
+#define SH_DEFINE
+#define SH_DECLARE
+#include "lib/simplehash.h"
+
+typedef struct pgsa_stash_name
+{
+ uint32 status;
+ uint64 pgsa_stash_id;
+ char *name;
+} pgsa_stash_name;
+
+#define SH_PREFIX pgsa_stash_name_table
+#define SH_ELEMENT_TYPE pgsa_stash_name
+#define SH_KEY_TYPE uint64
+#define SH_KEY pgsa_stash_id
+#define SH_HASH_KEY(tb, key) hash_bytes((const unsigned char *) &(key), sizeof(uint64))
+#define SH_EQUAL(tb, a, b) (a == b)
+#define SH_SCOPE static inline
+#define SH_DEFINE
+#define SH_DECLARE
+#include "lib/simplehash.h"
+
+/* Shared memory pointers */
+static pgsa_shared_state *pgsa_state;
+static dsa_area *pgsa_dsa_area;
+static dshash_table *pgsa_stash_dshash;
+static dshash_table *pgsa_entry_dshash;
+
+/* Shared memory hash table parameters */
+static dshash_parameters pgsa_stash_dshash_parameters = {
+ NAMEDATALEN,
+ sizeof(pgsa_stash),
+ dshash_strcmp,
+ dshash_strhash,
+ dshash_strcpy,
+ LWTRANCHE_INVALID /* gets set at runtime */
+};
+
+static dshash_parameters pgsa_entry_dshash_parameters = {
+ sizeof(pgsa_entry_key),
+ sizeof(pgsa_entry),
+ dshash_memcmp,
+ dshash_memhash,
+ dshash_memcpy,
+ LWTRANCHE_INVALID /* gets set at runtime */
+};
+
+/* GUC variable */
+static char *pg_stash_advice_stash_name = "";
+
+/* Other global variables */
+static MemoryContext pg_stash_advice_mcxt;
+
+/* Function prototypes */
+static char *pgsa_advisor(PlannerGlobal *glob,
+ Query *parse,
+ const char *query_string,
+ int cursorOptions,
+ ExplainState *es);
+static void pgsa_attach(void);
+static void pgsa_check_stash_name(char *stash_name);
+static bool pgsa_check_stash_name_guc(char **newval, void **extra,
+ GucSource source);
+static void pgsa_clear_advice_string(char *stash_name, int64 queryId);
+static void pgsa_create_stash(char *stash_name);
+static void pgsa_drop_stash(char *stash_name);
+static void pgsa_init_shared_state(void *ptr, void *arg);
+static uint64 pgsa_lookup_stash_id(char *stash_name);
+static void pgsa_set_advice_string(char *stash_name, int64 queryId,
+ char *advice_string);
+
+/*
+ * Initialize this module.
+ */
+void
+_PG_init(void)
+{
+ void (*add_advisor_fn) (pg_plan_advice_advisor_hook hook);
+
+ /* If compute_query_id = 'auto', we would like query IDs. */
+ EnableQueryId();
+
+ /* Define our GUCs. */
+ DefineCustomStringVariable("pg_stash_advice.stash_name",
+ "Name of the advice stash to be used in this session.",
+ NULL,
+ &pg_stash_advice_stash_name,
+ "",
+ PGC_USERSET,
+ 0,
+ pgsa_check_stash_name_guc,
+ NULL,
+ NULL);
+
+ MarkGUCPrefixReserved("pg_stash_advice");
+
+ /* Tell pg_plan_advice that we want to provide advice strings. */
+ add_advisor_fn =
+ load_external_function("pg_plan_advice", "pg_plan_advice_add_advisor",
+ true, NULL);
+ (*add_advisor_fn) (pgsa_advisor);
+}
+
+/*
+ * SQL-callable function to create an advice stash
+ */
+Datum
+pg_create_advice_stash(PG_FUNCTION_ARGS)
+{
+ char *stash_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+ pgsa_check_stash_name(stash_name);
+ if (unlikely(pgsa_entry_dshash == NULL))
+ pgsa_attach();
+ LWLockAcquire(&pgsa_state->lock, LW_EXCLUSIVE);
+ pgsa_create_stash(stash_name);
+ LWLockRelease(&pgsa_state->lock);
+ PG_RETURN_VOID();
+}
+
+/*
+ * SQL-callable function to drop an advice stash
+ */
+Datum
+pg_drop_advice_stash(PG_FUNCTION_ARGS)
+{
+ char *stash_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+ pgsa_check_stash_name(stash_name);
+ if (unlikely(pgsa_entry_dshash == NULL))
+ pgsa_attach();
+ LWLockAcquire(&pgsa_state->lock, LW_EXCLUSIVE);
+ pgsa_drop_stash(stash_name);
+ LWLockRelease(&pgsa_state->lock);
+ PG_RETURN_VOID();
+}
+
+/*
+ * SQL-callable function to provide a list of advice stashes
+ */
+Datum
+pg_get_advice_stashes(PG_FUNCTION_ARGS)
+{
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ dshash_seq_status iterator;
+ pgsa_entry *entry;
+ pgsa_stash *stash;
+ pgsa_stash_count_table_hash *chash;
+
+ InitMaterializedSRF(fcinfo, 0);
+
+ /* Attach to dynamic shared memory if not already done. */
+ if (unlikely(pgsa_entry_dshash == NULL))
+ pgsa_attach();
+
+ /* Tally up the number of entries per stash. */
+ chash = pgsa_stash_count_table_create(CurrentMemoryContext, 64, NULL);
+ dshash_seq_init(&iterator, pgsa_entry_dshash, true);
+ while ((entry = dshash_seq_next(&iterator)) != NULL)
+ {
+ pgsa_stash_count *c;
+ bool found;
+
+ c = pgsa_stash_count_table_insert(chash,
+ entry->key.pgsa_stash_id,
+ &found);
+ if (!found)
+ c->num_entries = 1;
+ else
+ c->num_entries++;
+ }
+ dshash_seq_term(&iterator);
+
+ /* Emit results. */
+ dshash_seq_init(&iterator, pgsa_stash_dshash, true);
+ while ((stash = dshash_seq_next(&iterator)) != NULL)
+ {
+ Datum values[2];
+ bool nulls[2];
+ pgsa_stash_count *c;
+
+ values[0] = CStringGetTextDatum(stash->name);
+ nulls[0] = false;
+
+ c = pgsa_stash_count_table_lookup(chash, stash->pgsa_stash_id);
+ values[1] = Int64GetDatum(c == NULL ? 0 : c->num_entries);
+ nulls[1] = false;
+
+ tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values,
+ nulls);
+ }
+ dshash_seq_term(&iterator);
+
+ return (Datum) 0;
+}
+
+/*
+ * SQL-callable function to provide advice stash contents
+ */
+Datum
+pg_get_advice_stash_contents(PG_FUNCTION_ARGS)
+{
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ dshash_seq_status iterator;
+ char *stash_name = NULL;
+ pgsa_stash_name_table_hash *nhash = NULL;
+ uint64 stash_id = 0;
+ pgsa_entry *entry;
+
+ InitMaterializedSRF(fcinfo, 0);
+
+ /* Attach to dynamic shared memory if not already done. */
+ if (unlikely(pgsa_entry_dshash == NULL))
+ pgsa_attach();
+
+ /* User can pass NULL for all stashes, or the name of a specific stash. */
+ if (!PG_ARGISNULL(0))
+ {
+ stash_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ pgsa_check_stash_name(stash_name);
+ stash_id = pgsa_lookup_stash_id(stash_name);
+
+ /* If the user specified a stash name, it should exist. */
+ if (stash_id == 0)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("advice stash \"%s\" does not exist", stash_name));
+ }
+ else
+ {
+ pgsa_stash *stash;
+
+ /*
+ * If we're dumping data about all stashes, we need an ID->name lookup
+ * table.
+ */
+ nhash = pgsa_stash_name_table_create(CurrentMemoryContext, 64, NULL);
+ dshash_seq_init(&iterator, pgsa_stash_dshash, true);
+ while ((stash = dshash_seq_next(&iterator)) != NULL)
+ {
+ pgsa_stash_name *n;
+ bool found;
+
+ n = pgsa_stash_name_table_insert(nhash,
+ stash->pgsa_stash_id,
+ &found);
+ Assert(!found);
+ n->name = pstrdup(stash->name);
+ }
+ dshash_seq_term(&iterator);
+ }
+
+ /* Now iterate over all the entries. */
+ dshash_seq_init(&iterator, pgsa_entry_dshash, false);
+ while ((entry = dshash_seq_next(&iterator)) != NULL)
+ {
+ Datum values[3];
+ bool nulls[3];
+ char *this_stash_name;
+ char *advice_string;
+
+ /* Skip incomplete entries where the advice string was never set. */
+ if (entry->advice_string == InvalidDsaPointer)
+ continue;
+
+ if (stash_id != 0)
+ {
+ /*
+ * We're only dumping data for one particular stash, so skip
+ * entries for any other stash and use the stash name specified by
+ * the user.
+ */
+ if (stash_id != entry->key.pgsa_stash_id)
+ continue;
+ this_stash_name = stash_name;
+ }
+ else
+ {
+ pgsa_stash_name *n;
+
+ /*
+ * We're dumping data for all stashes, so look up the correct name
+ * to use in the hash table. If nothing is found, which is
+ * possible due to race conditions, make up a string to use.
+ */
+ n = pgsa_stash_name_table_lookup(nhash, entry->key.pgsa_stash_id);
+ if (n != NULL)
+ this_stash_name = n->name;
+ else
+ this_stash_name = psprintf("<stash %" PRIu64 ">",
+ entry->key.pgsa_stash_id);
+ }
+
+ /* Work out tuple values. */
+ values[0] = CStringGetTextDatum(this_stash_name);
+ nulls[0] = false;
+ values[1] = Int64GetDatum(entry->key.queryId);
+ nulls[1] = false;
+ advice_string = dsa_get_address(pgsa_dsa_area, entry->advice_string);
+ values[2] = CStringGetTextDatum(advice_string);
+ nulls[2] = false;
+
+ /* Emit the tuple. */
+ tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values,
+ nulls);
+ }
+ dshash_seq_term(&iterator);
+
+ return (Datum) 0;
+}
+
+/*
+ * SQL-callable function to update an advice stash entry for a particular
+ * query ID
+ *
+ * If the second argument is NULL, we delete any existing advice stash
+ * entry; otherwise, we either create an entry or update it with the new
+ * advice string.
+ */
+Datum
+pg_set_stashed_advice(PG_FUNCTION_ARGS)
+{
+ char *stash_name;
+ int64 queryId;
+
+ if (PG_ARGISNULL(0) || PG_ARGISNULL(1))
+ PG_RETURN_NULL();
+
+ /* Get and check advice stash name. */
+ stash_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ pgsa_check_stash_name(stash_name);
+
+ /*
+ * Get and check query ID.
+ *
+ * queryID 0 means no query ID was computed, so reject that.
+ */
+ queryId = PG_GETARG_INT64(1);
+ if (queryId == 0)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("cannot set advice string for query ID 0"));
+
+ /* Attach to dynamic shared memory if not already done. */
+ if (unlikely(pgsa_entry_dshash == NULL))
+ pgsa_attach();
+
+ /* Now call the appropriate function to do the real work. */
+ if (PG_ARGISNULL(2))
+ pgsa_clear_advice_string(stash_name, queryId);
+ else
+ {
+ char *advice_string = text_to_cstring(PG_GETARG_TEXT_PP(2));
+
+ pgsa_set_advice_string(stash_name, queryId, advice_string);
+ }
+
+ PG_RETURN_VOID();
+}
+
+/*
+ * Get the advice string that has been configured for this query, if any,
+ * and return it. Otherwise, return NULL.
+ */
+static char *
+pgsa_advisor(PlannerGlobal *glob, Query *parse,
+ const char *query_string, int cursorOptions,
+ ExplainState *es)
+{
+ pgsa_entry_key key;
+ pgsa_entry *entry;
+ char *advice_string;
+ uint64 stash_id;
+
+ /*
+ * Exit quickly if the stash name is empty or there's no query ID.
+ */
+ if (pg_stash_advice_stash_name[0] == '\0' || parse->queryId == 0)
+ return NULL;
+
+ /* Attach to dynamic shared memory if not already done. */
+ if (unlikely(pgsa_entry_dshash == NULL))
+ pgsa_attach();
+
+ /*
+ * Translate pg_stash_advice.stash_name to an integer ID.
+ *
+ * pgsa_check_stash_name_guc() has already validated the advice stash
+ * name, so we don't need to call pgsa_check_stash_name() here.
+ */
+ stash_id = pgsa_lookup_stash_id(pg_stash_advice_stash_name);
+ if (stash_id == 0)
+ return NULL;
+
+ /*
+ * Look up the advice string for the given stash ID + query ID.
+ *
+ * If we find an advice string, we copy it into the current memory
+ * context, presumably short-lived, so that we can release the lock on the
+ * dshash entry. pg_plan_advice only needs the value to remain allocated
+ * long enough for it to be parsed, so this should be good enough.
+ */
+ memset(&key, 0, sizeof(pgsa_entry_key));
+ key.pgsa_stash_id = stash_id;
+ key.queryId = parse->queryId;
+ entry = dshash_find(pgsa_entry_dshash, &key, false);
+ if (entry == NULL)
+ return NULL;
+ if (entry->advice_string == InvalidDsaPointer)
+ advice_string = NULL;
+ else
+ advice_string = pstrdup(dsa_get_address(pgsa_dsa_area,
+ entry->advice_string));
+ dshash_release_lock(pgsa_entry_dshash, entry);
+
+ /* If we found an advice string, emit a debug message. */
+ if (advice_string != NULL)
+ elog(DEBUG2, "supplying automatic advice for stash \"%s\", query ID %" PRId64 ": %s",
+ pg_stash_advice_stash_name, key.queryId, advice_string);
+
+ return advice_string;
+}
+
+/*
+ * Attach to various structures in dynamic shared memory.
+ *
+ * This function is designed to be resilient against errors. That is, if it
+ * fails partway through, it should be possible to call it again, repeat no
+ * work already completed, and potentially succeed or at least get further if
+ * whatever caused the previous failure has been corrected.
+ */
+static void
+pgsa_attach(void)
+{
+ bool found;
+ MemoryContext oldcontext;
+
+ /*
+ * Create a memory context to make sure that any control structures
+ * allocated in local memory are sufficiently persistent.
+ */
+ if (pg_stash_advice_mcxt == NULL)
+ pg_stash_advice_mcxt = AllocSetContextCreate(TopMemoryContext,
+ "pg_stash_advice",
+ ALLOCSET_DEFAULT_SIZES);
+ oldcontext = MemoryContextSwitchTo(pg_stash_advice_mcxt);
+
+ /* Attach to the fixed-size state object if not already done. */
+ if (pgsa_state == NULL)
+ pgsa_state = GetNamedDSMSegment("pg_stash_advice",
+ sizeof(pgsa_shared_state),
+ pgsa_init_shared_state,
+ &found, NULL);
+
+ /* Attach to the DSA area if not already done. */
+ if (pgsa_dsa_area == NULL)
+ {
+ dsa_handle area_handle;
+
+ LWLockAcquire(&pgsa_state->lock, LW_EXCLUSIVE);
+ area_handle = pgsa_state->area;
+ if (area_handle == DSA_HANDLE_INVALID)
+ {
+ pgsa_dsa_area = dsa_create(pgsa_state->dsa_tranche);
+ dsa_pin(pgsa_dsa_area);
+ pgsa_state->area = dsa_get_handle(pgsa_dsa_area);
+ LWLockRelease(&pgsa_state->lock);
+ }
+ else
+ {
+ LWLockRelease(&pgsa_state->lock);
+ pgsa_dsa_area = dsa_attach(area_handle);
+ }
+ dsa_pin_mapping(pgsa_dsa_area);
+ }
+
+ /* Attach to the stash_name->stash_id hash table if not already done. */
+ if (pgsa_stash_dshash == NULL)
+ {
+ dshash_table_handle stash_handle;
+
+ LWLockAcquire(&pgsa_state->lock, LW_EXCLUSIVE);
+ pgsa_stash_dshash_parameters.tranche_id = pgsa_state->stash_tranche;
+ stash_handle = pgsa_state->stash_hash;
+ if (stash_handle == DSHASH_HANDLE_INVALID)
+ {
+ pgsa_stash_dshash = dshash_create(pgsa_dsa_area,
+ &pgsa_stash_dshash_parameters,
+ NULL);
+ pgsa_state->stash_hash =
+ dshash_get_hash_table_handle(pgsa_stash_dshash);
+ LWLockRelease(&pgsa_state->lock);
+ }
+ else
+ {
+ LWLockRelease(&pgsa_state->lock);
+ pgsa_stash_dshash = dshash_attach(pgsa_dsa_area,
+ &pgsa_stash_dshash_parameters,
+ stash_handle, NULL);
+ }
+ }
+
+ /* Attach to the entry hash table if not already done. */
+ if (pgsa_entry_dshash == NULL)
+ {
+ dshash_table_handle entry_handle;
+
+ LWLockAcquire(&pgsa_state->lock, LW_EXCLUSIVE);
+ pgsa_entry_dshash_parameters.tranche_id = pgsa_state->entry_tranche;
+ entry_handle = pgsa_state->entry_hash;
+ if (entry_handle == DSHASH_HANDLE_INVALID)
+ {
+ pgsa_entry_dshash = dshash_create(pgsa_dsa_area,
+ &pgsa_entry_dshash_parameters,
+ NULL);
+ pgsa_state->entry_hash =
+ dshash_get_hash_table_handle(pgsa_entry_dshash);
+ LWLockRelease(&pgsa_state->lock);
+ }
+ else
+ {
+ LWLockRelease(&pgsa_state->lock);
+ pgsa_entry_dshash = dshash_attach(pgsa_dsa_area,
+ &pgsa_entry_dshash_parameters,
+ entry_handle, NULL);
+ }
+ }
+
+ /* Restore previous memory context. */
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Check whether an advice stash name is legal, and signal an error if not.
+ *
+ * Keep this in sync with pgsa_check_stash_name_guc, below.
+ */
+static void
+pgsa_check_stash_name(char *stash_name)
+{
+ /* Reject empty advice stash name. */
+ if (stash_name[0] == '\0')
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("advice stash name may not be zero length"));
+
+ /* Reject overlong advice stash names. */
+ if (strlen(stash_name) + 1 > NAMEDATALEN)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("advice stash names may not be longer than %d bytes",
+ NAMEDATALEN - 1));
+
+ /*
+ * Reject non-ASCII advice stash names, since advice stashes are visible
+ * across all databases and the encodings of those databases might differ.
+ */
+ if (!pg_is_ascii(stash_name))
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("advice stash name must not contain non-ASCII characters"));
+}
+
+/*
+ * As above, but for the GUC check_hook. We allow the empty string here,
+ * though, as equivalent to disabling the feature.
+ */
+static bool
+pgsa_check_stash_name_guc(char **newval, void **extra, GucSource source)
+{
+ char *stash_name = *newval;
+
+ /* Reject overlong advice stash names. */
+ if (strlen(stash_name) + 1 > NAMEDATALEN)
+ {
+ GUC_check_errcode(ERRCODE_INVALID_PARAMETER_VALUE);
+ GUC_check_errdetail("advice stash names may not be longer than %d bytes",
+ NAMEDATALEN - 1);
+ return false;
+ }
+
+ /*
+ * Reject non-ASCII advice stash names, since advice stashes are visible
+ * across all databases and the encodings of those databases might differ.
+ */
+ if (!pg_is_ascii(stash_name))
+ {
+ GUC_check_errcode(ERRCODE_INVALID_PARAMETER_VALUE);
+ GUC_check_errdetail("advice stash name must not contain non-ASCII characters");
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * Create an advice stash.
+ */
+static void
+pgsa_create_stash(char *stash_name)
+{
+ pgsa_stash *stash;
+ bool found;
+
+ Assert(LWLockHeldByMeInMode(&pgsa_state->lock, LW_EXCLUSIVE));
+
+ /* Create a stash with this name, unless one already exists. */
+ stash = dshash_find_or_insert(pgsa_stash_dshash, stash_name, &found);
+ if (found)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("advice stash \"%s\" already exists", stash_name));
+ stash->pgsa_stash_id = pgsa_state->next_stash_id++;
+ dshash_release_lock(pgsa_stash_dshash, stash);
+}
+
+/*
+ * Remove any stored advice string for the given advice stash and query ID.
+ */
+static void
+pgsa_clear_advice_string(char *stash_name, int64 queryId)
+{
+ pgsa_entry *entry;
+ pgsa_entry_key key;
+ uint64 stash_id;
+ dsa_pointer old_dp;
+
+ /* Translate the stash name to an integer ID. */
+ if ((stash_id = pgsa_lookup_stash_id(stash_name)) == 0)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("advice stash \"%s\" does not exist", stash_name));
+
+ /*
+ * Look for an existing entry, and free it. But, be sure to save the
+ * pointer to the associated advice string, if any.
+ */
+ memset(&key, 0, sizeof(pgsa_entry_key));
+ key.pgsa_stash_id = stash_id;
+ key.queryId = queryId;
+ entry = dshash_find(pgsa_entry_dshash, &key, true);
+ if (entry == NULL)
+ old_dp = InvalidDsaPointer;
+ else
+ {
+ old_dp = entry->advice_string;
+ dshash_delete_entry(pgsa_entry_dshash, entry);
+ }
+
+ /* Now we free the advice string as well, if there was one. */
+ if (old_dp != InvalidDsaPointer)
+ dsa_free(pgsa_dsa_area, old_dp);
+}
+
+/*
+ * Drop an advice stash.
+ */
+static void
+pgsa_drop_stash(char *stash_name)
+{
+ pgsa_entry *entry;
+ pgsa_stash *stash;
+ dshash_seq_status iterator;
+ uint64 stash_id;
+
+ Assert(LWLockHeldByMeInMode(&pgsa_state->lock, LW_EXCLUSIVE));
+
+ /* Remove the entry for this advice stash. */
+ stash = dshash_find(pgsa_stash_dshash, stash_name, true);
+ if (stash == NULL)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("advice stash \"%s\" does not exist", stash_name));
+ stash_id = stash->pgsa_stash_id;
+ dshash_delete_entry(pgsa_stash_dshash, stash);
+
+ /*
+ * Now remove all the entries. Since pgsa_state->lock must be held at
+ * least in shared mode to insert entries into pgsa_entry_dshash, it
+ * doesn't matter whether we do this before or after deleting the entry
+ * from pgsa_stash_dshash.
+ */
+ dshash_seq_init(&iterator, pgsa_entry_dshash, true);
+ while ((entry = dshash_seq_next(&iterator)) != NULL)
+ {
+ if (stash_id == entry->key.pgsa_stash_id)
+ {
+ if (entry->advice_string != InvalidDsaPointer)
+ dsa_free(pgsa_dsa_area, entry->advice_string);
+ dshash_delete_current(&iterator);
+ }
+ }
+ dshash_seq_term(&iterator);
+}
+
+/*
+ * Initialize shared state when first created.
+ */
+static void
+pgsa_init_shared_state(void *ptr, void *arg)
+{
+ pgsa_shared_state *state = (pgsa_shared_state *) ptr;
+
+ LWLockInitialize(&state->lock,
+ LWLockNewTrancheId("pg_stash_advice_lock"));
+ state->dsa_tranche = LWLockNewTrancheId("pg_stash_advice_dsa");
+ state->stash_tranche = LWLockNewTrancheId("pg_stash_advice_stash");
+ state->entry_tranche = LWLockNewTrancheId("pg_stash_advice_entry");
+ state->next_stash_id = UINT64CONST(1);
+ state->area = DSA_HANDLE_INVALID;
+ state->stash_hash = DSHASH_HANDLE_INVALID;
+ state->entry_hash = DSHASH_HANDLE_INVALID;
+}
+
+/*
+ * Look up the integer ID that corresponds to the given stash name.
+ *
+ * Returns 0 if no such stash exists.
+ */
+static uint64
+pgsa_lookup_stash_id(char *stash_name)
+{
+ pgsa_stash *stash;
+ uint64 stash_id;
+
+ /* Search the shared hash table. */
+ stash = dshash_find(pgsa_stash_dshash, stash_name, false);
+ if (stash == NULL)
+ return 0;
+ stash_id = stash->pgsa_stash_id;
+ dshash_release_lock(pgsa_stash_dshash, stash);
+
+ return stash_id;
+}
+
+/*
+ * Store a new or updated advice string for the given advice stash and query ID.
+ */
+static void
+pgsa_set_advice_string(char *stash_name, int64 queryId, char *advice_string)
+{
+ pgsa_entry *entry;
+ bool found;
+ pgsa_entry_key key;
+ uint64 stash_id;
+ dsa_pointer new_dp;
+ dsa_pointer old_dp;
+
+ /*
+ * The work we need to do in this function is basically simple, but the
+ * danger of a server-lifespan DSA memory leak is very real. Acquiring a
+ * lock here helps for two reasons.
+ *
+ * First, it holds off interrupts, so that we can't bail out of this code
+ * after allocating DSA memory for the advice string and before storing
+ * the resulting pointer somewhere that others can find it.
+ *
+ * Second, we need to avoid a race against pgsa_drop_stash(). That
+ * function removes a stash_name->stash_id mapping and all the entries for
+ * that stash_id. Without the lock, there's a race condition no matter
+ * which of those things it does first, because as soon as we've looked up
+ * the stash ID, that whole function can execute before we do the rest of
+ * our work, which would result in us adding an entry for a stash that no
+ * longer exists.
+ */
+ LWLockAcquire(&pgsa_state->lock, LW_SHARED);
+
+ /* Look up the stash ID. */
+ if ((stash_id = pgsa_lookup_stash_id(stash_name)) == 0)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("advice stash \"%s\" does not exist", stash_name));
+
+ /* Allocate space for the advice string. */
+ new_dp = dsa_allocate(pgsa_dsa_area, strlen(advice_string) + 1);
+ strcpy(dsa_get_address(pgsa_dsa_area, new_dp), advice_string);
+
+ /* Attempt to insert an entry into the hash table. */
+ memset(&key, 0, sizeof(pgsa_entry_key));
+ key.pgsa_stash_id = stash_id;
+ key.queryId = queryId;
+ entry = dshash_find_or_insert_extended(pgsa_entry_dshash, &key, &found,
+ DSHASH_INSERT_NO_OOM);
+
+ /*
+ * If it didn't work, bail out, being careful to free the shared memory
+ * we've already allocated before throwing an error, since error cleanup
+ * will not do so.
+ */
+ if (entry == NULL)
+ {
+ dsa_free(pgsa_dsa_area, new_dp);
+ ereport(ERROR,
+ errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory"),
+ errdetail("could not insert advice string into shared hash table"));
+ }
+
+ /* Update the entry and release the lock. */
+ old_dp = found ? entry->advice_string : InvalidDsaPointer;
+ entry->advice_string = new_dp;
+ dshash_release_lock(pgsa_entry_dshash, entry);
+
+ /*
+ * We're not safe from leaks yet!
+ *
+ * There's now a pointer to new_dp in the entry that we just updated, but
+ * that means that there's no longer anything pointing to old_dp. Free it
+ * first, and then we can release our last LWLock, allowing interrupts.
+ */
+ if (DsaPointerIsValid(old_dp))
+ dsa_free(pgsa_dsa_area, old_dp);
+ LWLockRelease(&pgsa_state->lock);
+}
diff --git a/contrib/pg_stash_advice/pg_stash_advice.control b/contrib/pg_stash_advice/pg_stash_advice.control
new file mode 100644
index 00000000000..4a0fff5c866
--- /dev/null
+++ b/contrib/pg_stash_advice/pg_stash_advice.control
@@ -0,0 +1,5 @@
+# pg_stash_advice extension
+comment = 'store and automatically apply plan advice'
+default_version = '1.0'
+module_pathname = '$libdir/pg_stash_advice'
+relocatable = true
diff --git a/contrib/pg_stash_advice/sql/pg_stash_advice.sql b/contrib/pg_stash_advice/sql/pg_stash_advice.sql
new file mode 100644
index 00000000000..3f6bfb83114
--- /dev/null
+++ b/contrib/pg_stash_advice/sql/pg_stash_advice.sql
@@ -0,0 +1,147 @@
+CREATE EXTENSION pg_stash_advice;
+SET compute_query_id = on;
+SET max_parallel_workers_per_gather = 0;
+
+-- Helper: extract query identifier from EXPLAIN VERBOSE output.
+CREATE OR REPLACE FUNCTION get_query_id(query_text text) RETURNS bigint
+LANGUAGE plpgsql AS $$
+DECLARE
+ line text;
+ qid bigint;
+BEGIN
+ FOR line IN EXECUTE 'EXPLAIN (VERBOSE, FORMAT TEXT) ' || query_text
+ LOOP
+ IF line ~ 'Query Identifier:' THEN
+ qid := regexp_replace(line, '.*Query Identifier:\s*(-?\d+).*', '\1')::bigint;
+ RETURN qid;
+ END IF;
+ END LOOP;
+ RAISE EXCEPTION 'Query Identifier not found in EXPLAIN output';
+END;
+$$;
+
+CREATE TABLE aa_dim1 (id integer primary key, dim1 text, val1 int)
+ WITH (autovacuum_enabled = false);
+INSERT INTO aa_dim1 (id, dim1, val1)
+ SELECT g, 'some filler text ' || g, (g % 3) + 1
+ FROM generate_series(1,100) g;
+VACUUM ANALYZE aa_dim1;
+
+CREATE TABLE aa_dim2 (id integer primary key, dim2 text, val2 int)
+ WITH (autovacuum_enabled = false);
+INSERT INTO aa_dim2 (id, dim2, val2)
+ SELECT g, 'some filler text ' || g, (g % 7) + 1
+ FROM generate_series(1,1000) g;
+VACUUM ANALYZE aa_dim2;
+
+CREATE TABLE aa_fact (
+ id int primary key,
+ dim1_id integer not null references aa_dim1 (id),
+ dim2_id integer not null references aa_dim2 (id)
+) WITH (autovacuum_enabled = false);
+INSERT INTO aa_fact
+ SELECT g, (g%100)+1, (g%100)+1 FROM generate_series(1,100000) g;
+VACUUM ANALYZE aa_fact;
+
+-- Get the query identifier.
+SELECT get_query_id($$
+SELECT * FROM aa_fact f LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+$$) AS qid \gset
+
+-- Create an advice stash and point pg_stash_advice at it.
+SELECT pg_create_advice_stash('regress_stash');
+SET pg_stash_advice.stash_name = 'regress_stash';
+
+-- Run our test query for the first time with no stashed advice.
+EXPLAIN (COSTS OFF)
+SELECT * FROM aa_fact f LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+
+-- Force an index scan on dim1
+SELECT pg_set_stashed_advice('regress_stash', :'qid',
+ 'INDEX_SCAN(d1 aa_dim1_pkey)');
+EXPLAIN (COSTS OFF) SELECT * FROM aa_fact f
+ LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+
+-- Force an alternative join order
+SELECT pg_set_stashed_advice('regress_stash', :'qid',
+ 'join_order(f d1 d2)');
+EXPLAIN (COSTS OFF) SELECT * FROM aa_fact f
+ LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+
+-- Force an alternative join strategy
+SELECT pg_set_stashed_advice('regress_stash', :'qid',
+ 'NESTED_LOOP_PLAIN(d1)');
+EXPLAIN (COSTS OFF) SELECT * FROM aa_fact f
+ LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+
+-- Add a useless extra entry to our test stash. Shouldn't change the result
+-- from the previous test.
+-- (If we're unlucky enough that this ever fails due to query ID actually
+-- being 1, then just put some other constant here. Seems unlikely.)
+SELECT pg_set_stashed_advice('regress_stash', 1, 'SEQ_SCAN(d1)');
+EXPLAIN (COSTS OFF) SELECT * FROM aa_fact f
+ LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+
+-- Try an empty stash to be sure it does nothing
+SELECT pg_create_advice_stash('regress_empty_stash');
+SET pg_stash_advice.stash_name = 'regress_empty_stash';
+EXPLAIN (COSTS OFF) SELECT * FROM aa_fact f
+ LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+
+-- Test that we can list each stash individually and all of them together,
+-- but not a nonexistent stash.
+SELECT * FROM pg_get_advice_stashes() ORDER BY stash_name;
+SELECT stash_name, advice_string
+ FROM pg_get_advice_stash_contents('regress_stash') ORDER BY query_id;
+SELECT stash_name, advice_string
+ FROM pg_get_advice_stash_contents('regress_empty_stash') ORDER BY query_id;
+SELECT stash_name, advice_string
+ FROM pg_get_advice_stash_contents(NULL) ORDER BY query_id;
+SELECT stash_name, advice_string
+ FROM pg_get_advice_stash_contents('no_such_stash') ORDER BY query_id;
+
+-- Test that we can remove advice.
+SELECT pg_set_stashed_advice('regress_stash', :'qid', null);
+SET pg_stash_advice.stash_name = 'regress_stash';
+EXPLAIN (COSTS OFF) SELECT * FROM aa_fact f
+ LEFT JOIN aa_dim1 d1 ON f.dim1_id = d1.id
+ LEFT JOIN aa_dim2 d2 ON f.dim2_id = d2.id
+ WHERE val1 = 1 AND val2 = 1;
+SELECT * FROM pg_get_advice_stashes() ORDER BY stash_name;
+SELECT stash_name, advice_string
+ FROM pg_get_advice_stash_contents('regress_stash') ORDER BY query_id;
+
+-- Can't create a stash that already exists, or drop one that doesn't.
+SELECT pg_create_advice_stash('regress_stash');
+SELECT pg_drop_advice_stash('no_such_stash');
+
+-- Can't add to or remove from a stash that does not exist.
+SELECT pg_set_stashed_advice('no_such_stash', 1, 'SEQ_SCAN(t)');
+SELECT pg_set_stashed_advice('no_such_stash', 1, null);
+
+-- Can't use query ID 0.
+SELECT pg_set_stashed_advice('regress_stash', 0, 'SEQ_SCAN(t)');
+
+-- Stash names must be non-empty, ASCII, and not too long.
+SELECT pg_create_advice_stash('');
+SELECT pg_create_advice_stash('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa');
+SELECT pg_create_advice_stash(E'caf\u00e9');
+SET pg_stash_advice.stash_name = 'café';
+
+-- Clean up state in dynamic shared memory.
+SELECT pg_drop_advice_stash('regress_stash');
+SELECT pg_drop_advice_stash('regress_empty_stash');
diff --git a/doc/src/sgml/contrib.sgml b/doc/src/sgml/contrib.sgml
index 2ab6fafbab1..8f09d728698 100644
--- a/doc/src/sgml/contrib.sgml
+++ b/doc/src/sgml/contrib.sgml
@@ -160,6 +160,7 @@ CREATE EXTENSION <replaceable>extension_name</replaceable>;
&pgplanadvice;
&pgprewarm;
&pgrowlocks;
+ &pgstashadvice;
&pgstatstatements;
&pgstattuple;
&pgsurgery;
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 407ff3abffe..8c14bab84e9 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -144,6 +144,7 @@
<!ENTITY oid2name SYSTEM "oid2name.sgml">
<!ENTITY pageinspect SYSTEM "pageinspect.sgml">
<!ENTITY passwordcheck SYSTEM "passwordcheck.sgml">
+<!ENTITY pgstashadvice SYSTEM "pgstashadvice.sgml">
<!ENTITY pgbuffercache SYSTEM "pgbuffercache.sgml">
<!ENTITY pgcollectadvice SYSTEM "pgcollectadvice.sgml">
<!ENTITY pgcrypto SYSTEM "pgcrypto.sgml">
diff --git a/doc/src/sgml/pgstashadvice.sgml b/doc/src/sgml/pgstashadvice.sgml
new file mode 100644
index 00000000000..089fc66446f
--- /dev/null
+++ b/doc/src/sgml/pgstashadvice.sgml
@@ -0,0 +1,218 @@
+<!-- doc/src/sgml/pgstashadvice.sgml -->
+
+<sect1 id="pgstashadvice" xreflabel="pg_stash_advice">
+ <title>pg_stash_advice — store and automatically apply plan advice</title>
+
+ <indexterm zone="pgstashadvice">
+ <primary>pg_stash_advice</primary>
+ </indexterm>
+
+ <para>
+ The <filename>pg_stash_advice</filename> extension allows you to stash
+ <link linkend="pgplanadvice">plan advice</link> strings in dynamic
+ shared memory where they can be automatically applied. An
+ <literal>advice stash</literal> is a mapping from
+ <link linkend="guc-compute-query-id">query identifiers</link> to plan advice
+ strings. Whenever a session is asked to plan a query whose query ID appears
+ in the relevant advice stash, the plan advice string is automatically applied
+ to guide planning. Note that advice stashes exist purely in memory. This
+ means both that it is important to be mindful of memory consumption when
+ deciding how much plan advice to stash, and also that advice stashes must
+ be recreated and repopulated whenever the server is restarted.
+ </para>
+
+ <para>
+ In order to use this module, you will need to execute
+ <literal>CREATE EXTENSION pg_stash_advice</literal> in at least
+ one database, so that you have access to the SQL functions to manage
+ advice stashes. You will also need the <literal>pg_stash_advice</literal>
+ module to be loaded in all sessions where you want this module to
+ automatically apply advice. It will usually be best to do this by adding
+ <literal>pg_stash_advice</literal> to
+ <xref linkend="guc-shared-preload-libraries"/> and restarting the server.
+ </para>
+
+ <para>
+ Once you have met the above criteria, you can create advice stashes
+ using the <literal>pg_create_advice_stash</literal> function described
+ below and set the plan advice for a given query ID in a given stash using
+ the <literal>pg_set_stashed_advice</literal> function. Then, you need
+ only configure <literal>pg_stash_advice.stash_name</literal> to point
+ to the chosen advice stash name. For some use cases, rather than setting
+ this on a system-wide basis, you may find it helpful to use
+ <literal>ALTER DATABASE ... SET</literal> or
+ <literal>ALTER ROLE ... SET</literal> to configure values that will apply
+ only to a database or only to a certain role. Likewise, it may sometimes
+ be better to set the stash name in a particular session using
+ <literal>SET</literal>.
+ </para>
+
+ <para>
+ Because <literal>pg_stash_advice</literal> works on the basis of query
+ identifiers, you will need to determine the query identifier for each query
+ whose plan you wish to control. You will also need to determine the advice
+ string that you wish to store for each query. One way to do this is to use
+ <literal>EXPLAIN</literal>: the <literal>VERBOSE</literal> option will
+ show the query ID, and the <literal>PLAN_ADVICE</literal> option will
+ show plan advice. <xref linkend="pgcollectadvice" /> can be used to
+ obtain this information for an entire workload, although care must be
+ taken since it can use up a lot of memory very quickly. Query identifiers can
+ also be obtained through tools such as <xref linkend="pgstatstatements" />
+ or <xref linkend="monitoring-pg-stat-activity-view" />, but these tools
+ will not provide plan advice strings. Note that
+ <xref linkend="guc-compute-query-id" /> must be enabled for query
+ identifiers to be computed; if set to <literal>auto</literal>, loading
+ <literal>pg_stash_advice</literal> will enable it automatically.
+ </para>
+
+ <para>
+ Generally, the fact that the planner is able to change query plans as
+ the underlying distribution of data changes is a feature, not a bug.
+ Moreover, applying plan advice can have a noticeable performance cost even
+ when it does not result in a change to the query plan. Therefore, it is
+ a good idea to use this feature only when and to the extent needed.
+ Plan advice strings can be trimmed down to mention only those aspects
+ of the plan that need to be controlled, and used only for queries where
+ there is believed to be a significant risk of planner error.
+ </para>
+
+ <para>
+ Note that <literal>pg_stash_advice</literal> currently lacks a sophisticated
+ security model. Only the superuser, or a user to whom the superuser has
+ granted <literal>EXECUTE</literal> permission on the relevant functions,
+ may create advice stashes or alter their contents, but any user may set
+ <literal>pg_stash_advice.stash_name</literal> for their session, and this
+ may reveal the contents of any advice stash with that name. Users should
+ assume that information embedded in stashed advice strings may become visible
+ to nonprivileged users.
+ </para>
+
+ <sect2 id="pgstashadvice-functions">
+ <title>Functions</title>
+
+ <variablelist>
+
+ <varlistentry>
+ <term>
+ <function>pg_create_advice_stash(stash_name text) returns void</function>
+ <indexterm>
+ <primary>pg_create_advice_stash</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Creates a new, empty advice stash with the given name.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <function>pg_drop_advice_stash(stash_name text) returns void</function>
+ <indexterm>
+ <primary>pg_drop_advice_stash</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Drops the named advice stash and all of its entries.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <function>pg_set_stashed_advice(stash_name text, query_id bigint,
+ advice_string text) returns void</function>
+ <indexterm>
+ <primary>pg_set_stashed_advice</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Stores an advice string in the named advice stash, associated with
+ the given query identifier. If an entry for that query identifier
+ already exists in the stash, it is replaced. If
+ <parameter>advice_string</parameter> is <literal>NULL</literal>,
+ any existing entry for that query identifier is removed.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <function>pg_get_advice_stashes() returns setof (stash_name text,
+ num_entries bigint)</function>
+ <indexterm>
+ <primary>pg_get_advice_stashes</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Returns one row for each advice stash, showing the stash name and
+ the number of entries it contains.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <function>pg_get_advice_stash_contents(stash_name text) returns setof
+ (stash_name text, query_id bigint, advice_string text)</function>
+ <indexterm>
+ <primary>pg_get_advice_stash_contents</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Returns one row for each entry in the named advice stash. If
+ <parameter>stash_name</parameter> is <literal>NULL</literal>, returns
+ entries from all stashes.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </sect2>
+
+ <sect2 id="pgstashadvice-config-params">
+ <title>Configuration Parameters</title>
+
+ <variablelist>
+
+ <varlistentry>
+ <term>
+ <varname>pg_stash_advice.stash_name</varname> (<type>string</type>)
+ <indexterm>
+ <primary><varname>pg_stash_advice.stash_name</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Specifies the name of the advice stash to consult during query
+ planning. The default value is the empty string, which disables
+ this module.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </sect2>
+
+ <sect2 id="pgstashadvice-author">
+ <title>Author</title>
+
+ <para>
+ Robert Haas <email>[email protected]</email>
+ </para>
+ </sect2>
+
+</sect1>
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 7349c06a067..0000e56d55c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -4056,6 +4056,12 @@ pgpa_trove_lookup_type
pgpa_trove_result
pgpa_trove_slice
pgpa_unrolled_join
+pgsa_entry
+pgsa_entry_key
+pgsa_shared_state
+pgsa_stash
+pgsa_stash_count
+pgsa_stash_name
pgsocket
pgsql_thing_t
pgssEntry
--
2.51.0
view thread (184+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: pg_plan_advice
In-Reply-To: <CA+TgmoaK=4w7-qknUo3QhUJ53pXZq=c=KgZmRyD+k7ytqfmgSg@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox