From: Amit Langote <[email protected]>
To: Robert Haas <[email protected]>
Cc: Alvaro Herrera <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Daniel Gustafsson <[email protected]>
Cc: David Rowley <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Thom Brown <[email protected]>
Cc: Tom Lane <[email protected]>
Subject: Re: generic plans and "initial" pruning
Date: Thu, 19 Sep 2024 21:10:04 +0900
Message-ID: <CA+HiwqFGz2uShfU=qtack9gii6Kzyjv1V66tJJBYBN8Acb4uTA@mail.gmail.com>
In-Reply-To: <CA+HiwqGBpw_JNwkwZjQ2YaqTWrDjn9L5jpuc+nS8=a55SPD+UA@mail.gmail.com>
References: <CA+HiwqFpZ80UJKr4tZus4Omgg7YESzFXKSwSHRW2Ap2=XSVyUA@mail.gmail.com>
<[email protected]>
<CA+HiwqF+3tv=tuB9EVfOj9YcXhSq477X+1RKOpJ5JqCCj3qgww@mail.gmail.com>
<CA+TgmobHL_vTjOdy6KVMVeW-CQQmXXz_yU6Q9d2YjnVfFxuy6A@mail.gmail.com>
<CA+HiwqHL=aGU9Y4RYXQ5VCp4L5NVdiaQLLoXN3NCQQQMKo0ByQ@mail.gmail.com>
<CA+TgmoabYD=pnccFLzbbREFsqkFgE4EZ+FdHoTOhgCqn4jP2Cw@mail.gmail.com>
<CA+HiwqE_rQ9pZnkXeoHdds2kgAiT7XNNHZW8gTGicBdXv0rwnw@mail.gmail.com>
<CA+TgmoY2drv9PmrRAC7AR77mkx09sOh-+5qJkHB_hLKeHRNqzQ@mail.gmail.com>
<CA+HiwqHkjicWzfAjB6_SVsVmKF6omQ4EBHr+GTUgJNN7WiUDag@mail.gmail.com>
<CA+TgmoaZxb4JTimK8MmbXEeCwtzyfx7uGYjq565s2pY9i1GN+Q@mail.gmail.com>
<CA+HiwqHzKO9FT-CjFWo6OmkiCSYmbPspKXVex96tOBKf6S_x_w@mail.gmail.com>
<CA+TgmoZGWyMXutfen-NNv9=QM7eCHn9R1bpLZ9N4sRURMOCK2A@mail.gmail.com>
<CA+HiwqHNb9jrwOFHfmASfiGc=SnqXs7THwQ_Rta=z+ognYV8qw@mail.gmail.com>
<CA+HiwqH9u1RWn9OEa=VQQpJagB0hDLCY+=fSyBC4ZkeU6Gg2HA@mail.gmail.com>
<CA+HiwqFMWt2MQVqhp2rZA8=ugPVD=5uW10QCdK_NpoyWyFLe-g@mail.gmail.com>
<CA+HiwqGBpw_JNwkwZjQ2YaqTWrDjn9L5jpuc+nS8=a55SPD+UA@mail.gmail.com>
On Thu, Sep 19, 2024 at 5:39 PM Amit Langote <[email protected]> wrote:
> For
> ResultRelInfos, I took the approach of memsetting them to 0 for pruned
> result relations and adding checks at multiple sites to ensure the
> ResultRelInfo being handled is valid.
After some reflection, I realized that this approach is not very
robust. In the attached, I've modified
ExecInitModifyTable() to allocate ResultRelInfos only for unpruned
relations, instead of allocating them for all relations in
ModifyTable.resultRelations and zeroing the pruned ones. The new
approach feels more robust.
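
For readers following along, the difference between the two approaches can be illustrated with a small standalone sketch. This is not code from the patch: DemoResultRel, DemoModifyState, and demo_init_result_rels() are hypothetical stand-ins, and plain malloc() is assumed in place of the executor's memory contexts. The point is only that the array holds entries for surviving relations alone, so no caller has to test for a zeroed slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-relation executor state. */
typedef struct DemoResultRel
{
	int			rti;		/* range table index of the result relation */
} DemoResultRel;

/* Hypothetical stand-in for the ModifyTable node's run-time state. */
typedef struct DemoModifyState
{
	int			nrels;		/* number of unpruned result relations */
	DemoResultRel *rels;	/* one entry per unpruned relation only */
} DemoModifyState;

/*
 * Build state only for result relations that survived pruning.
 *
 * 'rtis' lists all result relations in plan order and 'unpruned[i]' says
 * whether rtis[i] survived.  Returns false if allocation fails.
 */
bool
demo_init_result_rels(DemoModifyState *state, const int *rtis,
					  const bool *unpruned, int ntotal)
{
	int			nkept = 0;
	int			i;

	/* First pass: count survivors so the array is sized exactly. */
	for (i = 0; i < ntotal; i++)
	{
		if (unpruned[i])
			nkept++;
	}

	state->nrels = 0;
	state->rels = NULL;
	if (nkept == 0)
		return true;

	state->rels = malloc((size_t) nkept * sizeof(DemoResultRel));
	if (state->rels == NULL)
		return false;

	/* Second pass: fill in entries for survivors, preserving plan order. */
	for (i = 0; i < ntotal; i++)
	{
		if (unpruned[i])
			state->rels[state->nrels++].rti = rtis[i];
	}

	assert(state->nrels == nkept);
	return true;
}
```

With four result relations of which the second and third are pruned, the resulting array has two entries, and every entry in it is valid by construction.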
--
Thanks, Amit Langote
Attachments:
[application/octet-stream] v55-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch (19.9K, 2-v55-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch)
From cf75d48323a3c28d272e34c942f123a2e04044fd Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Fri, 6 Sep 2024 13:11:05 +0900
Subject: [PATCH v55 1/5] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
This change moves PartitionPruneInfo from individual plan nodes to
PlannedStmt, allowing runtime initial pruning to be performed across
the entire plan tree without traversing the tree to find nodes
containing PartitionPruneInfos.
The PartitionPruneInfo pointer fields in Append and MergeAppend nodes
have been replaced with an integer index that points to
PartitionPruneInfos in a list within PlannedStmt, which holds the
PartitionPruneInfos for all subqueries.
Reviewed-by: Alvaro Herrera
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 19 +++++-
src/backend/executor/execUtils.c | 1 +
src/backend/executor/nodeAppend.c | 5 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/optimizer/plan/createplan.c | 24 +++----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 86 ++++++++++++++++---------
src/backend/partitioning/partprune.c | 19 ++++--
src/include/executor/execPartition.h | 4 +-
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 6 ++
src/include/nodes/plannodes.h | 14 ++--
src/include/partitioning/partprune.h | 8 +--
15 files changed, 133 insertions(+), 62 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 7042ca6c60..e6197c165e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -850,6 +850,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index bfb3419efb..b01a2fdfdd 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -181,6 +181,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->permInfos = estate->es_rteperminfos;
pstmt->resultRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 7651886229..ec730674f2 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1786,6 +1786,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* Initialize data structure needed for run-time partition pruning and
* do initial pruning if needed
*
+ * 'root_parent_relids' identifies the relation to which both the parent plan
+ * and the PartitionPruneInfo given by 'part_prune_index' belong.
+ *
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
* Initial pruning is performed here if needed and in that case only the
@@ -1798,11 +1801,25 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo;
+
+ /* Obtain the pruneinfo we need, and make sure it's the right one */
+ pruneinfo = list_nth_node(PartitionPruneInfo, estate->es_part_prune_infos,
+ part_prune_index);
+ if (!bms_equal(root_parent_relids, pruneinfo->root_parent_relids))
+ ereport(ERROR,
+ errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg_internal("mismatching PartitionPruneInfo found at part_prune_index %d",
+ part_prune_index),
+ errdetail_internal("plan node relids %s, pruneinfo relids %s",
+ bmsToString(root_parent_relids),
+ bmsToString(pruneinfo->root_parent_relids)));
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 5737f9f4eb..67734979b0 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -118,6 +118,7 @@ CreateExecutorState(void)
estate->es_rowmarks = NULL;
estate->es_rteperminfos = NIL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index ca0f54d676..de7ebab5c2 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index e1b9b984a7..3ed91808dd 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index bb45ef318f..6642d09a39 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1225,7 +1225,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1376,6 +1375,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1399,16 +1401,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1447,7 +1447,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1540,6 +1539,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1555,13 +1557,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
Assert(best_path->path.param_info == NULL);
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index df35d1ff9c..1b9071c774 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -547,6 +547,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 91c7c4fe2f..e2ea406c4e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1732,6 +1732,48 @@ set_customscan_references(PlannerInfo *root,
cscan->custom_relids = offset_relid_set(cscan->custom_relids, rtoffset);
}
+/*
+ * register_partpruneinfo
+ * Subroutine for set_append_references and set_mergeappend_references
+ *
+ * Add the PartitionPruneInfo from root->partPruneInfos at the given index
+ * into PlannerGlobal->partPruneInfos and return its index there.
+ *
+ * Also update the RT indexes present in PartitionedRelPruneInfos to add the
+ * offset.
+ */
+static int
+register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
+{
+ PlannerGlobal *glob = root->glob;
+ PartitionPruneInfo *pinfo;
+ ListCell *l;
+
+ Assert(part_prune_index >= 0 &&
+ part_prune_index < list_length(root->partPruneInfos));
+ pinfo = list_nth_node(PartitionPruneInfo, root->partPruneInfos,
+ part_prune_index);
+
+ pinfo->root_parent_relids = offset_relid_set(pinfo->root_parent_relids,
+ rtoffset);
+ foreach(l, pinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+
+ prelinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pinfo);
+
+ return list_length(glob->partPruneInfos) - 1;
+}
+
/*
* set_append_references
* Do set_plan_references processing on an Append
@@ -1784,21 +1826,13 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index =
+ register_partpruneinfo(root, aplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1860,21 +1894,13 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index =
+ register_partpruneinfo(root, mplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9a1a7faac7..60fabb1734 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -207,16 +207,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -330,10 +334,11 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
+ pruneinfo->root_parent_relids = parentrel->relids;
pruneinfo->prune_infos = prunerelinfos;
/*
@@ -356,7 +361,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index c09bc83b2a..12aacc84ff 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,9 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 88467977f8..22b928e085 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -636,6 +636,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 07e2415398..8d30b6e896 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -128,6 +128,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* List of PartitionPruneInfo contained in the plan */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -559,6 +562,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 62cd6a6666..39d0281c23 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -69,6 +69,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in the
+ * plan */
+
List *rtable; /* list of RangeTblEntry nodes */
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
@@ -276,8 +279,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -311,8 +314,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
@@ -1414,6 +1417,8 @@ typedef struct PlanRowMark
* Then, since an Append-type node could have multiple partitioning
* hierarchies among its children, we have an unordered List of those Lists.
*
+ * root_parent_relids RelOptInfo.relids of the relation to which the parent
+ * plan node and this PartitionPruneInfo node belong
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
@@ -1426,6 +1431,7 @@ typedef struct PartitionPruneInfo
pg_node_attr(no_equal, no_query_jumble)
NodeTag type;
+ Bitmapset *root_parent_relids;
List *prune_infos;
Bitmapset *other_subplans;
} PartitionPruneInfo;
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index bd490d154f..c536a1fe19 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.43.0
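
The index-plus-validation scheme introduced by the 0001 patch above can be sketched in standalone C. This is a simplified illustration under stated assumptions, not the patch's code: the Demo* types and demo_lookup_pruneinfo() are hypothetical, an unsigned bitmask stands in for a Bitmapset, a plain array stands in for the List in PlannedStmt, and a status code stands in for the ereport(ERROR) that the patch raises on a mismatch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PartitionPruneInfo. */
typedef struct DemoPruneInfo
{
	unsigned	root_parent_relids; /* bitmask standing in for a Bitmapset */
} DemoPruneInfo;

/* Hypothetical stand-in for the statement-level list of prune infos. */
typedef struct DemoStmt
{
	const DemoPruneInfo *infos;
	int			ninfos;
} DemoStmt;

/* Hypothetical stand-in for an Append or MergeAppend plan node. */
typedef struct DemoAppend
{
	unsigned	apprelids;
	int			part_prune_index;	/* -1 means no run-time pruning */
} DemoAppend;

typedef enum DemoLookupStatus
{
	DEMO_LOOKUP_NONE,		/* node does no run-time pruning */
	DEMO_LOOKUP_OK,			/* *out points at the matching entry */
	DEMO_LOOKUP_MISMATCH	/* index out of range or relids differ */
} DemoLookupStatus;

/*
 * Fetch the prune info a plan node refers to, and check that the entry
 * really belongs to that node by comparing relid sets.
 */
DemoLookupStatus
demo_lookup_pruneinfo(const DemoStmt *stmt, const DemoAppend *node,
					  const DemoPruneInfo **out)
{
	const DemoPruneInfo *info;

	*out = NULL;

	if (node->part_prune_index < 0)
		return DEMO_LOOKUP_NONE;
	if (node->part_prune_index >= stmt->ninfos)
		return DEMO_LOOKUP_MISMATCH;

	info = &stmt->infos[node->part_prune_index];
	if (info->root_parent_relids != node->apprelids)
		return DEMO_LOOKUP_MISMATCH;

	*out = info;
	return DEMO_LOOKUP_OK;
}
```

The relids comparison is what catches a stale or wrongly renumbered index; without it, an off-by-one in the index would silently select another node's pruning data.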
[application/octet-stream] v55-0003-Initialize-PartitionPruneContext-for-exec-prunin.patch (11.8K, 3-v55-0003-Initialize-PartitionPruneContext-for-exec-prunin.patch)
From 92d87cdbb3ad675ac6ffa2767f1d7d5876bd5369 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 18 Sep 2024 11:16:48 +0900
Subject: [PATCH v55 3/5] Initialize PartitionPruneContext for exec pruning
lazily
Currently, ExecInitPartitionPruning() iterates over PartitionPruningDatas
and nested PartitionedRelPruningDatas in a PartitionPruneState solely
to initialize the exec_context of the PartitionedRelPruningData.
This commit moves the initialization to find_matching_subplans_recurse(),
where the exec_context is actually needed, eliminating the need for
the above iteration. To track whether the context has been initialized
and is ready for use, a boolean field is_valid is added to
PartitionPruneContext.
---
src/backend/executor/execPartition.c | 166 ++++++++++-----------------
src/include/executor/execPartition.h | 1 +
src/include/partitioning/partprune.h | 2 +
3 files changed, 65 insertions(+), 104 deletions(-)
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 3c7c631867..d9fa593785 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -190,10 +190,8 @@ static void InitPartitionPruneContext(PartitionPruneContext *context,
static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
Bitmapset *initially_valid_subplans,
int n_total_subplans);
-static void PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
- PartitionPruneState *prunestate,
- PlanState *planstate);
-static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
+static void find_matching_subplans_recurse(PlanState *parent_plan,
+ PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
Bitmapset **validsubplans);
@@ -1830,13 +1828,14 @@ ExecInitPartitionPruning(PlanState *planstate,
/*
* ExecDoInitialPruning() must have initialized the PartitionPruneState to
- * perform the initial pruning. Now we simply need to initialize the
- * context information for exec pruning.
+ * perform the initial pruning. Store PlanState so that the exec_context
+ * can be initialized using it later when find_matching_subplans_recurse()
+ * needs it.
*/
prunestate = list_nth(estate->es_part_prune_states, part_prune_index);
Assert(prunestate != NULL);
if (prunestate->do_exec_prune)
- PartitionPruneInitExecPruning(pruneinfo, prunestate, planstate);
+ prunestate->parent_plan = planstate;
/* Use the result of initial pruning done by ExecDoInitialPruning(). */
if (prunestate->do_initial_prune)
@@ -1893,8 +1892,7 @@ ExecInitPartitionPruning(PlanState *planstate,
* each PartitionedRelPruningData) for initial pruning here. Execution pruning
* requires access to the parent plan node's PlanState, which is not available
* when this function is called from ExecDoInitialPruning(), so it is
- * initialized later during ExecInitPartitionPruning() by calling
- * PartitionPruneInitExecPruning().
+ * initialized lazily during find_matching_subplans_recurse().
*/
PartitionPruneState *
ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
@@ -2099,25 +2097,30 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
}
/*
- * The exec pruning context will be initialized in
- * ExecInitPartitionPruning() when called during the initialization
- * of the parent plan node.
+ * The exec pruning context will be initialized lazily when it
+ * will be used for the first time in
+ * find_matching_subplans_recurse().
*
- * pprune->exec_pruning_steps is set to NIL to prevent
- * ExecFindMatchingSubPlans() from accessing an uninitialized
- * pprune->exec_context during the initial pruning by
- * ExecDoInitialPruning().
- *
- * prunestate->do_exec_prune is set to indicate whether
- * PartitionPruneInitExecPruning() needs to be called by
- * ExecInitPartitionPruning(). This optimization avoids
- * unnecessary cycles when only initial pruning is required.
+ * prunestate->do_exec_prune is set to indicate whether we're
+ * actually going to perform exec pruning to inform
+ * ExecInitPartitionPruning() whether it should fix the
+ * subplan_map array based on the result of initial pruning
+ * and also the parent node's code to allow it set up its
+ * data structure accordingly.
*/
- pprune->exec_pruning_steps = NIL;
+ pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
+ pprune->exec_context.is_valid = false;
if (pinfo->exec_pruning_steps &&
!(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
prunestate->do_exec_prune = true;
+ /*
+ * Accumulate the IDs of all PARAM_EXEC Params affecting the
+ * partitioning decisions at this plan node.
+ */
+ prunestate->execparamids = bms_add_members(prunestate->execparamids,
+ pinfo->execparamids);
+
j++;
}
i++;
@@ -2208,6 +2211,8 @@ InitPartitionPruneContext(PartitionPruneContext *context,
}
}
}
+
+ context->is_valid = true;
}
/*
@@ -2326,84 +2331,6 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
pfree(new_subplan_indexes);
}
-/*
- * PartitionPruneInitExecPruning
- * Initialize PartitionPruneState for exec pruning.
- */
-static void
-PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
- PartitionPruneState *prunestate,
- PlanState *planstate)
-{
- EState *estate = planstate->state;
- int i;
- ExprContext *econtext;
-
- /* CreatePartitionPruneState() must have initialized. */
- Assert(estate->es_partition_directory != NULL);
-
- /* CreatePartitionPruneState() must have set this. */
- Assert(prunestate->do_exec_prune);
-
- /*
- * Create ExprContext if not already done for the planstate. We may need
- * an expression context to evaluate partition exprs.
- */
- ExecAssignExprContext(estate, planstate);
- econtext = planstate->ps_ExprContext;
- for (i = 0; i < prunestate->num_partprunedata; i++)
- {
- List *partrel_pruneinfos =
- list_nth_node(List, pruneinfo->prune_infos, i);
- PartitionPruningData *prunedata = prunestate->partprunedata[i];
- int j;
-
- for (j = 0; j < prunedata->num_partrelprunedata; j++)
- {
- PartitionedRelPruneInfo *pinfo =
- list_nth_node(PartitionedRelPruneInfo, partrel_pruneinfos, j);
- PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
- Relation partrel = pprune->partrel;
- PartitionDesc partdesc;
- PartitionKey partkey;
-
- /*
- * Nothing to do if there are no exec pruning steps, but do set
- * pprune->exec_pruning_steps, becasue
- * find_matching_subplans_recurse() looks at it.
- *
- * Also skip if doing EXPLAIN (GENERIC_PLAN), since parameter
- * values may be missing.
- */
- pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pprune->exec_pruning_steps == NIL ||
- (econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
- continue;
-
- /*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
- */
- partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
- InitPartitionPruneContext(&pprune->exec_context,
- pprune->exec_pruning_steps,
- partdesc, partkey, planstate,
- econtext);
-
- /*
- * Accumulate the IDs of all PARAM_EXEC Params affecting the
- * partitioning decisions at this plan node.
- */
- prunestate->execparamids = bms_add_members(prunestate->execparamids,
- pinfo->execparamids);
- }
- }
-}
-
/*
* ExecFindMatchingSubPlans
* Determine which subplans match the pruning steps detailed in
@@ -2449,12 +2376,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* recursing to other (lower-level) parents as needed.
*/
pprune = &prunedata->partrelprunedata[0];
- find_matching_subplans_recurse(prunedata, pprune, initial_prune,
+ find_matching_subplans_recurse(prunestate->parent_plan,
+ prunedata, pprune, initial_prune,
&result);
/* Expression eval may have used space in ExprContext too */
- if (pprune->exec_pruning_steps)
+ if (pprune->exec_context.is_valid)
+ {
+ Assert(pprune->exec_pruning_steps != NIL);
ResetExprContext(pprune->exec_context.exprcontext);
+ }
}
/* Add in any subplans that partition pruning didn't account for */
@@ -2477,7 +2408,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* Adds valid (non-prunable) subplan IDs to *validsubplans
*/
static void
-find_matching_subplans_recurse(PartitionPruningData *prunedata,
+find_matching_subplans_recurse(PlanState *parent_plan,
+ PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
Bitmapset **validsubplans)
@@ -2497,8 +2429,33 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
partset = get_matching_partitions(&pprune->initial_context,
pprune->initial_pruning_steps);
else if (!initial_prune && pprune->exec_pruning_steps)
+ {
+ /* Initialize exec_context if not already done. */
+ if (unlikely(!pprune->exec_context.is_valid))
+ {
+ ExprContext *econtext;
+ EState *estate = parent_plan->state;
+ /* Must allocate the needed stuff in the query lifetime context. */
+ MemoryContext oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ Relation partrel = pprune->partrel;
+ PartitionKey partkey = RelationGetPartitionKey(partrel);
+ PartitionDesc partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
+ partrel);
+
+ if (parent_plan->ps_ExprContext == NULL)
+ ExecAssignExprContext(estate, parent_plan);
+ econtext = parent_plan->ps_ExprContext;
+
+ InitPartitionPruneContext(&pprune->exec_context,
+ pprune->exec_pruning_steps,
+ partdesc, partkey, parent_plan,
+ econtext);
+
+ MemoryContextSwitchTo(oldcxt);
+ }
partset = get_matching_partitions(&pprune->exec_context,
pprune->exec_pruning_steps);
+ }
else
partset = pprune->present_parts;
@@ -2514,7 +2471,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
int partidx = pprune->subpart_map[i];
if (partidx >= 0)
- find_matching_subplans_recurse(prunedata,
+ find_matching_subplans_recurse(parent_plan,
+ prunedata,
&prunedata->partrelprunedata[partidx],
initial_prune, validsubplans);
else
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 2f45ac1cc8..ef6d8b2d48 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -122,6 +122,7 @@ typedef struct PartitionPruneState
bool do_initial_prune;
bool do_exec_prune;
int num_partprunedata;
+ PlanState *parent_plan;
PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
} PartitionPruneState;
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index c536a1fe19..b7f48eefcc 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -26,6 +26,7 @@ struct RelOptInfo;
* Stores information needed at runtime for pruning computations
* related to a single partitioned table.
*
+ * is_valid Has the information in this struct been initialized?
* strategy Partition strategy, e.g. LIST, RANGE, HASH.
* partnatts Number of columns in the partition key.
* nparts Number of partitions in this partitioned table.
@@ -48,6 +49,7 @@ struct RelOptInfo;
*/
typedef struct PartitionPruneContext
{
+ bool is_valid;
char strategy;
int partnatts;
int nparts;
--
2.43.0
[application/octet-stream] v55-0002-Perform-runtime-initial-pruning-outside-ExecInit.patch (17.3K, 4-v55-0002-Perform-runtime-initial-pruning-outside-ExecInit.patch)
From 808126517d4b0018ee96de1ba28ea664566fd1aa Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 12 Sep 2024 15:44:43 +0900
Subject: [PATCH v55 2/5] Perform runtime initial pruning outside
ExecInitNode()
This commit follows up on the previous change that moved
PartitionPruneInfos out of individual plan nodes into a list in
PlannedStmt. It moves the initialization of PartitionPruneStates
and runtime initial pruning out of ExecInitNode() and into a new
routine, ExecDoInitialPruning(), which is called by InitPlan()
before ExecInitNode() is invoked on the main plan tree and subplans.
ExecDoInitialPruning() stores the PartitionPruneStates that it
creates for initial pruning, which are reused later for exec pruning,
in a list matching the length of es_part_prune_infos (which holds the
PartitionPruneInfos from PlannedStmt), allowing both lists to share
the same index. It also saves the initial pruning result -- a
bitmapset of indexes for surviving child subnodes -- in a similarly
indexed list.
While the initial pruning is done earlier, the execution pruning
context information (needed for runtime pruning) is initialized
later during ExecInitNode() for the parent plan node, as it requires
access to the parent node's PlanState struct.
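The index-sharing arrangement described above can be sketched in standalone C. This is only a toy model, not the patch's code: the ToyPruneInfo, ToyPruneState and ToyEState types and both functions are hypothetical, fixed-size arrays stand in for the executor's Lists, and a uint64_t mask stands in for a Bitmapset. It shows a result slot being filled for every entry, even when no initial pruning is done, so that the same index works for all three collections.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PRUNE_INFOS 8

/* Hypothetical stand-in for PartitionPruneInfo (one per prunable plan node). */
typedef struct ToyPruneInfo
{
    bool        has_initial_steps;  /* any "initial" pruning steps? */
    uint64_t    surviving_mask;     /* toy outcome of running those steps */
    int         nsubplans;          /* number of child subplans, < 64 */
} ToyPruneInfo;

/* Hypothetical stand-in for PartitionPruneState. */
typedef struct ToyPruneState
{
    bool        do_initial_prune;
} ToyPruneState;

/* Three collections indexed alike, modeling the es_part_prune_* lists. */
typedef struct ToyEState
{
    int         ninfos;
    const ToyPruneInfo *infos;                  /* es_part_prune_infos */
    ToyPruneState states[MAX_PRUNE_INFOS];      /* es_part_prune_states */
    uint64_t    results[MAX_PRUNE_INFOS];       /* es_part_prune_results */
    bool        has_result[MAX_PRUNE_INFOS];    /* false models a NULL entry */
} ToyEState;

/* Runs before any plan node is initialized: one state and one result per info. */
static void
toy_do_initial_pruning(ToyEState *estate)
{
    assert(estate->ninfos <= MAX_PRUNE_INFOS);
    for (int i = 0; i < estate->ninfos; i++)
    {
        const ToyPruneInfo *info = &estate->infos[i];

        estate->states[i].do_initial_prune = info->has_initial_steps;
        /* Always fill the slot so that index i stays valid in every array. */
        estate->has_result[i] = info->has_initial_steps;
        estate->results[i] = info->has_initial_steps ? info->surviving_mask : 0;
    }
}

/* Runs later, from the parent node's init: looks things up by index only. */
static uint64_t
toy_init_partition_pruning(const ToyEState *estate, int part_prune_index)
{
    int         n;

    assert(part_prune_index >= 0 && part_prune_index < estate->ninfos);
    if (estate->states[part_prune_index].do_initial_prune)
        return estate->results[part_prune_index];

    /* No initial pruning was done, so every subplan must be initialized. */
    n = estate->infos[part_prune_index].nsubplans;
    assert(n >= 0 && n < 64);
    return (UINT64_C(1) << n) - 1;
}
```

In this sketch a caller would run toy_do_initial_pruning() once and then call toy_init_partition_pruning() with each node's index, mirroring the order in which InitPlan() and the node init functions run.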
---
src/backend/executor/execMain.c | 55 ++++++++
src/backend/executor/execPartition.c | 179 +++++++++++++++++++++------
src/include/executor/execPartition.h | 6 +
src/include/nodes/execnodes.h | 2 +
4 files changed, 202 insertions(+), 40 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index e6197c165e..1994112b2e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -46,6 +46,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/executor.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "mb/pg_wchar.h"
@@ -818,6 +819,54 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
PreventCommandIfParallelMode(CreateCommandName((Node *) plannedstmt));
}
+/*
+ * ExecDoInitialPruning
+ * Perform runtime "initial" pruning, if necessary, to determine the set
+ * of child subnodes that need to be initialized during ExecInitNode()
+ * for plan nodes that support partition pruning.
+ *
+ * For each PartitionPruneInfo in estate->es_part_prune_infos, this function
+ * creates a PartitionPruneState (even if no initial pruning is done) and adds
+ * it to es_part_prune_states. For PartitionPruneInfo entries that include
+ * initial pruning steps, the result of those steps is saved as a bitmapset
+ * of indexes representing child subnodes that are "valid" and should be
+ * initialized for execution.
+ */
+static void
+ExecDoInitialPruning(EState *estate)
+{
+ ListCell *lc;
+
+ foreach(lc, estate->es_part_prune_infos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+ PartitionPruneState *prunestate;
+ Bitmapset *validsubplans = NULL;
+
+ /*
+ * Create the working data structure for pruning, and save it for use
+ * later in ExecInitPartitionPruning(), which will be called by the
+ * parent plan node's ExecInit* function.
+ */
+ prunestate = ExecCreatePartitionPruneState(estate, pruneinfo);
+ estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+ prunestate);
+
+ /*
+ * Perform an initial partition pruning pass, if necessary, and save
+ * the bitmapset of valid subplans for use in
+ * ExecInitPartitionPruning(). If no initial pruning is performed, we
+ * still store a NULL to ensure that es_part_prune_results is the same
+ * length as es_part_prune_infos. This ensures that
+ * ExecInitPartitionPruning() can use the same index to locate the
+ * result.
+ */
+ if (prunestate->do_initial_prune)
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ estate->es_part_prune_results = lappend(estate->es_part_prune_results,
+ validsubplans);
+ }
+}
/* ----------------------------------------------------------------
* InitPlan
@@ -850,7 +899,13 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+
+ /*
+ * Perform runtime "initial" pruning to determine the plan nodes that will
+ * not be executed.
+ */
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ ExecDoInitialPruning(estate);
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index ec730674f2..3c7c631867 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -181,8 +181,6 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
-static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -192,6 +190,9 @@ static void InitPartitionPruneContext(PartitionPruneContext *context,
static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
Bitmapset *initially_valid_subplans,
int n_total_subplans);
+static void PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
+ PartitionPruneState *prunestate,
+ PlanState *planstate);
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
@@ -1783,20 +1784,26 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
/*
* ExecInitPartitionPruning
- * Initialize data structure needed for run-time partition pruning and
- * do initial pruning if needed
+ * Initialize the data structures needed for runtime "exec" partition
+ * pruning and return the result of initial pruning, if available.
*
* 'root_parent_relids' identifies the relation to which both the parent plan
- * and the PartitionPruneInfo given by 'part_prune_index' belong.
+ * and the PartitionPruneInfo associated with 'part_prune_index' belong.
*
- * On return, *initially_valid_subplans is assigned the set of indexes of
- * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * The PartitionPruneState would have been created by ExecDoInitialPruning()
+ * and stored as the part_prune_index'th element of EState.es_part_prune_states.
+ * Here, we initialize only the PartitionPruneContext necessary for execution
+ * pruning.
*
- * If subplans are indeed pruned, subplan_map arrays contained in the returned
- * PartitionPruneState are re-sequenced to not count those, though only if the
- * maps will be needed for subsequent execution pruning passes.
+ * On return, *initially_valid_subplans is assigned the set of indexes of child
+ * subplans that must be initialized alongside the parent plan node. Initial
+ * pruning would have been performed by ExecDoInitialPruning() if necessary,
+ * and the bitmapset of surviving subplans' indexes would have been stored as
+ * the part_prune_index'th element of EState.es_part_prune_results.
+ *
+ * If subplans are pruned, the subplan_map arrays in the returned
+ * PartitionPruneState are re-sequenced to exclude those subplans, but only if
+ * the maps will be needed for subsequent execution pruning passes.
*/
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
@@ -1821,17 +1828,21 @@ ExecInitPartitionPruning(PlanState *planstate,
bmsToString(root_parent_relids),
bmsToString(pruneinfo->root_parent_relids)));
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
-
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
-
/*
- * Perform an initial partition prune pass, if required.
+ * ExecDoInitialPruning() must have initialized the PartitionPruneState to
+ * perform the initial pruning. Now we simply need to initialize the
+ * context information for exec pruning.
*/
+ prunestate = list_nth(estate->es_part_prune_states, part_prune_index);
+ Assert(prunestate != NULL);
+ if (prunestate->do_exec_prune)
+ PartitionPruneInitExecPruning(pruneinfo, prunestate, planstate);
+
+ /* Use the result of initial pruning done by ExecDoInitialPruning(). */
if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ *initially_valid_subplans = list_nth_node(Bitmapset,
+ estate->es_part_prune_results,
+ part_prune_index);
else
{
/* No pruning, so we'll need to initialize all subplans */
@@ -1877,16 +1888,23 @@ ExecInitPartitionPruning(PlanState *planstate,
* stored in each PartitionedRelPruningData can be re-used each time we
* re-evaluate which partitions match the pruning steps provided in each
* PartitionedRelPruneInfo.
+ *
+ * Note that we only initialize the PartitionPruneContext (which is placed into
+ * each PartitionedRelPruningData) for initial pruning here. Execution pruning
+ * requires access to the parent plan node's PlanState, which is not available
+ * when this function is called from ExecDoInitialPruning(), so it is
+ * initialized later during ExecInitPartitionPruning() by calling
+ * PartitionPruneInitExecPruning().
*/
-static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+PartitionPruneState *
+ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
{
- EState *estate = planstate->state;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
+ /* We may need an expression context to evaluate partition exprs */
+ ExprContext *econtext = CreateExprContext(estate);
/* For data reading, executor always includes detached partitions */
if (estate->es_partition_directory == NULL)
@@ -1974,6 +1992,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* set to -1, as if they were pruned. By construction, both
* arrays are in partition bounds order.
*/
+ pprune->partrel = partrel;
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
@@ -2073,29 +2092,31 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
- partdesc, partkey, planstate,
+ partdesc, partkey, NULL,
econtext);
/* Record whether initial pruning is needed at any level */
prunestate->do_initial_prune = true;
}
- pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps &&
- !(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
- {
- InitPartitionPruneContext(&pprune->exec_context,
- pinfo->exec_pruning_steps,
- partdesc, partkey, planstate,
- econtext);
- /* Record whether exec pruning is needed at any level */
- prunestate->do_exec_prune = true;
- }
/*
- * Accumulate the IDs of all PARAM_EXEC Params affecting the
- * partitioning decisions at this plan node.
+ * The exec pruning context will be initialized in
+ * ExecInitPartitionPruning() when called during the initialization
+ * of the parent plan node.
+ *
+ * pprune->exec_pruning_steps is set to NIL to prevent
+ * ExecFindMatchingSubPlans() from accessing an uninitialized
+ * pprune->exec_context during the initial pruning by
+ * ExecDoInitialPruning().
+ *
+ * prunestate->do_exec_prune is set to indicate whether
+ * PartitionPruneInitExecPruning() needs to be called by
+ * ExecInitPartitionPruning(). This optimization avoids
+ * unnecessary cycles when only initial pruning is required.
*/
- prunestate->execparamids = bms_add_members(prunestate->execparamids,
- pinfo->execparamids);
+ pprune->exec_pruning_steps = NIL;
+ if (pinfo->exec_pruning_steps &&
+ !(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
+ prunestate->do_exec_prune = true;
j++;
}
@@ -2305,6 +2326,84 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
pfree(new_subplan_indexes);
}
+/*
+ * PartitionPruneInitExecPruning
+ * Initialize PartitionPruneState for exec pruning.
+ */
+static void
+PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
+ PartitionPruneState *prunestate,
+ PlanState *planstate)
+{
+ EState *estate = planstate->state;
+ int i;
+ ExprContext *econtext;
+
+ /* ExecCreatePartitionPruneState() must have initialized this. */
+ Assert(estate->es_partition_directory != NULL);
+
+ /* ExecCreatePartitionPruneState() must have set this. */
+ Assert(prunestate->do_exec_prune);
+
+ /*
+ * Create ExprContext if not already done for the planstate. We may need
+ * an expression context to evaluate partition exprs.
+ */
+ ExecAssignExprContext(estate, planstate);
+ econtext = planstate->ps_ExprContext;
+ for (i = 0; i < prunestate->num_partprunedata; i++)
+ {
+ List *partrel_pruneinfos =
+ list_nth_node(List, pruneinfo->prune_infos, i);
+ PartitionPruningData *prunedata = prunestate->partprunedata[i];
+ int j;
+
+ for (j = 0; j < prunedata->num_partrelprunedata; j++)
+ {
+ PartitionedRelPruneInfo *pinfo =
+ list_nth_node(PartitionedRelPruneInfo, partrel_pruneinfos, j);
+ PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+ Relation partrel = pprune->partrel;
+ PartitionDesc partdesc;
+ PartitionKey partkey;
+
+ /*
+ * Nothing to do if there are no exec pruning steps, but do set
+ * pprune->exec_pruning_steps, because
+ * find_matching_subplans_recurse() looks at it.
+ *
+ * Also skip if doing EXPLAIN (GENERIC_PLAN), since parameter
+ * values may be missing.
+ */
+ pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
+ if (pprune->exec_pruning_steps == NIL ||
+ (econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
+ continue;
+
+ /*
+ * We can rely on the copies of the partitioned table's partition
+ * key and partition descriptor appearing in its relcache entry,
+ * because that entry will be held open and locked for the
+ * duration of this executor run.
+ */
+ partkey = RelationGetPartitionKey(partrel);
+ partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
+ partrel);
+ InitPartitionPruneContext(&pprune->exec_context,
+ pprune->exec_pruning_steps,
+ partdesc, partkey, planstate,
+ econtext);
+
+ /*
+ * Accumulate the IDs of all PARAM_EXEC Params affecting the
+ * partitioning decisions at this plan node.
+ */
+ prunestate->execparamids = bms_add_members(prunestate->execparamids,
+ pinfo->execparamids);
+ }
+ }
+}
+
/*
* ExecFindMatchingSubPlans
* Determine which subplans match the pruning steps detailed in
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 12aacc84ff..2f45ac1cc8 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -42,6 +42,9 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* PartitionedRelPruneInfo (see plannodes.h); though note that here,
* subpart_map contains indexes into PartitionPruningData.partrelprunedata[].
*
+ * partrel Partitioned table; points to
+ * EState.es_relations[rti-1], where rti is the
+ * table's RT index
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
@@ -58,6 +61,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
*/
typedef struct PartitionedRelPruningData
{
+ Relation partrel;
int nparts;
int *subplan_map;
int *subpart_map;
@@ -128,4 +132,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
+extern PartitionPruneState *ExecCreatePartitionPruneState(EState *estate,
+ PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 22b928e085..518a9fcd15 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -637,6 +637,8 @@ typedef struct EState
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_states; /* List of PartitionPruneState */
+ List *es_part_prune_results; /* List of Bitmapset */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
--
2.43.0
[application/octet-stream] v55-0004-Defer-locking-of-runtime-prunable-relations-to-e.patch (45.1K, 5-v55-0004-Defer-locking-of-runtime-prunable-relations-to-e.patch)
From ad047f0bb7b703c0d2079464622588138e64b117 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 18 Sep 2024 12:00:41 +0900
Subject: [PATCH v55 4/5] Defer locking of runtime-prunable relations to
executor
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When preparing a cached plan for execution, plancache.c locks the
relations in the plan's range table to ensure they are safe for
execution. However, this approach, implemented in
AcquireExecutorLocks(), results in unnecessarily locking relations
that might be pruned during "initial" runtime pruning.
To optimize this, locking is now deferred for relations subject to
"initial" runtime pruning. The planner now provides a set of
"unprunable" relations through the new PlannedStmt.unprunableRelids
field. AcquireExecutorLocks() will only lock these unprunable
relations. PlannedStmt.unprunableRelids is populated by subtracting
the set of initially prunable relids from all RT indexes. The prunable
relids are identified by examining all PartitionPruneInfos during
set_plan_refs() and storing the RT indexes of partitions subject to
"initial" pruning steps. While at it, some duplicated code in
set_append_references() and set_mergeappend_references() that
constructs the prunable relids set has been refactored into a common
function.
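The set arithmetic behind unprunableRelids can be illustrated with a small standalone sketch. It is an illustration under assumptions, not the planner's code: a uint64_t mask replaces Bitmapset, both helper names are hypothetical, and the range table is assumed to have fewer than 63 entries.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build the set of all RT indexes 1..rtable_length as a bitmask.
 * Bit i set means RT index i is a member; RT indexes start at 1,
 * so bit 0 is always left clear.
 */
static uint64_t
toy_all_rt_indexes(int rtable_length)
{
    assert(rtable_length >= 1 && rtable_length < 63);
    return ((UINT64_C(1) << (rtable_length + 1)) - 1) & ~UINT64_C(1);
}

/*
 * Unprunable relids = all RT indexes minus those subject to "initial"
 * pruning.  Only the members of the returned set would be locked up front.
 */
static uint64_t
toy_unprunable_relids(int rtable_length, uint64_t prunable_relids)
{
    return toy_all_rt_indexes(rtable_length) & ~prunable_relids;
}
```

For example, with five range table entries of which RT indexes 3 and 4 are initially prunable partitions, the result contains RT indexes 1, 2 and 5.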
Deferred locks are taken, if necessary, after ExecDoInitialPruning()
determines the set of unpruned partitions. To allow the executor to
determine whether the plan tree it’s executing is cached and may
contain unlocked relations, the CachedPlan is now made available via
the QueryDesc. The executor can call CachedPlanRequiresLocking(),
which returns true if the CachedPlan is a reusable generic plan that
might contain unlocked relations.
Plan nodes like Append have already been updated to consider only the
set of unpruned relations. However, there are cases, such as child
RowMarks and child result relations, where the code manipulating those
does not directly receive information about unpruned partitions.
Therefore, code handling child RowMarks and result relations has been
modified to ensure they don’t belong to pruned partitions. For this,
the RT indexes of unpruned partitions are added in
ExecDoInitialPruning() to es_unprunable_relids, which initially
contains PlannedStmt.unprunableRelids. The corresponding code now
processes only those child RowMarks and result relations whose owning
relations are in this set. For result relations managed by a
ModifyTable node, its resultRelations list is truncated in
ExecInitModifyTable to only consider unpruned relations and the
ResultRelInfo structs are created only for those.
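The filtering of child RowMarks described above can be sketched as follows. This is a simplified model with hypothetical names (ToyRowMark, toy_relid_is_member, toy_keep_rowmarks); a uint64_t mask stands in for es_unprunable_relids, which here is assumed to already include the RT indexes of partitions that survived initial pruning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PlanRowMark: only the fields the filter needs. */
typedef struct ToyRowMark
{
    int         rti;        /* RT index of the relation the mark belongs to */
    bool        is_parent;  /* "parent" marks are irrelevant at runtime */
} ToyRowMark;

/* Membership test on a bitmask-based relid set (RT indexes 1..63). */
static bool
toy_relid_is_member(uint64_t relids, int rti)
{
    assert(rti >= 1 && rti < 64);
    return (relids & (UINT64_C(1) << rti)) != 0;
}

/*
 * Copy the RT indexes of rowmarks worth processing into kept_rtis and
 * return how many were kept.  Parent marks and marks whose relation is
 * not in unprunable_relids (that is, was pruned) are skipped.
 */
static int
toy_keep_rowmarks(const ToyRowMark *marks, int nmarks,
                  uint64_t unprunable_relids, int *kept_rtis)
{
    int         nkept = 0;

    for (int i = 0; i < nmarks; i++)
    {
        if (marks[i].is_parent ||
            !toy_relid_is_member(unprunable_relids, marks[i].rti))
            continue;
        kept_rtis[nkept++] = marks[i].rti;
    }
    return nkept;
}
```

The same membership test would apply to child result relations; the sketch covers only RowMarks to stay short.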
Finally, an Assert has also been added in ExecCheckPermissions() to
ensure that all relations whose permissions are checked have been
properly locked, helping to catch any accidental omission of relations
from the unprunableRelids set that should have their permissions
checked.
This deferment introduces a window where prunable relations may be
altered by concurrent DDL, potentially causing the plan to become
invalid. Consequently, the executor might attempt to execute an
invalid plan, leading to errors such as ExecInitIndexScan() failing to
locate an unpruned partition's index that has been dropped concurrently
(for example, a partition-local index rather than an inherited one).
Future commits will introduce changes to enable the
executor to check plan validity during ExecutorStart() and retry with
a newly created plan if the original becomes invalid after taking
deferred locks.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +--
src/backend/commands/extension.c | 1 +
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 3 +-
src/backend/executor/execMain.c | 75 ++++++++++++++++++++++++--
src/backend/executor/execParallel.c | 9 +++-
src/backend/executor/execPartition.c | 36 ++++++++++---
src/backend/executor/functions.c | 1 +
src/backend/executor/nodeAppend.c | 8 +--
src/backend/executor/nodeLockRows.c | 10 +++-
src/backend/executor/nodeMergeAppend.c | 2 +-
src/backend/executor/nodeModifyTable.c | 38 ++++++++++---
src/backend/executor/spi.c | 1 +
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 7 +++
src/backend/partitioning/partprune.c | 18 +++++++
src/backend/tcop/pquery.c | 10 +++-
src/backend/utils/cache/plancache.c | 40 ++++++++------
src/include/commands/explain.h | 5 +-
src/include/executor/execPartition.h | 5 +-
src/include/executor/execdesc.h | 2 +
src/include/nodes/execnodes.h | 6 +++
src/include/nodes/pathnodes.h | 6 +++
src/include/nodes/plannodes.h | 7 +++
src/include/utils/plancache.h | 10 ++++
27 files changed, 263 insertions(+), 52 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 91de442f43..db976f928a 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -552,7 +552,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 0b629b1f79..57a3375cad 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -324,7 +324,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index aaec439892..49f7370734 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -509,7 +509,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -617,7 +617,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage,
@@ -673,7 +674,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, cplan, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index fab59ad5f6..bd169edeff 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -742,6 +742,7 @@ execute_sql_string(const char *sql)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 010097873d..69be74b4bd 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -438,7 +438,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 07257d4db9..311b9ebd5b 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -655,7 +655,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
+ ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 1994112b2e..df1b5b2dc3 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -53,6 +53,7 @@
#include "miscadmin.h"
#include "parser/parse_relation.h"
#include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
#include "tcop/utility.h"
#include "utils/acl.h"
#include "utils/backend_status.h"
@@ -90,6 +91,7 @@ static bool ExecCheckPermissionsModified(Oid relOid, Oid userid,
AclMode requiredPerms);
static void ExecCheckXactReadOnly(PlannedStmt *plannedstmt);
static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static inline bool ExecShouldLockRelations(EState *estate);
/* end of local decls */
@@ -600,6 +602,21 @@ ExecCheckPermissions(List *rangeTable, List *rteperminfos,
(rte->rtekind == RTE_SUBQUERY &&
rte->relkind == RELKIND_VIEW));
+ /*
+ * Ensure that we have at least an AccessShareLock on relations
+ * whose permissions need to be checked.
+ *
+ * Skip this check in a parallel worker because locks won't be
+ * taken until ExecInitNode() performs plan initialization.
+ *
+ * XXX: ExecCheckPermissions() in a parallel worker may be
+ * redundant with the checks done in the leader process, so this
+ * should be reviewed to ensure it’s necessary.
+ */
+ Assert(IsParallelWorker() ||
+ CheckRelationOidLockedByMe(rte->relid, AccessShareLock,
+ true));
+
(void) getRTEPermissionInfo(rteperminfos, rte);
/* Many-to-one mapping not allowed */
Assert(!bms_is_member(rte->perminfoindex, indexset));
@@ -862,12 +879,46 @@ ExecDoInitialPruning(EState *estate)
* result.
*/
if (prunestate->do_initial_prune)
- validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ {
+ Bitmapset *validsubplan_rtis = NULL;
+
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+ &validsubplan_rtis);
+ if (ExecShouldLockRelations(estate))
+ {
+ int rtindex = -1;
+
+ rtindex = -1;
+ while ((rtindex = bms_next_member(validsubplan_rtis,
+ rtindex)) >= 0)
+ {
+ RangeTblEntry *rte = exec_rt_fetch(rtindex, estate);
+
+ Assert(rte->rtekind == RTE_RELATION &&
+ rte->rellockmode != NoLock);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+ estate->es_unprunable_relids = bms_add_members(estate->es_unprunable_relids,
+ validsubplan_rtis);
+ }
+
estate->es_part_prune_results = lappend(estate->es_part_prune_results,
validsubplans);
}
}
+/*
+ * Locks might be needed only if running a cached plan that might contain
+ * unlocked relations, such as reused generic plans.
+ */
+static inline bool
+ExecShouldLockRelations(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? false :
+ CachedPlanRequiresLocking(estate->es_cachedplan);
+}
+
/* ----------------------------------------------------------------
* InitPlan
*
@@ -880,6 +931,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ CachedPlan *cachedplan = queryDesc->cplan;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -899,10 +951,13 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_cachedplan = cachedplan;
+ estate->es_unprunable_relids = bms_copy(plannedstmt->unprunableRelids);
/*
* Perform runtime "initial" pruning to determine the plan nodes that will
- * not be executed.
+ * not be executed. This will also add the RT indexes of surviving leaf
+ * partitions to es_unprunable_relids.
*/
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
ExecDoInitialPruning(estate);
@@ -921,8 +976,13 @@ InitPlan(QueryDesc *queryDesc, int eflags)
Relation relation;
ExecRowMark *erm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unprunable_relids))
continue;
/* get relation's OID (will produce InvalidOid if subquery) */
@@ -2959,6 +3019,13 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
}
}
+ /*
+ * Copy es_unprunable_relids so that RowMarks of pruned relations are
+ * ignored in ExecInitLockRows() and ExecInitModifyTable() when
+ * initializing the plan trees below.
+ */
+ rcestate->es_unprunable_relids = parentestate->es_unprunable_relids;
+
/*
* Initialize private state information for each SubPlan. We must do this
* before running ExecInitNode on the main query tree, since
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index b01a2fdfdd..7519c9a860 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1257,8 +1257,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
- /* Create a QueryDesc for the query. */
+ /*
+ * Create a QueryDesc for the query. We pass NULL for cachedplan, because
+ * we don't have a pointer to the CachedPlan in the leader's process. It's
+ * fine because the only reason the executor needs to see it is to decide
+ * if it should take locks on certain relations, but parallel workers
+ * always take locks anyway.
+ */
return CreateQueryDesc(pstmt,
+ NULL,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d9fa593785..551e0ce9b2 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -26,6 +26,7 @@
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
#include "rewrite/rewriteManip.h"
+#include "storage/lmgr.h"
#include "utils/acl.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
@@ -194,7 +195,8 @@ static void find_matching_subplans_recurse(PlanState *parent_plan,
PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **validsubplan_rtis);
/*
@@ -1978,8 +1980,8 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
* The set of partitions that exist now might not be the same that
* existed when the plan was made. The normal case is that it is;
* optimize for that case with a quick comparison, and just copy
- * the subplan_map and make subpart_map point to the one in
- * PruneInfo.
+ * the subplan_map and make subpart_map, rti_map point to the
+ * ones in PruneInfo.
*
* For the case where they aren't identical, we could have more
* partitions on either side; or even exactly the same number of
@@ -1999,6 +2001,7 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
sizeof(int) * partdesc->nparts) == 0)
{
pprune->subpart_map = pinfo->subpart_map;
+ pprune->rti_map = pinfo->rti_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
}
@@ -2019,6 +2022,7 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
* mismatches.
*/
pprune->subpart_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(int) * partdesc->nparts);
for (pp_idx = 0; pp_idx < partdesc->nparts; pp_idx++)
{
@@ -2036,6 +2040,8 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
continue;
}
@@ -2073,6 +2079,7 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map[pp_idx] = -1;
pprune->subplan_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -2339,10 +2346,13 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * validsubplan_rtis must be non-NULL if initial_prune is true.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **validsubplan_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2378,7 +2388,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunestate->parent_plan,
prunedata, pprune, initial_prune,
- &result);
+ &result, validsubplan_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_context.is_valid)
@@ -2395,6 +2405,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (validsubplan_rtis)
+ *validsubplan_rtis = bms_copy(*validsubplan_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2405,14 +2417,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and the RT indexes
+ * of their owning leaf partitions to *validsubplan_rtis if it's non-NULL.
*/
static void
find_matching_subplans_recurse(PlanState *parent_plan,
PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **validsubplan_rtis)
{
Bitmapset *partset;
int i;
@@ -2464,8 +2478,13 @@ find_matching_subplans_recurse(PlanState *parent_plan,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ if (validsubplan_rtis)
+ *validsubplan_rtis = bms_add_member(*validsubplan_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2474,7 +2493,8 @@ find_matching_subplans_recurse(PlanState *parent_plan,
find_matching_subplans_recurse(parent_plan,
prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ validsubplan_rtis);
else
{
/*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 692854e2b3..6f6f45e0ad 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -840,6 +840,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
dest = None_Receiver;
es->qd = CreateQueryDesc(es->stmt,
+ NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index de7ebab5c2..006bdafaea 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -581,7 +581,7 @@ choose_next_subplan_locally(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
}
@@ -648,7 +648,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
/*
@@ -724,7 +724,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
mark_invalid_subplans_as_finished(node);
@@ -877,7 +877,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
classify_matching_subplans(node);
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 41754ddfea..b5b2cd53c5 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -28,6 +28,7 @@
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "utils/rel.h"
+#include "utils/lsyscache.h"
/* ----------------------------------------------------------------
@@ -347,8 +348,13 @@ ExecInitLockRows(LockRows *node, EState *estate, int eflags)
ExecRowMark *erm;
ExecAuxRowMark *aerm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unprunable_relids))
continue;
/* find ExecRowMark and build ExecAuxRowMark */
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 3ed91808dd..f7821aa178 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -219,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 8bf4c80d4a..3c02782445 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4176,12 +4176,17 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
hash_search(node->mt_resultOidHash, &resultoid, HASH_FIND, NULL);
if (mtlookup)
{
+ ResultRelInfo *resultRelInfo;
+
if (update_cache)
{
node->mt_lastResultOid = resultoid;
node->mt_lastResultIndex = mtlookup->relationIndex;
}
- return node->resultRelInfo + mtlookup->relationIndex;
+
+ resultRelInfo = node->resultRelInfo + mtlookup->relationIndex;
+
+ return resultRelInfo;
}
}
else
@@ -4218,7 +4223,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
ModifyTableState *mtstate;
Plan *subplan = outerPlan(node);
CmdType operation = node->operation;
- int nrels = list_length(node->resultRelations);
+ int nrels;
+ List *resultRelations = NIL;
ResultRelInfo *resultRelInfo;
List *arowmarks;
ListCell *l;
@@ -4228,6 +4234,20 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
/* check for unsupported flags */
Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
+ /*
+ * Only consider unpruned relations. In the future, it might be more
+ * efficient to store resultRelations as a bitmapset, which would make
+ * this operation cheaper.
+ */
+ foreach(l, node->resultRelations)
+ {
+ Index rti = lfirst_int(l);
+
+ if (bms_is_member(rti, estate->es_unprunable_relids))
+ resultRelations = lappend_int(resultRelations, rti);
+ }
+ nrels = list_length(resultRelations);
+
/*
* create state structure
*/
@@ -4265,6 +4285,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
*/
if (node->rootRelation > 0)
{
+ Assert(bms_is_member(node->rootRelation, estate->es_unprunable_relids));
mtstate->rootResultRelInfo = makeNode(ResultRelInfo);
ExecInitResultRelation(estate, mtstate->rootResultRelInfo,
node->rootRelation);
@@ -4279,7 +4300,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
/* set up epqstate with dummy subplan data for the moment */
EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL,
- node->epqParam, node->resultRelations);
+ node->epqParam, resultRelations);
mtstate->fireBSTriggers = true;
/*
@@ -4297,7 +4318,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
*/
resultRelInfo = mtstate->resultRelInfo;
i = 0;
- foreach(l, node->resultRelations)
+ foreach(l, resultRelations)
{
Index resultRelation = lfirst_int(l);
List *mergeActions = NIL;
@@ -4589,8 +4610,13 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
ExecRowMark *erm;
ExecAuxRowMark *aerm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unprunable_relids))
continue;
/* Find ExecRowMark and build ExecAuxRowMark */
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 90d9834576..659bd6dcd9 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2684,6 +2684,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
snap = InvalidSnapshot;
qdesc = CreateQueryDesc(stmt,
+ cplan,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 1b9071c774..9e47a7fd50 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -549,6 +549,8 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
+ result->unprunableRelids = bms_difference(bms_add_range(NULL, 1, list_length(result->rtable)),
+ glob->prunableRelids);
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index e2ea406c4e..283a61a972 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1764,8 +1764,15 @@ register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+ int i;
prelinfo->rtindex += rtoffset;
+ for (i = 0; i < prelinfo->nparts; i++)
+ {
+ prelinfo->rti_map[i] += rtoffset;
+ glob->prunableRelids = bms_add_member(glob->prunableRelids,
+ prelinfo->rti_map[i]);
+ }
}
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 60fabb1734..85894c87af 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -645,6 +645,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ int *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -657,6 +658,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (int *) palloc0(nparts * sizeof(int));
present_parts = NULL;
i = -1;
@@ -671,9 +673,24 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+
+ /*
+ * Track the RT indexes of partitions to ensure they are included
+ * in the prunableRelids set of relations that are locked during
+ * execution. This ensures that if the plan is cached, these
+ * partitions are locked when the plan is reused.
+ *
+ * Partitions without a subplan and sub-partitioned partitions
+ * where none of the sub-partitions have a subplan due to
+ * constraint exclusion are not included in this set. Instead,
+ * they are added to the unprunableRelids set, and the relations
+ * in this set are locked by AcquireExecutorLocks() before
+ * executing a cached plan.
+ */
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
+ rti_map[i] = (int) partrel->relid;
/* Record finding this subplan */
subplansfound = bms_add_member(subplansfound, subplanidx);
@@ -695,6 +712,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index a1f8d03db1..6e8f6b1b8f 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -36,6 +36,7 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +66,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +79,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * cplan: CachedPlan supplying the plan
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, cplan, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -493,6 +498,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->cplan,
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1276,6 +1282,7 @@ PortalRunMulti(Portal portal,
{
/* statement can set tag string */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1285,6 +1292,7 @@ PortalRunMulti(Portal portal,
{
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5af1a168ec..5b75dadf13 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -104,7 +104,8 @@ static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
static bool CheckCachedPlan(CachedPlanSource *plansource);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
@@ -815,8 +816,11 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
* Caller must have already called RevalidateCachedQuery to verify that the
* querytree is up to date.
*
- * On a "true" return, we have acquired the locks needed to run the plan.
- * (We must do this for the "true" result to be race-condition-free.)
+ * On a "true" return, we have acquired locks on the "unprunableRelids" set
+ * for all plans in plansource->stmt_list. The plans are not completely
+ * race-condition-free until the executor takes locks on the set of prunable
+ * relations that survive initial runtime pruning during executor
+ * initialization.
*/
static bool
CheckCachedPlan(CachedPlanSource *plansource)
@@ -893,10 +897,10 @@ CheckCachedPlan(CachedPlanSource *plansource)
* or it can be set to NIL if we need to re-copy the plansource's query_list.
*
* To build a generic, parameter-value-independent plan, pass NULL for
- * boundParams. To build a custom plan, pass the actual parameter values via
- * boundParams. For best effect, the PARAM_FLAG_CONST flag should be set on
- * each parameter value; otherwise the planner will treat the value as a
- * hint rather than a hard constant.
+ * boundParams, and true for generic. To build a custom plan, pass the actual
+ * parameter values via boundParams, and false for generic. For best effect,
+ * the PARAM_FLAG_CONST flag should be set on each parameter value; otherwise
+ * the planner will treat the value as a hint rather than a hard constant.
*
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
@@ -904,7 +908,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic)
{
CachedPlan *plan;
List *plist;
@@ -1026,6 +1031,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->refcount = 0;
plan->context = plan_context;
plan->is_oneshot = plansource->is_oneshot;
+ plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
@@ -1196,7 +1202,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv, true);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1241,7 +1247,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv, false);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1387,8 +1393,8 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
}
/*
- * Reject if AcquireExecutorLocks would have anything to do. This is
- * probably unnecessary given the previous check, but let's be safe.
+ * Reject if there are any lockable relations. This is probably
+ * unnecessary given the previous check, but let's be safe.
*/
foreach(lc, plan->stmt_list)
{
@@ -1776,7 +1782,7 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ int rtindex;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1794,9 +1800,13 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ rtindex = -1;
+ while ((rtindex = bms_next_member(plannedstmt->unprunableRelids,
+ rtindex)) >= 0)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ RangeTblEntry *rte = list_nth_node(RangeTblEntry,
+ plannedstmt->rtable,
+ rtindex - 1);
if (!(rte->rtekind == RTE_RELATION ||
(rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid))))
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 3ab0aae78f..21c71e0d53 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -103,8 +103,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
- ExplainState *es, const char *queryString,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
+ const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
const BufferUsage *bufusage,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index ef6d8b2d48..7f2592e3b0 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -48,6 +48,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map RT index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -65,6 +66,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ int *rti_map pg_node_attr(array_size(nparts));
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -132,7 +134,8 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **validsubplan_rtis);
extern PartitionPruneState *ExecCreatePartitionPruneState(EState *estate,
PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0a7274e26c..0e7245435d 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 518a9fcd15..57170818c0 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -42,6 +42,7 @@
#include "storage/condition_variable.h"
#include "utils/hsearch.h"
#include "utils/queryenvironment.h"
+#include "utils/plancache.h"
#include "utils/reltrigger.h"
#include "utils/sharedtuplestore.h"
#include "utils/snapshot.h"
@@ -636,9 +637,14 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ CachedPlan *es_cachedplan;
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
List *es_part_prune_states; /* List of PartitionPruneState */
List *es_part_prune_results; /* List of Bitmapset */
+ Bitmapset *es_unprunable_relids; /* PlannedStmt.unprunableRelids + RT
+ * indexes of leaf partitions that
+ * survive initial pruning; see
+ * ExecDoInitialPruning() */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 8d30b6e896..cc2190ea63 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -116,6 +116,12 @@ typedef struct PlannerGlobal
/* "flat" rangetable for executor */
List *finalrtable;
+ /*
+ * RT indexes of relations subject to removal from the plan due to runtime
+ * pruning at plan initialization time
+ */
+ Bitmapset *prunableRelids;
+
/* "flat" list of RTEPermissionInfos */
List *finalrteperminfos;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 39d0281c23..318e30fe2f 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -74,6 +74,10 @@ typedef struct PlannedStmt
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *unprunableRelids; /* RT indexes of relations that are not
+ * subject to runtime pruning; for
+ * AcquireExecutorLocks() */
+
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
* entries needing one */
@@ -1474,6 +1478,9 @@ typedef struct PartitionedRelPruneInfo
/* subpart index by partition index, or -1 */
int *subpart_map pg_node_attr(array_size(nparts));
+ /* RT index by partition index, or 0 */
+ int *rti_map pg_node_attr(array_size(nparts));
+
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index a90dfdf906..0b5ee007ca 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -149,6 +149,7 @@ typedef struct CachedPlan
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
bool is_oneshot; /* is it a "oneshot" plan? */
+ bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
Oid planRoleId; /* Role ID the plan was created for */
@@ -235,4 +236,13 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
extern CachedExpression *GetCachedExpression(Node *expr);
extern void FreeCachedExpression(CachedExpression *cexpr);
+/*
+ * CachedPlanRequiresLocking: should the executor acquire locks?
+ */
+static inline bool
+CachedPlanRequiresLocking(CachedPlan *cplan)
+{
+ return cplan->is_generic;
+}
+
#endif /* PLANCACHE_H */
--
2.43.0
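
The relid bookkeeping in the patch above can be summarized with a small standalone sketch. It is not PostgreSQL code: RelidSet is a hypothetical 64-bit stand-in for Bitmapset, and every function name below is invented for illustration. It assumes, as the patch comments describe, that the planner computes unprunableRelids as "all RT indexes minus the prunable ones", that initial pruning adds the RT indexes of surviving leaf partitions to the executor's copy, and that result relations (and rowmarks) outside that set are skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

static RelidSet
relids_add(RelidSet set, int rti)
{
	assert(rti > 0 && rti < 64);	/* RT indexes are 1-based */
	return set | (UINT64_C(1) << rti);
}

static bool
relids_member(RelidSet set, int rti)
{
	return rti > 0 && rti < 64 && ((set >> rti) & 1) != 0;
}

/* Plan time: every range table entry that is not subject to initial pruning. */
static RelidSet
compute_unprunable(int rtable_len, RelidSet prunable)
{
	RelidSet	all = 0;

	for (int rti = 1; rti <= rtable_len; rti++)
		all = relids_add(all, rti);
	return all & ~prunable;
}

/* Executor startup: add the leaf partitions that survived initial pruning. */
static RelidSet
add_pruning_survivors(RelidSet unprunable, const int *survivors, int nsurvivors)
{
	RelidSet	live = unprunable;

	for (int i = 0; i < nsurvivors; i++)
		live = relids_add(live, survivors[i]);
	return live;
}

/* Keep only result relations still present in the set; returns how many. */
static int
filter_result_rels(const int *rels, int nrels, RelidSet live, int *kept)
{
	int			nkept = 0;

	for (int i = 0; i < nrels; i++)
	{
		if (relids_member(live, rels[i]))
			kept[nkept++] = rels[i];
	}
	return nkept;
}
```

With a four-entry range table where RT indexes 2 to 4 are prunable partitions and only partition 3 survives initial pruning, the filtered result relation list contains just RT index 3, while RT index 1 stays in the set throughout.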
[application/octet-stream] v55-0005-Handle-CachedPlan-invalidation-in-the-executor.patch (58.0K, 6-v55-0005-Handle-CachedPlan-invalidation-in-the-executor.patch)
From 24eea4f10fa7129bc6284a7317d413bed2b177b5 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 22 Aug 2024 19:38:13 +0900
Subject: [PATCH v55 5/5] Handle CachedPlan invalidation in the executor
This commit makes changes to handle cases where a cached plan
becomes invalid before deferred locks on prunable relations are taken.
* Add checks at various points in ExecutorStart() and its called
functions to determine if the plan becomes invalid. If detected,
the function and its callers return immediately. A previous commit
ensures any partially initialized PlanState tree objects are cleaned
up appropriately.
* Introduce ExecutorStartExt(), a wrapper over ExecutorStart(), to
handle cases where plan initialization is aborted due to invalidation.
ExecutorStartExt() creates a new transient CachedPlan if needed and
retries execution. This new entry point is only required for sites
using plancache.c. It requires passing the QueryDesc, eflags,
CachedPlanSource, and query_index (index in CachedPlanSource.query_list).
* Add GetSingleCachedPlan() in plancache.c to create a transient
CachedPlan for a specified query in the given CachedPlanSource.
Such CachedPlans are tracked in a separate global list for the
plancache invalidation callbacks to check.
This also adds isolation tests using the delay_execution test module
to verify scenarios where a CachedPlan becomes invalid before the
deferred locks are taken.
All ExecutorStart_hook implementations now must add the following
block after the ExecutorStart() call to ensure it doesn't work with an
invalid plan:
/* The plan may have become invalid during ExecutorStart() */
if (!ExecPlanStillValid(queryDesc->estate))
return;
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
---
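
The control flow described in the commit message (a hook that bails out when the plan went stale, and a wrapper that replans and retries) can be sketched in isolation as follows. This is a toy model, not the patch's implementation: MockPlan, MockQueryDesc and all mock_* functions are hypothetical names, and "replanning" is reduced to bumping a counter. It only assumes what the message states, namely that plan initialization may be aborted by invalidation and that hooks must check validity before doing their own work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a cached plan and a query descriptor. */
typedef struct MockPlan
{
	bool		is_valid;		/* cleared when an invalidation arrives */
	int			generation;		/* bumped each time we "replan" */
} MockPlan;

typedef struct MockQueryDesc
{
	MockPlan   *plan;
	bool		started;		/* did executor startup complete? */
	int			hook_work_done; /* how often the hook ran its own logic */
} MockQueryDesc;

/* Number of simulated invalidations that will hit during startup. */
static int	invalidations_pending = 0;

/* Stand-in for standard startup: taking deferred locks may invalidate. */
static void
mock_standard_start(MockQueryDesc *qd)
{
	assert(qd != NULL && qd->plan != NULL);
	if (invalidations_pending > 0)
	{
		invalidations_pending--;
		qd->plan->is_valid = false;
		qd->started = false;
		return;
	}
	qd->started = true;
}

static bool
mock_plan_still_valid(const MockQueryDesc *qd)
{
	return qd->plan->is_valid;
}

/* A hook in the style the commit message asks for. */
static void
mock_hook_start(MockQueryDesc *qd)
{
	mock_standard_start(qd);

	/* The plan may have become invalid during startup; do nothing more. */
	if (!mock_plan_still_valid(qd))
		return;

	qd->hook_work_done++;
}

/* Wrapper in the spirit of a retrying entry point; returns attempt count. */
static int
mock_start_with_retry(MockQueryDesc *qd)
{
	int			attempts = 1;

	mock_hook_start(qd);
	while (!mock_plan_still_valid(qd))
	{
		/* Simulated replanning: produce a fresh, valid plan and retry. */
		qd->plan->generation++;
		qd->plan->is_valid = true;
		attempts++;
		mock_hook_start(qd);
	}
	return attempts;
}
```

In this model, a single pending invalidation makes the first attempt return early without running the hook's own logic, and the second attempt succeeds against the regenerated plan.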
contrib/auto_explain/auto_explain.c | 4 +
.../pg_stat_statements/pg_stat_statements.c | 4 +
src/backend/commands/explain.c | 8 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 10 +-
src/backend/commands/trigger.c | 14 ++
src/backend/executor/README | 35 ++-
src/backend/executor/execMain.c | 84 ++++++-
src/backend/executor/execUtils.c | 3 +-
src/backend/executor/spi.c | 19 +-
src/backend/tcop/postgres.c | 4 +-
src/backend/tcop/pquery.c | 31 ++-
src/backend/utils/cache/plancache.c | 206 ++++++++++++++++
src/backend/utils/mmgr/portalmem.c | 4 +-
src/include/commands/explain.h | 1 +
src/include/commands/trigger.h | 1 +
src/include/executor/execdesc.h | 1 +
src/include/executor/executor.h | 17 ++
src/include/nodes/execnodes.h | 1 +
src/include/utils/plancache.h | 26 ++
src/include/utils/portal.h | 4 +-
src/test/modules/delay_execution/Makefile | 3 +-
.../modules/delay_execution/delay_execution.c | 63 ++++-
.../expected/cached-plan-inval.out | 230 ++++++++++++++++++
src/test/modules/delay_execution/meson.build | 1 +
.../specs/cached-plan-inval.spec | 75 ++++++
26 files changed, 814 insertions(+), 36 deletions(-)
create mode 100644 src/test/modules/delay_execution/expected/cached-plan-inval.out
create mode 100644 src/test/modules/delay_execution/specs/cached-plan-inval.spec
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 677c135f59..9eb5e9a619 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -300,6 +300,10 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
if (auto_explain_enabled())
{
/*
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 3c72e437f7..76642b557a 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -985,6 +985,10 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
/*
* If query has queryId zero, don't track it. This prevents double
* counting of optimizable statements that are directly contained in
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 49f7370734..b7a0b8c05b 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -509,7 +509,8 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, NULL, -1, into, es, queryString, params,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -618,6 +619,7 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
*/
void
ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int query_index,
IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
@@ -688,8 +690,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
if (into)
eflags |= GetIntoRelEFlags(into);
- /* call ExecutorStart to prepare the plan for execution */
- ExecutorStart(queryDesc, eflags);
+ /* Call ExecutorStartExt to prepare the plan for execution. */
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
/* Execute the plan for statistics if asked for */
if (es->analyze)
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 4f6acf6719..4b1503c05e 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ NULL,
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 311b9ebd5b..4cd79a6e3a 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -202,7 +202,8 @@ ExecuteQuery(ParseState *pstate,
query_string,
entry->plansource->commandTag,
plan_list,
- cplan);
+ cplan,
+ entry->plansource);
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
@@ -583,6 +584,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
MemoryContextCounters mem_counters;
MemoryContext planner_ctx = NULL;
MemoryContext saved_ctx = NULL;
+ int i = 0;
if (es->memory)
{
@@ -655,8 +657,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
- queryEnv,
+ ExplainOnePlan(pstmt, cplan, entry->plansource, i,
+ into, es, query_string, paramLI, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
@@ -668,6 +670,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Separate plans with an appropriate separator */
if (lnext(plan_list, p) != NULL)
ExplainSeparatePlans(es);
+
+ i++;
}
if (estate)
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 29d30bfb6f..e33b8f573b 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -5120,6 +5120,20 @@ AfterTriggerEndQuery(EState *estate)
afterTriggers.query_depth--;
}
+/* ----------
+ * AfterTriggerAbortQuery()
+ *
+ * Called by ExecutorEnd() if the query execution was aborted due to the
+ * plan becoming invalid during initialization.
+ * ----------
+ */
+void
+AfterTriggerAbortQuery(void)
+{
+ /* Revert the actions of AfterTriggerBeginQuery(). */
+ afterTriggers.query_depth--;
+}
+
/*
* AfterTriggerFreeQuery
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 642d63be61..c76a00b394 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -280,6 +280,28 @@ are typically reset to empty once per tuple. Per-tuple contexts are usually
associated with ExprContexts, and commonly each PlanState node has its own
ExprContext to evaluate its qual and targetlist expressions in.
+Relation Locking
+----------------
+
+Typically, when the executor initializes a plan tree for execution, it doesn't
+lock non-index relations if the plan tree is freshly generated and not derived
+from a CachedPlan. This is because such locks have already been established
+during the query's parsing, rewriting, and planning phases. However, with a
+cached plan tree, some relations may remain unlocked. The function
+AcquireExecutorLocks() only locks unprunable relations in the plan, deferring
+the locking of prunable ones to executor initialization. This avoids
+unnecessary locking of relations that will be pruned during "initial" runtime
+pruning in ExecDoInitialPruning().
+
+This approach creates a window where a cached plan tree with child tables
+could become outdated if another backend modifies these tables before
+ExecDoInitialPruning() locks them. As a result, the executor has the added duty
+to verify the plan tree's validity whenever it locks a child table after
+doing initial pruning. This validation is done by checking the CachedPlan.is_valid
+attribute. If the plan tree is outdated (is_valid=false), the executor halts
+further initialization, cleans up anything in EState that would have been
+allocated up to that point, and retries execution after recreating the
+invalid plan in the CachedPlan.
Query Processing Control Flow
-----------------------------
@@ -288,11 +310,13 @@ This is a sketch of control flow for full query processing:
CreateQueryDesc
- ExecutorStart
+ ExecutorStart or ExecutorStartExt
CreateExecutorState
creates per-query context
- switch to per-query context to run ExecInitNode
+ switch to per-query context to run ExecDoInitialPruning and ExecInitNode
AfterTriggerBeginQuery
+ ExecDoInitialPruning
+ does initial pruning and locks surviving partitions if needed
ExecInitNode --- recursively scans plan tree
ExecInitNode
recurse into subsidiary nodes
@@ -316,7 +340,12 @@ This is a sketch of control flow for full query processing:
FreeQueryDesc
-Per above comments, it's not really critical for ExecEndNode to free any
+As mentioned in the "Relation Locking" section, if the plan tree is found to
+be stale after locking partitions in ExecDoInitialPruning(), the control is
+immediately returned to ExecutorStartExt(), which will create a new plan tree
+and perform the steps starting from CreateExecutorState() again.
+
+Per above comments, it's not really critical for ExecEndPlan to free any
memory; it'll all go away in FreeExecutorState anyway. However, we do need to
be careful to close relations, drop buffer pins, etc, so we do need to scan
the plan state tree to find these sorts of resources.
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index df1b5b2dc3..df117e9477 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -59,6 +59,7 @@
#include "utils/backend_status.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
+#include "utils/plancache.h"
#include "utils/rls.h"
#include "utils/snapmgr.h"
@@ -137,6 +138,60 @@ ExecutorStart(QueryDesc *queryDesc, int eflags)
standard_ExecutorStart(queryDesc, eflags);
}
+/*
+ * A variant of ExecutorStart() that handles cleanup and replanning if the
+ * input CachedPlan becomes invalid due to locks being taken during
+ * ExecutorStartInternal(). If that happens, a new CachedPlan is created
+ * only for the query at the index 'query_index' in plansource->query_list, which
+ * is released separately from the original CachedPlan.
+ */
+void
+ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource,
+ int query_index)
+{
+ if (queryDesc->cplan == NULL)
+ {
+ ExecutorStart(queryDesc, eflags);
+ return;
+ }
+
+ while (1)
+ {
+ ExecutorStart(queryDesc, eflags);
+ if (!CachedPlanValid(queryDesc->cplan))
+ {
+ CachedPlan *cplan;
+
+ /*
+ * The plan got invalidated, so try with a new updated plan.
+ *
+ * But first undo what ExecutorStart() would've done. Mark
+ * execution as aborted to ensure that AFTER trigger state is
+ * properly reset.
+ */
+ queryDesc->estate->es_aborted = true;
+ ExecutorEnd(queryDesc);
+
+ cplan = GetSingleCachedPlan(plansource, query_index,
+ queryDesc->queryEnv);
+
+ /*
+ * Install the new transient cplan into the QueryDesc replacing
+ * the old one so that executor initialization code can see it.
+ * Mark it as in use by us and ask FreeQueryDesc() to release it.
+ */
+ cplan->refcount = 1;
+ queryDesc->cplan = cplan;
+ queryDesc->cplan_release = true;
+ queryDesc->plannedstmt = linitial_node(PlannedStmt,
+ queryDesc->cplan->stmt_list);
+ }
+ else
+ break; /* ExecutorStart() succeeded! */
+ }
+}
+
void
standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
@@ -320,6 +375,7 @@ standard_ExecutorRun(QueryDesc *queryDesc,
estate = queryDesc->estate;
Assert(estate != NULL);
+ Assert(!estate->es_aborted);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/* caller must ensure the query's snapshot is active */
@@ -426,8 +482,11 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
Assert(estate != NULL);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
- /* This should be run once and only once per Executor instance */
- Assert(!estate->es_finished);
+ /*
+ * This should be run once and only once per Executor instance and never
+ * if the execution was aborted.
+ */
+ Assert(!estate->es_finished && !estate->es_aborted);
/* Switch into per-query memory context */
oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -486,11 +545,10 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
Assert(estate != NULL);
/*
- * Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This
- * Assert is needed because ExecutorFinish is new as of 9.1, and callers
- * might forget to call it.
+ * Check that ExecutorFinish was called, unless in EXPLAIN-only mode or if
+ * execution was aborted.
*/
- Assert(estate->es_finished ||
+ Assert(estate->es_finished || estate->es_aborted ||
(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/*
@@ -504,6 +562,14 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
UnregisterSnapshot(estate->es_snapshot);
UnregisterSnapshot(estate->es_crosscheck_snapshot);
+ /*
+ * Reset AFTER trigger module if the query execution was aborted.
+ */
+ if (estate->es_aborted &&
+ !(estate->es_top_eflags &
+ (EXEC_FLAG_SKIP_TRIGGERS | EXEC_FLAG_EXPLAIN_ONLY)))
+ AfterTriggerAbortQuery();
+
/*
* Must switch out of context before destroying it
*/
@@ -962,6 +1028,9 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
ExecDoInitialPruning(estate);
+ if (!ExecPlanStillValid(estate))
+ return;
+
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
*/
@@ -2948,6 +3017,9 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
* the snapshot, rangetable, and external Param info. They need their own
* copies of local state, including a tuple table, es_param_exec_vals,
* result-rel info, etc.
+ *
+ * es_cachedplan is not copied because EPQ plan execution does not acquire
+ * any new locks that could invalidate the CachedPlan.
*/
rcestate->es_direction = ForwardScanDirection;
rcestate->es_snapshot = parentestate->es_snapshot;
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 67734979b0..435ae0df7a 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -147,6 +147,7 @@ CreateExecutorState(void)
estate->es_top_eflags = 0;
estate->es_instrument = 0;
estate->es_finished = false;
+ estate->es_aborted = false;
estate->es_exprcontexts = NIL;
@@ -757,7 +758,7 @@ ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos)
* ExecGetRangeTableRelation
* Open the Relation for a range table entry, if not already done
*
- * The Relations will be closed again in ExecEndPlan().
+ * The Relations will be closed in ExecEndPlan().
*/
Relation
ExecGetRangeTableRelation(EState *estate, Index rti)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 659bd6dcd9..f84f376c9c 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -70,7 +70,8 @@ static int _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
Datum *Values, const char *Nulls);
-static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount);
+static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index);
static void _SPI_error_callback(void *arg);
@@ -1682,7 +1683,8 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
- cplan);
+ cplan,
+ plansource);
/*
* Set up options for portal. Default SCROLL type is chosen the same way
@@ -2494,6 +2496,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
ListCell *lc2;
+ int i = 0;
spicallbackarg.query = plansource->query_string;
@@ -2691,8 +2694,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
options->params,
_SPI_current->queryEnv,
0);
- res = _SPI_pquery(qdesc, fire_triggers,
- canSetTag ? options->tcount : 0);
+
+ res = _SPI_pquery(qdesc, fire_triggers, canSetTag ? options->tcount : 0,
+ plansource, i);
FreeQueryDesc(qdesc);
}
else
@@ -2789,6 +2793,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
my_res = res;
goto fail;
}
+
+ i++;
}
/* Done with this plan, so release refcount */
@@ -2866,7 +2872,8 @@ _SPI_convert_params(int nargs, Oid *argtypes,
}
static int
-_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
+_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index)
{
int operation = queryDesc->operation;
int eflags;
@@ -2922,7 +2929,7 @@ _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
else
eflags = EXEC_FLAG_SKIP_TRIGGERS;
- ExecutorStart(queryDesc, eflags);
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
ExecutorRun(queryDesc, ForwardScanDirection, tcount, true);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index e394f1419a..b95c859655 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1237,6 +1237,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ NULL,
NULL);
/*
@@ -2039,7 +2040,8 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
- cplan);
+ cplan,
+ psrc);
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 6e8f6b1b8f..dbb0ffb771 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -19,6 +19,7 @@
#include "access/xact.h"
#include "commands/prepare.h"
+#include "executor/execdesc.h"
#include "executor/tstoreReceiver.h"
#include "miscadmin.h"
#include "pg_trace.h"
@@ -37,6 +38,8 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -80,6 +83,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
+ qd->cplan_release = false;
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -114,6 +118,13 @@ FreeQueryDesc(QueryDesc *qdesc)
UnregisterSnapshot(qdesc->snapshot);
UnregisterSnapshot(qdesc->crosscheck_snapshot);
+ /*
+ * Release CachedPlan if requested. The CachedPlan is not associated with
+ * a ResourceOwner when cplan_release is true; see ExecutorStartExt().
+ */
+ if (qdesc->cplan_release)
+ ReleaseCachedPlan(qdesc->cplan, NULL);
+
/* Only the QueryDesc itself need be freed */
pfree(qdesc);
}
@@ -126,6 +137,8 @@ FreeQueryDesc(QueryDesc *qdesc)
*
* plan: the plan tree for the query
* cplan: CachedPlan supplying the plan
+ * plansource: CachedPlanSource supplying the cplan
+ * query_index: index of the query in plansource->query_list
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -139,6 +152,8 @@ FreeQueryDesc(QueryDesc *qdesc)
static void
ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -157,7 +172,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Call ExecutorStart to prepare the plan for execution
*/
- ExecutorStart(queryDesc, 0);
+ ExecutorStartExt(queryDesc, 0, plansource, query_index);
/*
* Run the plan to completion.
@@ -518,9 +533,12 @@ PortalStart(Portal portal, ParamListInfo params,
myeflags = eflags;
/*
- * Call ExecutorStart to prepare the plan for execution
+ * Call ExecutorStartExt() to prepare the plan for execution. If
+ * the portal is using a cached plan, it may get invalidated
+ * during plan initialization, in which case a new one is
+ * created and saved in the QueryDesc.
*/
- ExecutorStart(queryDesc, myeflags);
+ ExecutorStartExt(queryDesc, myeflags, portal->plansource, 0);
/*
* This tells PortalCleanup to shut down the executor
@@ -1201,6 +1219,7 @@ PortalRunMulti(Portal portal,
{
bool active_snapshot_set = false;
ListCell *stmtlist_item;
+ int i = 0;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1283,6 +1302,8 @@ PortalRunMulti(Portal portal,
/* statement can set tag string */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1293,6 +1314,8 @@ PortalRunMulti(Portal portal,
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1357,6 +1380,8 @@ PortalRunMulti(Portal portal,
*/
if (lnext(portal->stmts, stmtlist_item) != NULL)
CommandCounterIncrement();
+
+ i++;
}
/* Pop the snapshot if we pushed one. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5b75dadf13..d33f871ea2 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -94,6 +94,14 @@
*/
static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list);
+/*
+ * Head of the backend's list of "standalone" CachedPlans that are not
+ * associated with a CachedPlanSource, created by GetSingleCachedPlan() for
+ * transient use by the executor in certain scenarios where they're needed
+ * only for one execution of the plan.
+ */
+static dlist_head standalone_plan_list = DLIST_STATIC_INIT(standalone_plan_list);
+
/*
* This is the head of the backend's list of CachedExpressions.
*/
@@ -905,6 +913,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * Note: When changing this, you should also look at GetSingleCachedPlan().
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
@@ -1034,6 +1044,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
+ plan->is_standalone = false;
/* assign generation number to new plan */
plan->generation = ++(plansource->generation);
@@ -1282,6 +1293,121 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
return plan;
}
+/*
+ * Create a fresh CachedPlan for the query_index'th query in the provided
+ * CachedPlanSource.
+ *
+ * The created CachedPlan is standalone, meaning it is not tracked in the
+ * CachedPlanSource. The CachedPlan and its plan trees are allocated in a
+ * child context of the caller's memory context. The caller must ensure they
+ * remain valid until execution is complete, after which the plan should be
+ * released by calling ReleaseCachedPlan().
+ *
+ * This function primarily supports ExecutorStartExt(), which handles cases
+ * where the original generic CachedPlan becomes invalid after prunable
+ * relations are locked.
+ */
+CachedPlan *
+GetSingleCachedPlan(CachedPlanSource *plansource, int query_index,
+ QueryEnvironment *queryEnv)
+{
+ List *query_list = plansource->query_list,
+ *plan_list;
+ CachedPlan *plan = plansource->gplan;
+ MemoryContext oldcxt = CurrentMemoryContext,
+ plan_context;
+ PlannedStmt *plannedstmt;
+
+ Assert(ActiveSnapshotSet());
+
+ /* Sanity checks */
+ if (plan == NULL)
+ elog(ERROR, "GetSingleCachedPlan() called in the wrong context: plansource->gplan is NULL");
+ else if (plan->is_valid)
+ elog(ERROR, "GetSingleCachedPlan() called in the wrong context: plansource->gplan->is_valid");
+
+ /*
+ * The plansource might have become invalid since GetCachedPlan(). See the
+ * comment in BuildCachedPlan() for details on why this might happen.
+ *
+ * The risk is greater here because this function is called from the
+ * executor, meaning much more processing may have occurred compared to
+ * when BuildCachedPlan() is called from GetCachedPlan().
+ */
+ if (!plansource->is_valid)
+ query_list = RevalidateCachedQuery(plansource, queryEnv);
+ Assert(query_list != NIL);
+
+ /*
+ * Build a new generic plan for the query_index'th query, but make a copy
+ * to be scribbled on by the planner
+ */
+ query_list = list_make1(copyObject(list_nth_node(Query, query_list,
+ query_index)));
+ plan_list = pg_plan_queries(query_list, plansource->query_string,
+ plansource->cursor_options, NULL);
+
+ list_free_deep(query_list);
+
+ /*
+ * Make a dedicated memory context for the CachedPlan and its subsidiary
+ * data so that we can release it in ReleaseCachedPlan() that will be
+ * called in FreeQueryDesc().
+ */
+ plan_context = AllocSetContextCreate(CurrentMemoryContext,
+ "Standalone CachedPlan",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextCopyAndSetIdentifier(plan_context, plansource->query_string);
+
+ /*
+ * Copy plan into the new context.
+ */
+ MemoryContextSwitchTo(plan_context);
+ plan_list = copyObject(plan_list);
+
+ /*
+ * Create and fill the CachedPlan struct within the new context.
+ */
+ plan = (CachedPlan *) palloc(sizeof(CachedPlan));
+ plan->magic = CACHEDPLAN_MAGIC;
+ plan->stmt_list = plan_list;
+
+ plan->planRoleId = GetUserId();
+ Assert(list_length(plan_list) == 1);
+ plannedstmt = linitial_node(PlannedStmt, plan_list);
+
+ /*
+ * CachedPlan is dependent on role either if RLS affected the rewrite
+ * phase or if a role dependency was injected during planning. And it's
+ * transient if any plan is marked so.
+ */
+ plan->dependsOnRole = plansource->dependsOnRLS || plannedstmt->dependsOnRole;
+ if (plannedstmt->transientPlan)
+ {
+ Assert(TransactionIdIsNormal(TransactionXmin));
+ plan->saved_xmin = TransactionXmin;
+ }
+ else
+ plan->saved_xmin = InvalidTransactionId;
+ plan->refcount = 0;
+ plan->context = plan_context;
+ plan->is_oneshot = false;
+ plan->is_generic = true;
+ plan->is_saved = false;
+ plan->is_valid = true;
+ plan->is_standalone = true;
+ plan->generation = 1;
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * Add the entry to the global list of "standalone" cached plans. It is
+ * removed from the list by ReleaseCachedPlan().
+ */
+ dlist_push_tail(&standalone_plan_list, &plan->node);
+
+ return plan;
+}
+
/*
* ReleaseCachedPlan: release active use of a cached plan.
*
@@ -1309,6 +1435,10 @@ ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner)
/* Mark it no longer valid */
plan->magic = 0;
+ /* Remove from the global list if we are a standalone plan. */
+ if (plan->is_standalone)
+ dlist_delete(&plan->node);
+
/* One-shot plans do not own their context, so we can't free them */
if (!plan->is_oneshot)
MemoryContextDelete(plan->context);
@@ -2066,6 +2196,33 @@ PlanCacheRelCallback(Datum arg, Oid relid)
cexpr->is_valid = false;
}
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ if (cplan->is_valid)
+ {
+ ListCell *lc;
+
+ foreach(lc, cplan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue; /* Ignore utility statements */
+ if ((relid == InvalidOid) ? plannedstmt->relationOids != NIL :
+ list_member_oid(plannedstmt->relationOids, relid))
+ cplan->is_valid = false;
+ if (!cplan->is_valid)
+ break; /* out of stmt_list scan */
+ }
+ }
+ }
}
/*
@@ -2176,6 +2333,44 @@ PlanCacheObjectCallback(Datum arg, int cacheid, uint32 hashvalue)
}
}
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ if (cplan->is_valid)
+ {
+ ListCell *lc;
+
+ foreach(lc, cplan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
+ ListCell *lc3;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue; /* Ignore utility statements */
+ foreach(lc3, plannedstmt->invalItems)
+ {
+ PlanInvalItem *item = (PlanInvalItem *) lfirst(lc3);
+
+ if (item->cacheId != cacheid)
+ continue;
+ if (hashvalue == 0 ||
+ item->hashValue == hashvalue)
+ {
+ cplan->is_valid = false;
+ break; /* out of invalItems scan */
+ }
+ }
+ if (!cplan->is_valid)
+ break; /* out of stmt_list scan */
+ }
+ }
+ }
}
/*
@@ -2235,6 +2430,17 @@ ResetPlanCache(void)
cexpr->is_valid = false;
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ cplan->is_valid = false;
+ }
}
/*
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 4a24613537..bf70fd4ce7 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,7 +284,8 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan)
+ CachedPlan *cplan,
+ CachedPlanSource *plansource)
{
Assert(PortalIsValid(portal));
Assert(portal->status == PORTAL_NEW);
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->commandTag = commandTag;
portal->stmts = stmts;
portal->cplan = cplan;
+ portal->plansource = plansource;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 21c71e0d53..a39989a950 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -104,6 +104,7 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ParamListInfo params, QueryEnvironment *queryEnv);
extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int query_index,
IntoClause *into, ExplainState *es,
const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index 8a5a9fe642..db21561c8c 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -258,6 +258,7 @@ extern void ExecASTruncateTriggers(EState *estate,
extern void AfterTriggerBeginXact(void);
extern void AfterTriggerBeginQuery(void);
extern void AfterTriggerEndQuery(EState *estate);
+extern void AfterTriggerAbortQuery(void);
extern void AfterTriggerFireDeferred(void);
extern void AfterTriggerEndXact(bool isCommit);
extern void AfterTriggerBeginSubXact(void);
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0e7245435d..f6cb6479c0 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -36,6 +36,7 @@ typedef struct QueryDesc
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
+ bool cplan_release; /* Should FreeQueryDesc() release cplan? */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 69c3ebff00..5bc0edb5a0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -19,6 +19,7 @@
#include "nodes/lockoptions.h"
#include "nodes/parsenodes.h"
#include "utils/memutils.h"
+#include "utils/plancache.h"
/*
@@ -198,6 +199,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
* prototypes from functions in execMain.c
*/
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern void ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource, int query_index);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction, uint64 count, bool execute_once);
@@ -261,6 +264,19 @@ extern void ExecEndNode(PlanState *node);
extern void ExecShutdownNode(PlanState *node);
extern void ExecSetTupleBound(int64 tuples_needed, PlanState *child_node);
+/*
+ * Is the CachedPlan in es_cachedplan still valid?
+ *
+ * Called from InitPlan() because invalidation messages that affect the plan
+ * might be received after locks have been taken on runtime-prunable relations.
+ * The caller should take appropriate action if the plan has become invalid.
+ */
+static inline bool
+ExecPlanStillValid(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? true :
+ CachedPlanValid(estate->es_cachedplan);
+}
/* ----------------------------------------------------------------
* ExecProcNode
@@ -589,6 +605,7 @@ extern void ExecCreateScanSlotFromOuterPlan(EState *estate,
extern bool ExecRelationIsTargetRelation(EState *estate, Index scanrelid);
extern Relation ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags);
+extern Relation ExecOpenScanIndexRelation(EState *estate, Oid indexid, int lockmode);
extern void ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos);
extern void ExecCloseRangeTableRelations(EState *estate);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 57170818c0..f50b6b50a8 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -690,6 +690,7 @@ typedef struct EState
int es_top_eflags; /* eflags passed to ExecutorStart */
int es_instrument; /* OR of InstrumentOption flags */
bool es_finished; /* true when ExecutorFinish is done */
+ bool es_aborted; /* true when execution was aborted */
List *es_exprcontexts; /* List of ExprContexts within EState */
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0b5ee007ca..154f68f671 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -18,6 +18,7 @@
#include "access/tupdesc.h"
#include "lib/ilist.h"
#include "nodes/params.h"
+#include "nodes/parsenodes.h"
#include "tcop/cmdtag.h"
#include "utils/queryenvironment.h"
#include "utils/resowner.h"
@@ -152,6 +153,8 @@ typedef struct CachedPlan
bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
+ bool is_standalone; /* is it not associated with a
+ * CachedPlanSource? */
Oid planRoleId; /* Role ID the plan was created for */
bool dependsOnRole; /* is plan specific to that role? */
TransactionId saved_xmin; /* if valid, replan when TransactionXmin
@@ -159,6 +162,12 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+
+ /*
+ * If the plan is not associated with a CachedPlanSource, it is saved in
+ * a separate global list.
+ */
+ dlist_node node; /* list link, if is_standalone */
} CachedPlan;
/*
@@ -224,6 +233,10 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
QueryEnvironment *queryEnv);
+extern CachedPlan *GetSingleCachedPlan(CachedPlanSource *plansource,
+ int query_index,
+ QueryEnvironment *queryEnv);
+
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
@@ -245,4 +258,17 @@ CachedPlanRequiresLocking(CachedPlan *cplan)
return cplan->is_generic;
}
+/*
+ * CachedPlanValid
+ * Returns whether a cached generic plan is still valid.
+ *
+ * Invoked by the executor to check if the plan has not been invalidated after
+ * taking locks during the initialization of the plan.
+ */
+static inline bool
+CachedPlanValid(CachedPlan *cplan)
+{
+ return cplan->is_valid;
+}
+
#endif /* PLANCACHE_H */
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index 29f49829f2..58c3828d2c 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ CachedPlanSource *plansource; /* CachedPlanSource, for cplan */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -241,7 +242,8 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan);
+ CachedPlan *cplan,
+ CachedPlanSource *plansource);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
diff --git a/src/test/modules/delay_execution/Makefile b/src/test/modules/delay_execution/Makefile
index 70f24e846d..3eeb097fde 100644
--- a/src/test/modules/delay_execution/Makefile
+++ b/src/test/modules/delay_execution/Makefile
@@ -8,7 +8,8 @@ OBJS = \
delay_execution.o
ISOLATION = partition-addition \
- partition-removal-1
+ partition-removal-1 \
+ cached-plan-inval
ifdef USE_PGXS
PG_CONFIG = pg_config
diff --git a/src/test/modules/delay_execution/delay_execution.c b/src/test/modules/delay_execution/delay_execution.c
index 155c8a8d55..304ca77f7b 100644
--- a/src/test/modules/delay_execution/delay_execution.c
+++ b/src/test/modules/delay_execution/delay_execution.c
@@ -1,14 +1,18 @@
/*-------------------------------------------------------------------------
*
* delay_execution.c
- * Test module to allow delay between parsing and execution of a query.
+ * Test module to introduce delay at various points during execution of a
+ * query to test that execution proceeds safely in light of concurrent
+ * changes.
*
* The delay is implemented by taking and immediately releasing a specified
* advisory lock. If another process has previously taken that lock, the
* current process will be blocked until the lock is released; otherwise,
* there's no effect. This allows an isolationtester script to reliably
- * test behaviors where some specified action happens in another backend
- * between parsing and execution of any desired query.
+ * test behaviors where some specified action happens in another backend in
+ * a couple of cases: 1) between parsing and execution of any desired query
+ * when using the planner_hook, 2) between RevalidateCachedQuery() and
+ * ExecutorStart() when using the ExecutorStart_hook.
*
* Copyright (c) 2020-2024, PostgreSQL Global Development Group
*
@@ -22,6 +26,7 @@
#include <limits.h>
+#include "executor/executor.h"
#include "optimizer/planner.h"
#include "utils/builtins.h"
#include "utils/guc.h"
@@ -32,9 +37,11 @@ PG_MODULE_MAGIC;
/* GUC: advisory lock ID to use. Zero disables the feature. */
static int post_planning_lock_id = 0;
+static int executor_start_lock_id = 0;
-/* Save previous planner hook user to be a good citizen */
+/* Save previous hook users to be a good citizen */
static planner_hook_type prev_planner_hook = NULL;
+static ExecutorStart_hook_type prev_ExecutorStart_hook = NULL;
/* planner_hook function to provide the desired delay */
@@ -70,11 +77,41 @@ delay_execution_planner(Query *parse, const char *query_string,
return result;
}
+/* ExecutorStart_hook function to provide the desired delay */
+static void
+delay_execution_ExecutorStart(QueryDesc *queryDesc, int eflags)
+{
+ /* If enabled, delay by taking and releasing the specified lock */
+ if (executor_start_lock_id != 0)
+ {
+ DirectFunctionCall1(pg_advisory_lock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+ DirectFunctionCall1(pg_advisory_unlock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+
+ /*
+ * Ensure that we notice any pending invalidations, since the advisory
+ * lock functions don't do this.
+ */
+ AcceptInvalidationMessages();
+ }
+
+ /* Now start the executor, possibly via a previous hook user */
+ if (prev_ExecutorStart_hook)
+ prev_ExecutorStart_hook(queryDesc, eflags);
+ else
+ standard_ExecutorStart(queryDesc, eflags);
+
+ if (executor_start_lock_id != 0)
+ elog(NOTICE, "Finished ExecutorStart(): CachedPlan is %s",
+ CachedPlanValid(queryDesc->cplan) ? "valid" : "not valid");
+}
+
/* Module load function */
void
_PG_init(void)
{
- /* Set up the GUC to control which lock is used */
+ /* Set up GUCs to control which lock is used */
DefineCustomIntVariable("delay_execution.post_planning_lock_id",
"Sets the advisory lock ID to be locked/unlocked after planning.",
"Zero disables the delay.",
@@ -86,10 +123,22 @@ _PG_init(void)
NULL,
NULL,
NULL);
-
+ DefineCustomIntVariable("delay_execution.executor_start_lock_id",
+ "Sets the advisory lock ID to be locked/unlocked before starting execution.",
+ "Zero disables the delay.",
+ &executor_start_lock_id,
+ 0,
+ 0, INT_MAX,
+ PGC_USERSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
MarkGUCPrefixReserved("delay_execution");
- /* Install our hook */
+ /* Install our hooks. */
prev_planner_hook = planner_hook;
planner_hook = delay_execution_planner;
+ prev_ExecutorStart_hook = ExecutorStart_hook;
+ ExecutorStart_hook = delay_execution_ExecutorStart;
}
diff --git a/src/test/modules/delay_execution/expected/cached-plan-inval.out b/src/test/modules/delay_execution/expected/cached-plan-inval.out
new file mode 100644
index 0000000000..e002cfbc9c
--- /dev/null
+++ b/src/test/modules/delay_execution/expected/cached-plan-inval.out
@@ -0,0 +1,230 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1prep s2lock s1exec s2dropi s2unlock
+step s1prep: SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1);
+QUERY PLAN
+------------------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: (a = $1)
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = $1)
+(7 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+-------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Filter: (a = $1)
+(5 rows)
+
+
+starting permutation: s1prep2 s2lock s1exec2 s2dropi s2unlock
+step s1prep2: SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(11 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec2: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec2: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(6 rows)
+
+
+starting permutation: s1prep3 s2lock s1exec3 s2dropi s2unlock
+step s1prep3: SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(26 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec3: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec3: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(16 rows)
+
+
+starting permutation: s1prep4 s2lock s1exec4 s2dropi s2unlock
+step s1prep4: SET plan_cache_mode = force_generic_plan;
+ SET enable_seqscan TO off;
+ PREPARE q4 AS SELECT * FROM generate_series(1, 1) WHERE EXISTS (SELECT * FROM foov WHERE a = $1 FOR UPDATE);
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1);
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+---------------------------------------------------------------
+Result
+ One-Time Filter: (InitPlan 1).col1
+ InitPlan 1
+ -> LockRows
+ Disabled Nodes: 2
+ -> Append
+ Disabled Nodes: 2
+ Subplans Removed: 2
+ -> Index Scan using foo12_1_a on foo12_1 foo_1
+ Index Cond: (a = $1)
+ -> Function Scan on generate_series
+(11 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec4: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec4: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+---------------------------------------------
+Result
+ One-Time Filter: (InitPlan 1).col1
+ InitPlan 1
+ -> LockRows
+ Disabled Nodes: 3
+ -> Append
+ Disabled Nodes: 3
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Disabled Nodes: 1
+ Filter: (a = $1)
+ -> Function Scan on generate_series
+(12 rows)
+
diff --git a/src/test/modules/delay_execution/meson.build b/src/test/modules/delay_execution/meson.build
index 41f3ac0b89..5a70b183d0 100644
--- a/src/test/modules/delay_execution/meson.build
+++ b/src/test/modules/delay_execution/meson.build
@@ -24,6 +24,7 @@ tests += {
'specs': [
'partition-addition',
'partition-removal-1',
+ 'cached-plan-inval',
],
},
}
diff --git a/src/test/modules/delay_execution/specs/cached-plan-inval.spec b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
new file mode 100644
index 0000000000..820a843051
--- /dev/null
+++ b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
@@ -0,0 +1,75 @@
+# Test to check that invalidation of cached generic plans during ExecutorStart
+# correctly triggers replanning and re-execution.
+
+setup
+{
+ CREATE TABLE foo (a int, b text) PARTITION BY LIST(a);
+ CREATE TABLE foo12 PARTITION OF foo FOR VALUES IN (1, 2) PARTITION BY LIST (a);
+ CREATE TABLE foo12_1 PARTITION OF foo12 FOR VALUES IN (1);
+ CREATE TABLE foo12_2 PARTITION OF foo12 FOR VALUES IN (2);
+ CREATE INDEX foo12_1_a ON foo12_1 (a);
+ CREATE TABLE foo3 PARTITION OF foo FOR VALUES IN (3);
+ CREATE VIEW foov AS SELECT * FROM foo;
+ CREATE FUNCTION one () RETURNS int AS $$ BEGIN RETURN 1; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE FUNCTION two () RETURNS int AS $$ BEGIN RETURN 2; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE RULE update_foo AS ON UPDATE TO foo DO ALSO SELECT 1;
+}
+
+teardown
+{
+ DROP VIEW foov;
+ DROP RULE update_foo ON foo;
+ DROP TABLE foo;
+ DROP FUNCTION one(), two();
+}
+
+session "s1"
+# Append with run-time pruning
+step "s1prep" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+
+# Another case with Append with run-time pruning
+step "s1prep2" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+
+# Case with a rule adding another query
+step "s1prep3" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+
+# Another case with Append with run-time pruning in a subquery
+step "s1prep4" { SET plan_cache_mode = force_generic_plan;
+ SET enable_seqscan TO off;
+ PREPARE q4 AS SELECT * FROM generate_series(1, 1) WHERE EXISTS (SELECT * FROM foov WHERE a = $1 FOR UPDATE);
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); }
+
+# Executes a generic plan
+step "s1exec" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+step "s1exec2" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+step "s1exec3" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+step "s1exec4" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); }
+
+session "s2"
+step "s2lock" { SELECT pg_advisory_lock(12345); }
+step "s2unlock" { SELECT pg_advisory_unlock(12345); }
+step "s2dropi" { DROP INDEX foo12_1_a; }
+
+# While "s1exec", etc. wait to acquire the advisory lock, "s2dropi" is able to
+# drop the index being used in the cached plan. When "s1exec" is then
+# unblocked and initializes the cached plan for execution, it detects the
+# concurrent index drop and causes the cached plan to be discarded and
+# recreated without the index.
+permutation "s1prep" "s2lock" "s1exec" "s2dropi" "s2unlock"
+permutation "s1prep2" "s2lock" "s1exec2" "s2dropi" "s2unlock"
+permutation "s1prep3" "s2lock" "s1exec3" "s2dropi" "s2unlock"
+permutation "s1prep4" "s2lock" "s1exec4" "s2dropi" "s2unlock"
--
2.43.0
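
For readers skimming the expected output above, the "not valid" NOTICE followed by a "valid" one can be read as a check-then-retry pattern. The following is only a minimal standalone sketch of that pattern, not the patch's code: every Toy*/toy_* name is hypothetical, the pending_invals counter merely simulates invalidation messages arriving while locks are taken, and it assumes that the caller reacts to an invalid plan by replanning and trying again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for CachedPlan; only the fields this sketch needs. */
typedef struct ToyCachedPlan
{
	bool		is_valid;		/* cleared when an invalidation arrives */
	int			generation;		/* bumped on every replan */
} ToyCachedPlan;

/* Same shape as CachedPlanValid() in the patch. */
static inline bool
toy_plan_valid(const ToyCachedPlan *cplan)
{
	return cplan->is_valid;
}

/* Same shape as ExecPlanStillValid(): no cached plan means always valid. */
static inline bool
toy_exec_plan_still_valid(const ToyCachedPlan *cplan)
{
	return cplan == NULL ? true : toy_plan_valid(cplan);
}

/*
 * Simulated plan initialization.  Taking locks may absorb an invalidation;
 * *pending_invals says how many such events are still queued (an assumption
 * of this sketch, standing in for a concurrent DROP INDEX).
 */
static bool
toy_init_plan(ToyCachedPlan *cplan, int *pending_invals)
{
	if (cplan != NULL && *pending_invals > 0)
	{
		(*pending_invals)--;
		cplan->is_valid = false;
	}
	return toy_exec_plan_still_valid(cplan);
}

/* Simulated replanning: produce a fresh, valid plan. */
static void
toy_replan(ToyCachedPlan *cplan)
{
	cplan->is_valid = true;
	cplan->generation++;
}

/* Retry initialization until the plan survives it; returns attempt count. */
static int
toy_executor_start(ToyCachedPlan *cplan, int *pending_invals)
{
	int			attempts = 1;

	while (!toy_init_plan(cplan, pending_invals))
	{
		toy_replan(cplan);
		attempts++;
	}
	assert(toy_exec_plan_still_valid(cplan));
	return attempts;
}
```

With one queued invalidation the sketch makes two attempts, which lines up with the pair of NOTICEs per ExecutorStart call in the s1exec steps; with no cached plan it makes a single attempt.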